Llama 3.1 8B
128K context
Description
Llama 3.1 8B is best suited to environments with limited computational power and resources. The model excels at text summarization, text classification, sentiment analysis, and language translation, particularly where low-latency inference is required.
Pricing
- Dedicated Endpoints: Priced by instance type and number of GPUs; see the pricing page for details. You can also contact us to reserve GPUs.
- Serverless Endpoints: $0.07 per million tokens for Llama 3.1 8B, billed pay-as-you-go.
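As a rough illustration of the serverless rate, the sketch below converts a token count into an estimated charge. It assumes input and output tokens are billed at the same $0.07 per million rate, which this page does not state explicitly. The function name `estimate_cost_usd` is hypothetical and not part of any API.

```python
# Serverless rate quoted on this page, in USD per one million tokens.
PRICE_PER_MILLION_TOKENS_USD = 0.07


def estimate_cost_usd(total_tokens: int,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Estimate the serverless charge for a given number of tokens.

    Assumption: prompt and completion tokens are billed at the same rate.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # A few illustrative volumes, including one full 128K-token context.
    for tokens in (128_000, 1_000_000, 50_000_000):
        print(f"{tokens:>12,} tokens: ${estimate_cost_usd(tokens):.4f}")
```

Actual charges depend on how the provider counts tokens, so treat the result as an estimate only.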
Playground
Adjustable settings: system prompt, temperature, max tokens, and top P.