Llama 3.3 70B

128K context

Description

Llama 3.3 70B is intended for commercial and research use in multiple languages. Instruction-tuned, text-only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

Pricing

  • Dedicated Endpoints: Priced by instance type and number of GPUs; see the pricing page for details. You can also contact us to reserve GPUs.
  • Serverless Endpoints: $0.80 per million tokens for Llama 3.3 70B, billed pay-as-you-go.
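
To make the serverless rate concrete, here is a minimal sketch of the arithmetic. It assumes the quoted $0.8 per million tokens applies as a flat rate to every billed token (prompt plus completion), which this page does not spell out; the function name `estimate_serverless_cost` is hypothetical and used only for illustration.

```python
from decimal import Decimal

# Serverless rate quoted on this page: $0.8 per million tokens.
PRICE_PER_MILLION_TOKENS = Decimal("0.8")


def estimate_serverless_cost(total_tokens: int) -> Decimal:
    """Estimate the cost in USD for a total number of billed tokens.

    Assumption: one flat rate covers prompt and completion tokens alike.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS * Decimal(total_tokens) / Decimal(1_000_000)


# Example: 2.5 million tokens at $0.8 per million tokens.
cost = estimate_serverless_cost(2_500_000)
print(f"${cost:.2f}")  # prints $2.00
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift. Check the pricing page for the actual billing rules before relying on this estimate.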

Lepton AI

© 2025