Pricing | Lepton AI

Pricing

The most cost-effective way to build and run AI applications at scale, in minutes, on a cloud-native platform.

Basic

$0 / month + usage fee

Perfect for individuals and small teams to get started.

No subscription fee
Up to 48 CPUs + 2 GPUs concurrently

Standard

$30 / month + usage fee

Designed for collaborative teams and growing businesses.

Multi-user support for collaboration
Custom runtime environments
Dedicated account manager
Up to 192 CPUs + 16 GPUs concurrently
600 requests per minute for serverless endpoints

Enterprise

Custom

For organizations requiring high SLAs, performance, and compliance.

Custom integration and support
Self-hosted deployments
Dedicated API support for control plane
Audit log and RBAC
Prioritize your requests on our roadmap

Compare Plans

Compare the three plans to find the best fit for your needs.

	Basic	Standard	Enterprise
Quota limit	48 CPU, 2 GPU, 1 Queue, 1 KV	192 CPU, 16 GPU, 10 Queue, 10 KV	Unlimited
Serverless endpoint rate limit	10 Requests Per Minute	600 Requests Per Minute	Contact Us
Multi-user support
Elevated quota for scaling
Dedicated account manager
Custom integration and support
Self-hosted deployments
Dedicated API support for control plane
Audit log and RBAC
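
The serverless endpoint rate limits in the table (10 requests per minute on Basic, 600 on Standard) can be respected from the client side. The following is a minimal sketch, assuming a sliding 60-second window; the page does not say how the platform itself counts requests, and the class name `RequestThrottle` is hypothetical, not part of any Lepton SDK.

```python
import time
from collections import deque


class RequestThrottle:
    """Client-side limiter: at most `limit` calls per `window` seconds.

    Assumption: a sliding window. The platform's own accounting is not documented here.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of calls still inside the window

    def wait(self):
        """Block until another request is allowed; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        waited = 0.0
        if len(self._sent) >= self.limit:
            # Sleep until the oldest recorded call leaves the window.
            waited = self.window - (now - self._sent[0])
            self._sleep(waited)
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
        return waited


# Basic plan: 10 requests per minute. Call wait() before each request.
basic_throttle = RequestThrottle(limit=10)
```

Calling `basic_throttle.wait()` before each endpoint request keeps a single client under the Basic limit; a Standard workspace would use `limit=600`.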

Compute Usage

Pay exclusively for your actual compute usage, billed by the minute.

CPU

1 vCPU, 4 GiB RAM

$0.000825 / min

NVIDIA-A10

1 NVIDIA-A10 GPU, 96 GiB RAM, 24 GiB vRAM, 24 vCPU

$0.0202 / min

NVIDIA-A100

1 NVIDIA-A100-80GB GPU, 192 GiB RAM, 80 GiB vRAM, 12 vCPU

$0.0535 / min

NVIDIA-A6000

1 NVIDIA-RTX-A6000 GPU, 64 GiB RAM, 48 GiB vRAM, 8 vCPU

$0.0275 / min

NVIDIA-H100

1 NVIDIA-H100-80GB-HBM3 GPU, 240 GiB RAM, 80 GiB vRAM, 20 vCPU

$0.05 / min

Reserve GPU for dedicated access and priority scheduling →
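
To turn the per-minute compute rates into a rough estimate, multiply the rate by the minutes used and the number of instances. The sketch below copies the rates from the list above; the short keys (`cpu`, `a10`, and so on) and the function name are hypothetical labels, and it assumes usage is counted in whole minutes, since rounding of partial minutes is not stated here.

```python
from decimal import Decimal

# Per-minute rates in USD, copied from the price list above.
# The dictionary keys are illustrative labels, not official shape names.
RATES_PER_MIN = {
    "cpu": Decimal("0.000825"),
    "a10": Decimal("0.0202"),
    "a100": Decimal("0.0535"),
    "a6000": Decimal("0.0275"),
    "h100": Decimal("0.05"),
}


def compute_cost(shape, minutes, replicas=1):
    """Estimated USD cost of running `replicas` instances for `minutes` minutes."""
    return RATES_PER_MIN[shape] * minutes * replicas


# One A10 instance for an hour, and one CPU instance for a 30-day month.
a10_hour = compute_cost("a10", 60)
cpu_month = compute_cost("cpu", 30 * 24 * 60)
print(f"A10, 1 hour: ${a10_hour}")
print(f"CPU, 30 days: ${cpu_month}")
```

`Decimal` is used instead of floats so that small per-minute rates do not accumulate rounding error over long runs.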

Built-in Storage

Save your models, data, and logs right in the platform, and pay only for what you use.

$0.153 / GB / month

Serverless Endpoints

Enhance your application with premium open-source models while leveraging Lepton's exceptional runtime performance and reliability.

Model	Price
Llama3.2 1b	$0.01 / million tokens
Llama3.2 3b	$0.03 / million tokens
Llama3.1 8b	$0.07 / million tokens
Llama2 13b	$0.18 / million tokens
Llama3.3 70b	$0.80 / million tokens
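
Per-token pricing scales linearly, so an estimate is the price per million tokens times the token count divided by one million. The sketch below uses the prices from the table above; the model keys and the function name are hypothetical, and it assumes the price applies to the total token count, since this page does not say whether input and output tokens are billed differently.

```python
from decimal import Decimal

# USD per million tokens, copied from the table above.
# Keys are illustrative identifiers, not official model IDs.
PRICE_PER_MILLION_TOKENS = {
    "llama3.2-1b": Decimal("0.01"),
    "llama3.2-3b": Decimal("0.03"),
    "llama3.1-8b": Decimal("0.07"),
    "llama2-13b": Decimal("0.18"),
    "llama3.3-70b": Decimal("0.80"),
}


def token_cost(model, tokens):
    """Estimated USD cost of processing `tokens` tokens with `model`."""
    return PRICE_PER_MILLION_TOKENS[model] * tokens / Decimal(1_000_000)


# 2.5 million tokens through the 70b model.
estimate = token_cost("llama3.3-70b", 2_500_000)
print(f"Estimated cost: ${estimate}")
```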

Frequently Asked Questions

Find answers to the most common questions about Lepton AI.

How is compute usage billed?

Can I cancel my subscription at any time?

What kind of support does Lepton offer?

Is there a limit on the number of workspace members?

Do I need to pay serverless endpoint usage fees if I upgrade to the Standard plan?