LLM Models

A large language model (LLM) is a model trained on a large corpus of text that can generate new text resembling that corpus.

At Lepton, we host a set of popular LLMs and expose them as model APIs for AI developers. The models run on our servers and are accessed through our APIs.

The APIs are compatible with the OpenAI API, so you can use them as a drop-in replacement by pointing base_url (api_base in pre-1.0 versions of the openai library) at the model URL listed below. For the API key, use your Lepton API token.

Usage

import os
import openai

# Point the OpenAI client at the Lepton endpoint for the chosen model.
client = openai.OpenAI(
    base_url="https://llama2-7b.lepton.run/api/v1/",
    api_key=os.environ.get('LEPTON_API_TOKEN')
)

# Request a streamed chat completion.
completion = client.chat.completions.create(
    model="llama2-7b",
    messages=[
        {"role": "user", "content": "say hello"},
    ],
    max_tokens=128,
    stream=True,
)

# Print each piece of generated text as it arrives.
for chunk in completion:
    # Skip chunks that carry no choices.
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")
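If you want the full reply as a single string instead of printing it piece by piece, the same loop can be wrapped in a small helper. The sketch below is only an illustration: collect_stream and fake_chunk are hypothetical names, not part of the Lepton or OpenAI APIs, and the stand-in chunks mimic just the attributes that the loop above reads, so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a streamed chat completion into one string.

    Hypothetical helper: it expects objects shaped like the chunks in the
    usage example (chunk.choices[0].delta.content).
    """
    parts = []
    for chunk in chunks:
        # Skip chunks that carry no choices.
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        # Skip empty or missing deltas.
        if content:
            parts.append(content)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk exposing only the attributes read above."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Stand-in stream: two text pieces, one chunk without choices, one empty delta.
demo = [fake_chunk("Hel"), SimpleNamespace(choices=[]), fake_chunk(None), fake_chunk("lo")]
print(collect_stream(demo))  # prints "Hello"
```

With a real client, you would pass the streamed completion object from the usage example to collect_stream in place of the demo list.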

Model List

To switch to a different model, change base_url to the corresponding model URL below.

Model Name      Model URL
Llama2-7b       https://llama2-7b.lepton.run/api/v1
Llama2-13b      https://llama2-13b.lepton.run/api/v1
Llama2-70b      https://llama2-70b.lepton.run/api/v1
Prompt LLM      https://prompt-llm.lepton.run/api/v1
Mixtral-8x7b    https://mixtral-8x7b.lepton.run/api/v1/
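One way to make switching models less error-prone is to keep the URLs from the model list in a lookup table and derive the client settings from it. This is a minimal sketch, not an official helper: MODEL_URLS and client_settings are hypothetical names, and it assumes that the value for the model parameter matches the URL subdomain, which the usage example confirms only for llama2-7b.

```python
from urllib.parse import urlparse

# Base URLs copied from the model list.
MODEL_URLS = {
    "Llama2-7b": "https://llama2-7b.lepton.run/api/v1",
    "Llama2-13b": "https://llama2-13b.lepton.run/api/v1",
    "Llama2-70b": "https://llama2-70b.lepton.run/api/v1",
    "Prompt LLM": "https://prompt-llm.lepton.run/api/v1",
    "Mixtral-8x7b": "https://mixtral-8x7b.lepton.run/api/v1/",
}


def client_settings(model_name: str) -> dict:
    """Return the base_url and model values for a model in the list.

    Raises KeyError if the name is not in MODEL_URLS.
    """
    base_url = MODEL_URLS[model_name]
    # Assumption: the model id equals the first label of the hostname,
    # as it does for llama2-7b in the usage example.
    model_id = urlparse(base_url).hostname.split(".")[0]
    return {"base_url": base_url, "model": model_id}


settings = client_settings("Llama2-13b")
print(settings["base_url"])  # prints "https://llama2-13b.lepton.run/api/v1"
print(settings["model"])     # prints "llama2-13b"
```

The returned base_url can be passed to openai.OpenAI and the model value to client.chat.completions.create, as in the usage example. If a request is rejected, check the model id first, since the subdomain rule here is a guess.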