Quickstart

Let's start by setting up Lepton on your local machine and running a classic AI model: GPT-2. For this example, we will use model parameters hosted on Hugging Face and run them with Lepton's built-in photon.

After this quickstart, you will have Lepton fully running in your local dev environment, and you'll be ready to deploy your own models to the cloud.

1 Installation and setup

Let's install the Python package from PyPI. Assuming you are using pip, run the following:

pip install -U leptonai

The main command-line tool is lep. You can check that it is installed by running:

lep --help

It'll show you a list of commands that you can run with lep, but let's ignore these for now and move on to the next step.

2 Build and run locally

Lepton uses the concept of a "photon" to bundle the code that runs an AI model, its dependencies, and other miscellaneous contents. Loosely speaking, you can view it as a Docker container, but it's much more lightweight and tailored for AI. We'll get into the details later. For now, let's create a photon for GPT-2:

lep photon create --name mygpt2 --model hf:gpt2

This creates a photon named mygpt2, which is readily runnable locally. Let's run it:

lep photon run --name mygpt2 --local

Congrats! You now have GPT-2 running on your local machine. Once the model is running, there are multiple ways to try it out, either via cURL or directly via Lepton's Python client:

# Via cURL
curl -X 'POST' \
  'http://0.0.0.0:8080/run' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "inputs": "Once upon a time"
}'
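The same call can also be made through Lepton's Python client. The sketch below shows one way it might look, assuming the photon is still serving on local port 8080 and the leptonai package from step 1 is installed (Client and local come from leptonai.client):

```python
def local_gpt2_client():
    # Imported lazily so this sketch reads standalone;
    # requires `pip install leptonai`.
    from leptonai.client import Client, local

    # Connect to the photon serving on http://0.0.0.0:8080.
    return Client(local(port=8080))


if __name__ == "__main__":
    c = local_gpt2_client()
    # The photon's /run path is exposed as a method on the client.
    print(c.run(inputs="Once upon a time"))
```

The client inspects the photon's API paths at connection time, which is why the /run endpoint appears as a `run` method.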

The running photon also exposes web pages for more detailed inspection. In particular, you can visit http://0.0.0.0:8080/docs to browse the API docs and make a call via the web UI.

3 Deploy in the cloud

Let's now run it remotely, so that others can use your AI model without you needing to keep your laptop on and connected to wifi. To do this, let's first log in to the Lepton cloud. If you haven't registered an account yet, sign up first. Then, log in with:

lep login

This will open a browser window and ask you to log in. Once you do, you'll be redirected to a page showing the credentials to copy and paste back into the terminal. After that, you'll be logged in.

Once we are successfully logged in, let's push the locally built photon, mygpt2, to your workspace:

lep photon push --name mygpt2

Once the photon is pushed, we can use the lep photon run command to create a deployment, which is a running instance of a photon on the cloud:

lep photon run --name mygpt2 --deployment-name mygpt2

Note that we are not using the --local flag this time. When you are logged in, omitting the flag tells Lepton to run the photon on the cloud instead of locally. For simplicity, we named the deployment the same as the photon, but you can name it anything you want.

4 Manage deployments

Once the deployment is created, there are two ways we can check the status of the deployment: via the web UI, or via the command line.

Let's first go through the command-line approach. The related commands live under the lep deployment namespace, the most relevant being list and status: list shows the currently running deployments in the workspace, and status gives more detailed status for a specific deployment. Let's try them out:

lep deployment list
lep deployment status --name mygpt2

Now let's check out the web UI: log in to the dashboard to view the deployments.

We can click the deployment name to see the details. This gives a more feature-rich interface to inspect the deployment than the CLI offers; for example, we can try the model out directly under the Demo tab.

The API section gives example code for running inference remotely. In our case, the Python code is shown and readily usable.

You can also use the Python client to access the deployment. For more details, check out the documentation on the Python client.
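As a sketch, accessing the cloud deployment through the Python client might look like the following; myworkspace is a placeholder for your actual workspace id, and the API token is assumed to be stored in a LEPTON_API_TOKEN environment variable (both are illustrative, not fixed names):

```python
import os


def gpt2_cloud_client(workspace, token):
    # Build a client bound to the mygpt2 deployment in the given workspace;
    # imported lazily so this sketch reads standalone.
    from leptonai.client import Client
    return Client(workspace, "mygpt2", token=token)


if __name__ == "__main__":
    # Placeholder workspace id and token; find yours in the dashboard.
    c = gpt2_cloud_client("myworkspace", os.environ.get("LEPTON_API_TOKEN"))
    print(c.run(inputs="Once upon a time"))
```

Note that the call itself is the same as in the local case; only the connection target changes from the local port to your workspace and deployment.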

5 Clean up

Voila! That's how to run your first model on Lepton. Let's clean up the deployment and photon to complete the cycle. To do this, we can either click the delete buttons in the web UI, or use the lep deployment remove and lep photon remove commands:

lep deployment remove --name mygpt2
lep photon remove --name mygpt2

Congrats! What's next?

You've run your first GPT example! This quickstart showed how to run an existing, popular model, and Lepton is designed to run any model, especially your own. To learn more, including more advanced topics, check out the rest of the documentation.

We wish you the best of luck in your AI journey!