File Storage

Each photon deployment comes with ephemeral storage, but in many cases you need more persistent and shareable storage. Use cases include:

  • Keeping model checkpoints and other related data cached.
  • Caching and sharing stateful data between replicas or between runs.
  • Storing assets that do not belong to the models themselves but are needed by applications.

Lepton provides persistent file storage for each workspace. This storage can be mounted as a POSIX filesystem within a deployment, and you can use the dashboard to manage the files and folders it contains.

Upload files

To upload files, you can use the web UI, the CLI, or upload directly from cloud storage. Here are the details for each method.

Upload files with Web UI

You can upload files to the storage via the web UI. On the File System page, click Upload File and choose From local; this opens a file picker for you to select the file you want to upload.

Upload files with CLI

$ lep storage upload /local/path/to/a.txt /a.txt

If you are uploading a folder or large files (over 1 GB), you need to use the --rsync flag. rsync is only available for Standard and Enterprise workspace plans. Here is an example:

# upload a folder
$ lep storage upload -r /local/path/to/folder /folder --rsync
# upload a large file, such as a model checkpoint, add --progress to show the progress
$ lep storage upload /local/path/to/a.safetensors /a.safetensors --rsync --progress

For more details, check out the CLI documentation on storage.

Upload files from Cloud

You can also upload files from cloud storage services such as AWS S3 and Cloudflare R2. On the File System page, click Upload File and choose From Cloud; this opens a dialog for you to select the cloud storage service and fill in the required information.

For AWS S3, you need to provide the following information:

  • Bucket Name : The name of the bucket you want to upload from.
  • Access Key ID : The access key of the AWS account.
  • Secret Access Key : The secret access key of the AWS account.
  • Destination Path : The path in the file system you want to upload to.

For Cloudflare R2, you need to provide the following information:

  • Endpoint URL : The S3 API URL of the Cloudflare R2 bucket. It can be found on the bucket's settings page under Bucket Details. Do not include the bucket name in the URL; it should look like https://xxxxxxxxx.r2.cloudflarestorage.com.
  • Bucket Name : The name of the bucket you want to upload from.
  • Access Key ID : The access key of the Cloudflare R2 API token. You can manage and create R2 tokens by clicking the Manage R2 API Tokens button on the bucket's settings page.
  • Secret Access Key : The secret access key of the Cloudflare R2 API Token.
  • Destination Path : The path in the file system you want to upload to.
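Since the dialog rejects endpoint URLs that contain the bucket name, it can help to sanity-check the URL before submitting. The helper below is a hypothetical sketch, not part of the Lepton SDK; it only checks the shape described above (https scheme, an r2.cloudflarestorage.com host, and no bucket segment in the path):

```python
from urllib.parse import urlparse

def is_bare_r2_endpoint(url: str) -> bool:
    """Return True if the URL looks like a bare R2 S3 API endpoint:
    https scheme, a *.r2.cloudflarestorage.com host, and an empty path
    (i.e. no bucket name appended)."""
    parts = urlparse(url)
    return (
        parts.scheme == "https"
        and parts.netloc.endswith(".r2.cloudflarestorage.com")
        and parts.path.strip("/") == ""
    )
```

For example, `https://abc123.r2.cloudflarestorage.com` passes the check, while `https://abc123.r2.cloudflarestorage.com/mybucket` does not.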

Mounting storage to a deployment

When you run a photon via the CLI, you can use the --mount flag when executing photon run, with the format STORAGE_PATH:MOUNT_PATH.

  • STORAGE_PATH is the path of the storage you want to mount, starting from /.
  • MOUNT_PATH is the path that will show up in the deployment.

For example, to launch a photon named test with the whole storage mounted at /mnt/leptonstore, you could do:

lep photon run -n test --mount /:/mnt/leptonstore

Inside the deployment, you can then access the storage via /mnt/leptonstore as a standard POSIX filesystem.
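Because the mount behaves like an ordinary POSIX filesystem, photon code can use plain file I/O against it. The sketch below illustrates a simple JSON cache under the mount; the `/mnt/leptonstore/cache` path and function names are illustrative, not part of the Lepton SDK:

```python
import json
import os

def cache_result(key: str, value: dict, root: str = "/mnt/leptonstore/cache") -> str:
    """Write a JSON blob under the mounted storage using plain POSIX I/O."""
    os.makedirs(root, exist_ok=True)
    path = os.path.join(root, f"{key}.json")
    with open(path, "w") as f:
        json.dump(value, f)
    return path

def load_cached(key: str, root: str = "/mnt/leptonstore/cache") -> dict:
    """Read a previously cached JSON blob back from the mounted storage."""
    with open(os.path.join(root, f"{key}.json")) as f:
        return json.load(f)
```

Anything written this way is visible to every replica that mounts the same storage path, and survives deployment restarts.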

You can also use the CLI to manage the storage. For more details, check out the CLI documentation on storage.

While it is possible to mount arbitrary existing storage paths to arbitrary mount paths, we recommend mounting the storage to a path under /mnt to avoid conflicts with other files and folders.

Mounting storage via Web UI

If you are creating a deployment through the web UI, you can choose to mount the storage to a specified path when creating the deployment.

You will then be able to access the mounted storage inside the deployment. For example, if you mount the folder Yuze at the path /mnt, its contents will be available there once the deployment is running.

To verify, open the web terminal from the deployment details page and use ls /mnt to list the files and subfolders.

Manage files via Web UI

Lepton's dashboard allows you to inspect and manage files. Simply click the Storage tab after logging in to the dashboard. From there, you can see the files currently stored and manage them.

Example: a safe counter

You might recall that the counter example in the walkthrough section is not safe, because the counter is stored in memory and is not persistent. With persistent storage, we can use a simple file to store the counter and make it safe across replicas and restarts:

import errno
import fcntl
import os
import time

from fastapi import HTTPException

from leptonai.photon import Photon


class SafeCounter(Photon):
    PATH = "/mnt/leptonstore/safe_counter.txt"

    def init(self):
        # checks if the folder containing the file exists
        if not os.path.exists(os.path.dirname(self.PATH)):
            raise RuntimeError(
                "SafeCounter requires a Lepton storage to be attached to the deployment"
                "at /mnt/leptonstore."
            )
        # checks if the file exists
        if not os.path.exists(self.PATH):
            # if not, create the file and write 0 to it. Strictly speaking, this
            # may have a race condition, but it is unlikely to happen in practice
            # and the worst that can happen is that the file is created twice,
            # unless a request comes in right in between two deployments creating
            # the file.
            with open(self.PATH, "w") as file:
                file.write("0")

    @Photon.handler("add")
    def add(self, x: int) -> int:
        # Open the file for reading and writing
        with open(self.PATH, "r+") as file:
            # Attempt to acquire a non-blocking exclusive lock on the file,
            # retrying a few times before giving up.
            for _ in range(10):
                try:
                    fcntl.flock(file, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    break
                except IOError as e:
                    # EAGAIN means another replica holds the lock; sleep for
                    # a short interval and try again. Re-raise anything else.
                    if e.errno != errno.EAGAIN:
                        raise
                    time.sleep(0.1)
            else:
                # The lock was never acquired; do not touch the file.
                raise HTTPException(
                    status_code=500,
                    detail=(
                        "Internal server error: failed to acquire lock on file"
                        " after repeated attempts."
                    ),
                )

            # Read the current value from the file
            current_value = int(file.read())
            # Increment the value
            new_value = current_value + x
            file.seek(0)
            file.write(str(new_value))
            file.truncate()
            fcntl.flock(file, fcntl.LOCK_UN)
            return new_value

    @Photon.handler("sub")
    def sub(self, x: int) -> int:
        return self.add(-x)

The above counter can be used to safely increment and decrement a value across replicas and restarts. For example, to run the above photon with 2 replicas:

lep photon create -n safe-counter -m safe_counter.py
lep photon push -n safe-counter
lep photon run -n safe-counter --replicas 2 --mount /:/mnt/leptonstore
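You can also sanity-check the flock-based logic locally before deploying, using threads as stand-ins for replicas and a temporary file in place of the mounted path. This is a hypothetical test harness, not part of the example's source; it relies on flock conflicting between separately opened file descriptors:

```python
import errno
import fcntl
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

# Local stand-in for /mnt/leptonstore/safe_counter.txt
COUNTER = os.path.join(tempfile.mkdtemp(), "safe_counter.txt")

def locked_add(x: int) -> int:
    """Increment the file-backed counter under an exclusive flock."""
    with open(COUNTER, "r+") as f:
        # Retry a non-blocking exclusive lock, as the photon does
        for _ in range(1000):
            try:
                fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except OSError as e:
                if e.errno != errno.EAGAIN:
                    raise
                time.sleep(0.001)
        else:
            raise RuntimeError("could not acquire lock")
        value = int(f.read()) + x
        f.seek(0)
        f.write(str(value))
        f.truncate()
        fcntl.flock(f, fcntl.LOCK_UN)
    return value

with open(COUNTER, "w") as f:
    f.write("0")
# 200 concurrent increments across 8 workers
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(locked_add, [1] * 200))
print(open(COUNTER).read())  # "200" — no increments are lost
```

Without the lock, concurrent read-modify-write cycles can interleave and silently drop increments; the exclusive lock serializes them.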

The full source code can be found in the Lepton SDK's leptonai/examples/counter folder.

Of course, there are better ways to implement a safe counter (such as a fully managed key-value store), but the above example demonstrates how to use the storage to store stateful data across replicas and restarts.