Audio Processing Models

For audio processing, we provide AI application developers with WhisperX, a speech recognition model built on Whisper that transcribes audio and produces word-level timestamps. The hosted endpoint accepts an audio input, such as the URL of an audio file, and returns the transcribed text.

WhisperX

Usage

from leptonai.client import Client

# Replace YOUR_LEPTON_API_TOKEN with your own API token.
c = Client("https://latest-whisperx.cloud.lepton.ai", token="YOUR_LEPTON_API_TOKEN")

# Pass the URL of the audio file to transcribe.
result = c.run(
    input="https://datasets-server.huggingface.co/cached-assets/mozilla-foundation/common_voice_11_0/--/en/test/311/audio/audio.mp3"
)

# Print the text of the first item in the result.
print(result[0]["text"])
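
The usage example reads only the first item of the result. If the response is a list of segments, each a dictionary with a "text" field and optionally "start" and "end" times in seconds, the full transcript could be assembled with a small helper. This response shape is an assumption based on the example above, not a documented schema, and join_segments is a hypothetical name introduced only for illustration.

```python
from typing import Any


def join_segments(segments: list[dict[str, Any]]) -> str:
    """Join the "text" field of each segment into a multi-line transcript.

    Assumes each segment is a dict with a "text" key and, optionally,
    numeric "start" and "end" keys (seconds). This shape is an assumption.
    """
    lines = []
    for seg in segments:
        text = str(seg.get("text", "")).strip()
        if not text:
            continue  # skip empty or whitespace-only segments
        start, end = seg.get("start"), seg.get("end")
        if isinstance(start, (int, float)) and isinstance(end, (int, float)):
            # Prefix with the time range when both timestamps are present.
            lines.append(f"[{start:.2f}-{end:.2f}] {text}")
        else:
            lines.append(text)
    return "\n".join(lines)


# Hypothetical sample shaped like the assumed response (not real API output).
sample = [
    {"start": 0.0, "end": 1.5, "text": " Hello there."},
    {"text": "No timestamps here."},
    {"text": "   "},
]
print(join_segments(sample))
```

With a real response, the helper would be called as join_segments(result). Check the actual structure of result first, since the field names may differ.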