Getting Started with the Fastest STT API

Convert audio to text with enterprise-grade accuracy using our Speech-to-Text API. Features include speaker diarization, real-time streaming, and support for Python and JavaScript.

🌍

90+ Languages

Detect the language automatically or specify the source language.

⚡

Batch & Real-time

Cipher models for batch processing; Lucid models for live streaming and batch.

🎯

High Accuracy

Enterprise-grade accuracy with word-level timestamps.

🔊

Speaker Diarization

Identify and label different speakers automatically.

🔑

Keywords Boost

Improve recognition of domain-specific terms.

💬

Smart Formatting

Auto-format dates, numbers, and currency.

Choose Your Use Case

Pre-Recorded Audio

Upload audio files for batch transcription. Perfect for podcasts, meetings, interviews, and media content.

Real-time Streaming

Live transcription via WebSocket. Ideal for voice assistants, live captions, and call centers.
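
Streaming clients usually send audio in small, fixed-duration frames rather than whole files. The sketch below covers only that client-side chunking step. It assumes 16 kHz, 16-bit mono PCM and 100 ms frames, none of which are specified on this page, and `iter_frames` is a hypothetical helper name. The WebSocket endpoint and message format are left out because this page does not document them.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 16_000   # assumption: 16 kHz audio
BYTES_PER_SAMPLE = 2      # assumption: 16-bit mono PCM
FRAME_MS = 100            # assumption: 100 ms per frame


def iter_frames(pcm: bytes, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield consecutive fixed-duration frames; the last one may be shorter."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    frame_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


if __name__ == "__main__":
    one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)  # 1 s of silence
    frames = list(iter_frames(one_second))
    print(len(frames), len(frames[0]))
```

Each frame would then be sent over the WebSocket connection as it becomes available, in whatever message format the streaming reference specifies.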

Quick Example

Get started with a simple transcription request:

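import requests

url = "https://yourvoic.com/api/v1/stt/transcribe"
headers = {"X-API-Key": "your_api_key"}  # replace with your API key

# Upload the audio file as multipart form data and select the model.
with open("audio.mp3", "rb") as f:
    response = requests.post(
        url,
        headers=headers,
        files={"file": f},
        data={"model": "cipher-fast"},
        timeout=60,
    )

response.raise_for_status()  # fail early on HTTP errors (4xx/5xx)
result = response.json()
print(result["text"])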
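
The features listed earlier (language selection, speaker diarization, keyword boosting, smart formatting) would presumably be requested through extra form fields next to `model`. This page does not name those fields, so `language`, `diarize`, `keywords`, and `smart_format` below are hypothetical placeholders, as is the `build_form_fields` helper. Check the API reference for the real names before relying on them.

```python
from typing import Dict, Optional, Sequence


def build_form_fields(
    model: str = "cipher-fast",
    language: Optional[str] = None,            # hypothetical field; None = auto-detect
    diarize: bool = False,                     # hypothetical field
    keywords: Optional[Sequence[str]] = None,  # hypothetical field
    smart_format: bool = False,                # hypothetical field
) -> Dict[str, str]:
    """Build the form-data dict for a transcription request.

    Only `model` appears in the quick example; every other
    field name here is an assumption made for illustration.
    """
    fields: Dict[str, str] = {"model": model}
    if language:
        fields["language"] = language
    if diarize:
        fields["diarize"] = "true"
    if keywords:
        fields["keywords"] = ",".join(keywords)
    if smart_format:
        fields["smart_format"] = "true"
    return fields


if __name__ == "__main__":
    print(build_form_fields("cipher-max", language="en", keywords=["Cipher", "Lucid"]))
```

The resulting dict would be passed as `data=` in place of `{"model": "cipher-fast"}` in the request above.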

Available Models

Model	Type	Best For	Credits/sec
`cipher-fast`	Batch	Quick turnaround, general content	2
`cipher-max`	Batch	Complex audio, accents, technical content	2
`lucid-mono`	Both	Single-language content	3
`lucid-multi`	Both	Mixed-language content	3
`lucid-agent`	Both	Conversational, voice agents	3
`lucid-lite`	Both	High volume, cost-effective	3
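
To compare models on cost, the Credits/sec column can be turned into a rough estimate. The sketch below copies the rates from the table and assumes credits are charged per second of audio, with partial seconds rounded up. The rounding rule and the `estimate_credits` helper are assumptions, not documented behavior.

```python
import math

# Rates copied from the table above (credits per second).
CREDITS_PER_SEC = {
    "cipher-fast": 2,
    "cipher-max": 2,
    "lucid-mono": 3,
    "lucid-multi": 3,
    "lucid-agent": 3,
    "lucid-lite": 3,
}


def estimate_credits(model: str, duration_sec: float) -> int:
    """Estimate credits for one file, rounding partial seconds up (assumed)."""
    if model not in CREDITS_PER_SEC:
        raise ValueError(f"unknown model: {model!r}")
    if duration_sec < 0:
        raise ValueError("duration_sec must be non-negative")
    return math.ceil(duration_sec) * CREDITS_PER_SEC[model]


if __name__ == "__main__":
    # Compare a batch model and a Lucid model on a 30-minute recording.
    for model in ("cipher-fast", "lucid-mono"):
        print(model, estimate_credits(model, 30 * 60))
```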
