# Models Overview
Choose the right AI model for your text-to-speech needs.
## Available Models
YourVoic offers 5 TTS models, each optimized for different use cases:
| Model | Latency | Quality | Languages | Expressions | Control Notes | Credit Cost |
|---|---|---|---|---|---|---|
| aura-lite | Fast | Good | 90+ | Yes | Streaming speed supported; streaming pitch unavailable. Generate controls are conservative in the playground. | 1 credit/char |
| aura-prime | Medium | Excellent | 90+ | Yes | Streaming speed supported; streaming pitch unavailable. | 1 credit/char |
| aura-max | Slower | Premium | 90+ | Yes | Best natural quality; streaming pitch unavailable. | 1 credit/char |
| rapid-flash | Fastest | Good | 18 | No | Supports speed and pitch in standard generation and pseudo-streaming. | 1 credit/char |
| rapid-max | Fast | Very Good | 41 | No | Generate supports speed and pitch; streaming supports speed only. | 2 credits/char |
If your workflow depends on explicit speed or pitch controls, Rapid models are the safest choice. Aura models focus first on natural speech quality and accent guidance, and their streaming path does not support pitch.
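For illustration, a Rapid model request with explicit controls might look like the sketch below. The `speed` and `pitch` field names and value ranges are assumptions for this sketch, not confirmed by this page; check the API reference for the exact parameters.

```python
# Hypothetical payload: explicit speed/pitch controls with a Rapid model.
# "speed" and "pitch" are assumed field names shown for illustration only.
payload = {
    "text": "Slightly slower, slightly lower.",
    "voice": "Peter",
    "model": "rapid-max",
    "speed": 0.9,   # assumed: 1.0 = normal speaking rate
    "pitch": -2,    # assumed offset; omit when streaming (speed only)
}

# The request itself would follow the same shape as the examples below:
# import requests
# response = requests.post(
#     "https://yourvoic.com/api/v1/tts/generate",
#     headers={"X-API-Key": "your_api_key"},
#     json=payload,
# )
```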
## Model Details
### Aura Lite
Our fastest Aura model, optimized for real-time applications.
- Best for: Chatbots, voice assistants, real-time apps
- Latency: ~200ms first byte
- Features: Expression tags, all languages, all voices
```python
import requests

response = requests.post(
    "https://yourvoic.com/api/v1/tts/generate",
    headers={"X-API-Key": "your_api_key"},
    json={
        "text": "Quick response for real-time chat.",
        "voice": "Peter",
        "model": "aura-lite"
    }
)
```
### Aura Prime
Balanced model with excellent quality and reasonable speed.
- Best for: General purpose, e-learning, presentations
- Latency: ~400ms first byte
- Features: Expression tags, enhanced prosody
```python
import requests

response = requests.post(
    "https://yourvoic.com/api/v1/tts/generate",
    headers={"X-API-Key": "your_api_key"},
    json={
        "text": "Professional quality for your content.",
        "voice": "Peter",
        "model": "aura-prime"
    }
)
```
### Aura Max
Premium quality model for professional content.
- Best for: Audiobooks, professional voiceovers, media production
- Latency: ~800ms first byte
- Features: Highest quality, natural prosody, expression tags
```python
import requests

response = requests.post(
    "https://yourvoic.com/api/v1/tts/generate",
    headers={"X-API-Key": "your_api_key"},
    json={
        "text": "Premium audiobook quality narration.",
        "voice": "Peter",
        "model": "aura-max"
    }
)
```
### Rapid Flash
Ultra-fast model for lowest latency applications.
- Best for: IVR systems, notifications, high-volume processing
- Latency: ~100ms first byte
- Limitation: Only 18 languages supported
Supported languages: Danish, German, English (Australia), English (UK), English (India), English (US), Spanish (Spain), Spanish (US), Filipino, French (Canada), French (France), Hindi, Italian, Japanese, Korean, Portuguese (Brazil), Thai, Vietnamese
```python
import requests

response = requests.post(
    "https://yourvoic.com/api/v1/tts/generate",
    headers={"X-API-Key": "your_api_key"},
    json={
        "text": "Ultra-fast response.",
        "voice": "Peter",
        "language": "en-US",
        "model": "rapid-flash"
    }
)
```
### Rapid Max
Fast model with better quality than Rapid Flash.
- Best for: Balance of speed and quality
- Latency: ~150ms first byte
- Features: 41 languages, good quality
```python
import requests

response = requests.post(
    "https://yourvoic.com/api/v1/tts/generate",
    headers={"X-API-Key": "your_api_key"},
    json={
        "text": "Fast with better quality.",
        "voice": "Peter",
        "model": "rapid-max"
    }
)
```
## Choosing the Right Model
| Use Case | Recommended Model | Why |
|---|---|---|
| Voice Assistant | aura-lite | Fast response, supports expressions |
| Audiobooks | aura-max | Premium quality, natural prosody |
| E-learning | aura-prime | Clear speech, good quality |
| IVR/Phone Systems | rapid-flash | Lowest latency |
| Notifications | rapid-flash | Fast, cost-effective |
| Video Narration | aura-max | Professional quality |
| Chatbots | aura-lite | Real-time with expressions |
| Batch Processing | rapid-max | Good balance for volume |
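If you route requests programmatically, the recommendations above can be folded into a small lookup helper. This is a sketch of our own; the use-case keys and the `pick_model` function are not part of the API, and the fallback default is an arbitrary choice.

```python
# Map common use cases to the recommended model from the table above.
RECOMMENDED_MODEL = {
    "voice_assistant": "aura-lite",
    "audiobooks": "aura-max",
    "e_learning": "aura-prime",
    "ivr": "rapid-flash",
    "notifications": "rapid-flash",
    "video_narration": "aura-max",
    "chatbots": "aura-lite",
    "batch_processing": "rapid-max",
}

def pick_model(use_case: str, default: str = "aura-prime") -> str:
    """Return the recommended model for a use case, falling back to the
    general-purpose default when the use case is not listed."""
    return RECOMMENDED_MODEL.get(use_case, default)
```

The value passed as `model` in the request body is simply the string from the table, so the helper slots directly into the `json` payload of the examples above.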