Request Parameters

Complete reference of all parameters for the TTS generate endpoint.

Required Parameters

Parameter  Type    Description
text       string  The text to convert to speech. Maximum 5,000 characters.

Optional Parameters

Parameter  Type    Default       Description
voice      string  "Peter"       Voice name. Use the /voices endpoint to list available voices.
language   string  "en-US"       Language code (e.g., "en-US", "hi", "ja-JP").
model      string  "aura-prime"  TTS model: aura-lite, aura-prime, aura-max, rapid-flash, or rapid-max.
speed      float   1.0           Playback speed. Range: 0.5 to 2.0.
pitch      float   1.0           Voice pitch. Range: 0.5 to 2.0.
format     string  "mp3"         Audio format: mp3 or wav.
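
The defaults and ranges in the table can be applied on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_payload` is a hypothetical helper, and it assumes the 0.5 to 2.0 ranges are inclusive and that unknown parameter names should be rejected.

```python
# Defaults, model names, and formats copied from the Optional Parameters table.
DEFAULTS = {
    "voice": "Peter",
    "language": "en-US",
    "model": "aura-prime",
    "speed": 1.0,
    "pitch": 1.0,
    "format": "mp3",
}
MODELS = {"aura-lite", "aura-prime", "aura-max", "rapid-flash", "rapid-max"}
FORMATS = {"mp3", "wav"}


def build_payload(text, **options):
    """Hypothetical helper: merge options over the documented defaults and validate them."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload = {"text": text, **DEFAULTS, **options}

    if payload["model"] not in MODELS:
        raise ValueError(f"unsupported model: {payload['model']!r}")
    if payload["format"] not in FORMATS:
        raise ValueError(f"unsupported format: {payload['format']!r}")
    # Assumption: both ends of the documented range are allowed.
    for name in ("speed", "pitch"):
        if not 0.5 <= payload[name] <= 2.0:
            raise ValueError(f"{name} must be between 0.5 and 2.0")
    return payload


payload = build_payload("Hello, world!", speed=0.9, format="wav")
```

Because the endpoint fills in defaults itself, sending only the overridden fields would work equally well; spelling them out just makes the final request explicit.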

Parameter Details

text

  • Maximum 5,000 characters per request
  • Supports expression tags: <laugh>, <sigh>, etc. (Aura models only)
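
Both constraints can be checked before a request is made. The sketch below is illustrative only: `check_text` is a hypothetical helper, the tag pattern assumes every expression tag is a lowercase word in angle brackets like the two shown above, and it assumes the 5,000-character limit counts the whole string, tags included. The service may count characters differently.

```python
import re

MAX_CHARS = 5000
# Assumption: expression tags look like <laugh> or <sigh> (lowercase letters only).
TAG_PATTERN = re.compile(r"<[a-z]+>")


def check_text(text, model="aura-prime"):
    """Return the expression tags found in text, or raise ValueError if a limit is broken."""
    if len(text) > MAX_CHARS:
        raise ValueError(f"text is {len(text)} characters; the maximum is {MAX_CHARS}")

    tags = TAG_PATTERN.findall(text)
    # Expression tags are documented as supported on Aura models only.
    if tags and not model.startswith("aura-"):
        raise ValueError(f"model {model!r} does not support expression tags: {tags}")
    return tags


tags = check_text("That's hilarious! <laugh> I can't believe it!")
```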

voice

Voice names are localized by language. The same underlying voice has a different name in each language:

  • English: Peter, Emma, James, Olivia...
  • Hindi: Rahul, Deepika, Vikram, Priya...
  • Japanese: Takeshi, Yuki, Kenji, Sakura...

model

Model        Speed    Quality    Expressions
aura-lite    Fast     Good       Yes
aura-prime   Medium   Excellent  Yes
aura-max     Slow     Premium    Yes
rapid-flash  Fastest  Good       No
rapid-max    Fast     Very Good  No

format

Format  MIME Type   Notes
mp3     audio/mpeg  Default, good compression
wav     audio/wav   Uncompressed, best for editing
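
The format-to-MIME mapping is handy when saving or sanity-checking the returned audio. This is a small sketch under one assumption: that the response carries the listed MIME type in its Content-Type header, which this reference does not state explicitly. Both function names are hypothetical.

```python
# Mapping copied from the format table.
MIME_TYPES = {"mp3": "audio/mpeg", "wav": "audio/wav"}


def output_filename(stem, fmt="mp3"):
    """Build a file name whose extension matches the requested format."""
    if fmt not in MIME_TYPES:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{stem}.{fmt}"


def matches_format(content_type, fmt):
    """Compare a Content-Type header value with the MIME type expected for fmt."""
    # Drop any parameters after ';' and ignore case before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == MIME_TYPES[fmt]
```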

Example Requests

Minimal Request

{
    "text": "Hello, world!"
}

Full Request

{
    "text": "Premium quality audio for professional content.",
    "voice": "Peter",
    "language": "en-US",
    "model": "aura-max",
    "speed": 0.9,
    "pitch": 1.0,
    "format": "wav"
}

With Expressions

{
    "text": "That's hilarious! <laugh> I can't believe it!",
    "voice": "Peter",
    "language": "en-US",
    "model": "aura-prime"
}
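
Sending From Python

Any of the JSON bodies above can be wrapped in an HTTP request with the standard library. This sketch rests on assumptions the reference does not confirm: the URL is a placeholder, the endpoint is assumed to accept a POST with a JSON body, and authentication is omitted. The request object is built but not sent.

```python
import json
import urllib.request

# Hypothetical URL: this reference does not give the endpoint's address.
GENERATE_URL = "https://api.example.com/tts/generate"


def make_request(payload, url=GENERATE_URL):
    """Wrap a parameter dictionary in a JSON POST request (assumed request shape)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = make_request({
    "text": "That's hilarious! <laugh> I can't believe it!",
    "voice": "Peter",
    "language": "en-US",
    "model": "aura-prime",
})
# To send it, add whatever authentication the service requires, then call
# urllib.request.urlopen(request) and read the audio bytes from the response.
```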