Real-time Streaming
Stream audio as it's generated for low-latency applications.
Overview
Streaming allows you to receive audio chunks as they're generated, reducing time-to-first-byte and enabling real-time playback before the full audio is ready.
Streaming supports the speed control on supported model paths, but pitch is not available on Aura or Rapid Max streaming. Rapid Flash uses progressive pseudo-streaming and can apply both speed and pitch.
Streaming is ideal for voice assistants, chatbots, and interactive applications where users expect immediate audio feedback.
Aura models and rapid-max stream through the provider's streaming path. rapid-flash is served in the streaming experience through low-latency pseudo-streaming rather than the upstream provider streaming API.
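As a minimal sketch of the rules above, the helper below maps each model to its streaming path and the prosody controls available there (the table and function are illustrative, not part of the API):

```python
# Illustrative capability table derived from the notes above
# (not an actual API object).
STREAMING_PATHS = {
    "aura-lite":   {"path": "provider", "speed": True, "pitch": False},
    "rapid-max":   {"path": "provider", "speed": True, "pitch": False},
    "rapid-flash": {"path": "pseudo",   "speed": True, "pitch": True},
}

def allowed_params(model):
    """Return the prosody controls usable on a model's streaming path."""
    caps = STREAMING_PATHS[model]
    return [p for p in ("speed", "pitch") if caps[p]]
```

For example, `allowed_params("aura-lite")` returns only `["speed"]`, matching the rule that pitch is unavailable on Aura streaming.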
HTTP Streaming Endpoint
POST /v1/tts/stream
Python Example
```python
import requests
import pyaudio

# Initialize audio player (16-bit mono PCM at 24 kHz)
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,
                channels=1,
                rate=24000,
                output=True)

# Request streaming audio
response = requests.post(
    "https://yourvoic.com/api/v1/tts/stream",
    headers={"X-API-Key": "your_api_key"},
    json={
        "text": "This audio is being streamed in real-time.",
        "voice": "Peter",
        "language": "en-US",
        "model": "aura-lite"  # Fast model for streaming
    },
    stream=True
)
response.raise_for_status()

# Play audio as it arrives
for chunk in response.iter_content(chunk_size=4096):
    if chunk:
        stream.write(chunk)

stream.stop_stream()
stream.close()
p.terminate()
```
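Playback can stutter if the network briefly stalls between chunks. One mitigation, sketched below, is to hold back the first few chunks before starting output; the generator name and the prebuffer size of 3 are assumptions to tune for your latency budget, not part of the API:

```python
import itertools

# Hedged sketch of chunk prebuffering: wait until a few chunks have
# arrived, then pass everything through as it streams in. Delays start
# of playback slightly in exchange for fewer audible gaps.
def buffered_chunks(chunk_iter, prebuffer=3):
    it = iter(chunk_iter)
    head = list(itertools.islice(it, prebuffer))  # wait for N chunks
    yield from head   # release the warm-up buffer
    yield from it     # then stream the rest as it arrives

# Usage with the example above:
# for chunk in buffered_chunks(response.iter_content(chunk_size=4096)):
#     stream.write(chunk)
```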
JavaScript (Browser)
```javascript
async function streamAudio(text) {
  const response = await fetch('https://yourvoic.com/api/v1/tts/stream', {
    method: 'POST',
    headers: {
      'X-API-Key': 'your_api_key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      text: text,
      voice: 'Peter',
      language: 'en-US'
    })
  });

  // Read the streamed chunks
  const reader = response.body.getReader();
  const audioContext = new AudioContext();
  const chunks = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
  }

  // Combine chunks and decode
  const blob = new Blob(chunks, { type: 'audio/mpeg' });
  const arrayBuffer = await blob.arrayBuffer();
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);

  // Play the audio
  const source = audioContext.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioContext.destination);
  source.start(0);
}

streamAudio('Hello from streaming!');
```

Note that decodeAudioData needs the complete file, so this example collects all chunks before playback begins; it benefits from streaming delivery but does not start playing mid-stream.
Best Practices
- Use fast models - aura-lite or rapid-flash for the lowest perceived latency
- Buffer initial chunks - Buffer 2-3 chunks before playback for a smoother experience
- Handle interruptions - Allow users to stop/skip audio mid-stream
- Error recovery - Implement reconnection logic for network drops
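The error-recovery practice above can be sketched as a generic retry wrapper around the streaming request. The function name, defaults, and backoff schedule here are assumptions, not part of the API:

```python
import time

# Hedged sketch of reconnection logic: retry a callable (e.g. the
# requests.post call from the Python example) with exponential backoff.
def with_retry(fn, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on failure, back off 1s, 2s, 4s, ... then re-raise."""
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts
            sleep(base_delay * (2 ** attempt))

# Usage:
# response = with_retry(lambda: requests.post(..., stream=True))
```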
Some mobile browsers require user interaction before playing audio. Trigger audio playback from a user action (click/tap).