Keywords Boost
Improve recognition accuracy for specific words, names, or domain-specific terminology.
Overview
Keywords boost helps the model better recognize specific terms that might otherwise be misheard. This is especially useful for:
- Product names and brand names
- Company names and person names
- Technical jargon and acronyms
- Industry-specific vocabulary
How It Works
Speech-to-text models sometimes misrecognize uncommon words:
- Brand names: "YourVoic" → "Your Voice"
- Technical terms: "webhook" → "web hook"
- Acronyms: "API" → "a pie"
By providing a list of keywords, you tell the model: "These exact words might appear - listen for them!"
The model then boosts the probability of recognizing those specific terms when it hears something similar:
- You pass keywords: "YourVoic,API,webhook"
- The STT model adds these to its vocabulary with higher weight
- When audio sounds like "your voice", the model checks: "Could this be 'YourVoic'?"
- If close enough, it transcribes as "YourVoic" instead of "Your Voice"
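To make the boosting idea concrete, here is a toy sketch in Python. It is not the service's actual implementation: the candidate list, the scores, the `boost` value, and the function name `pick_transcript` are all assumptions made up for illustration. It only shows how a bonus for listed keywords can tip the choice between two similar-sounding candidates.

```python
def pick_transcript(candidates, keywords, boost=0.2):
    """Return the candidate text with the highest boosted score.

    candidates: list of (text, score) pairs; a higher score is better.
    keywords: terms to favor; matching is case-insensitive.
    boost: hypothetical bonus added once per keyword found in the text.
    """
    lowered = [k.lower() for k in keywords]

    def boosted(item):
        text, score = item
        words = text.lower().split()
        hits = sum(1 for k in lowered if k in words)
        return score + boost * hits

    return max(candidates, key=boosted)[0]


# Two hypothetical candidates for the same audio, with made-up scores
candidates = [("welcome to your voice", 0.55), ("welcome to YourVoic", 0.45)]

print(pick_transcript(candidates, []))            # welcome to your voice
print(pick_transcript(candidates, ["YourVoic"]))  # welcome to YourVoic
```

Without keywords the more common phrase wins; with "YourVoic" in the list, the bonus is enough to move the brand name ahead.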
Supported Models
| Model | Keywords Boost |
|---|---|
| lucid-mono | ✅ |
| lucid-multi | ✅ |
| lucid-agent | ✅ |
| lucid-lite | ✅ |
| cipher-fast | ❌ (use prompt instead) |
| cipher-max | ❌ (use prompt instead) |
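If your code switches between model families, a small helper can choose the right parameter for you. The sketch below is one possible approach, not part of any official client: `build_form_fields` is a hypothetical name, the model sets are copied from the table above, and the "Domain vocabulary:" prompt wording simply mirrors the Cipher example in this guide.

```python
LUCID_MODELS = {"lucid-mono", "lucid-multi", "lucid-agent", "lucid-lite"}
CIPHER_MODELS = {"cipher-fast", "cipher-max"}


def build_form_fields(model, terms):
    """Return the form fields for a transcription request (hypothetical helper)."""
    if model in LUCID_MODELS:
        # Lucid models accept a comma-separated keywords string
        return {"model": model, "keywords": ",".join(terms)}
    if model in CIPHER_MODELS:
        # Cipher models take a free-text prompt instead
        return {"model": model, "prompt": "Domain vocabulary: " + ", ".join(terms)}
    raise ValueError(f"unknown model: {model}")


print(build_form_fields("lucid-mono", ["YourVoic", "API"]))
print(build_form_fields("cipher-max", ["YourVoic", "API"]))
```

The returned dictionary can be passed as the `data` argument of `requests.post`.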
Usage
Pass keywords as a comma-separated string:
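import requests

# Open the audio file in a context manager so it is closed after the upload
with open("interview.mp3", "rb") as audio:
    response = requests.post(
        "https://yourvoic.com/api/v1/stt/lucid/transcribe",
        headers={"X-API-Key": "your_api_key"},
        files={"file": audio},
        data={
            "model": "lucid-mono",
            # Comma-separated list of terms to boost
            "keywords": "YourVoic,API,transcription,Lucid,Cipher",
        },
    )

response.raise_for_status()  # Fail early on HTTP errors
result = response.json()
print(result["text"])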
Real-time Streaming
Keywords can also be used in real-time WebSocket connections:
const params = new URLSearchParams({
  api_key: 'YOUR_API_KEY',
  model: 'lucid-agent',
  keywords: 'YourVoic,API,webhook,streaming'
});
const ws = new WebSocket(`wss://yourvoic.com:8443/api/v1/stt/realtime/stream?${params}`);
Best Practices
- Limit keywords: Keep the list to at most 10-20 keywords for best performance
- Exact spelling: Provide keywords exactly as you want them transcribed
- Include variations: Add common misspellings if relevant
- Case sensitivity: Keywords are case-insensitive
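These practices can be applied before each request with a small cleanup step. The following is a minimal sketch under stated assumptions: `prepare_keywords` is a hypothetical helper, the default cap of 20 comes from the upper end of the range above, and rejecting commas is a choice made here because the comma is the list separator.

```python
def prepare_keywords(terms, limit=20):
    """Build the comma-separated keywords string from a list of terms.

    Trims whitespace, skips empty entries, drops case-insensitive
    duplicates (keeping the first spelling), and caps the list at `limit`.
    """
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        if not term:
            continue
        if "," in term:
            # A comma inside a term would be read as two separate keywords
            raise ValueError(f"keyword must not contain a comma: {term!r}")
        key = term.lower()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(term)
    return ",".join(cleaned[:limit])


print(prepare_keywords([" YourVoic ", "API", "api", "", "webhook"]))
# YourVoic,API,webhook
```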
Context Prompts (Cipher Alternative)
For Cipher models, use the prompt parameter instead:
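import requests

# Open the audio file in a context manager so it is closed after the upload
with open("audio.mp3", "rb") as audio:
    response = requests.post(
        "https://yourvoic.com/api/v1/stt/cipher/transcribe",
        headers={"X-API-Key": "your_api_key"},
        files={"file": audio},
        data={
            "model": "cipher-max",
            # Free-text context that lists the terms you expect to hear
            "prompt": "Domain vocabulary: YourVoic, API, transcription, webhook",
        },
    )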
💡 Tip: For best results, test your keywords with sample audio to ensure they're being recognized correctly.
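One simple way to run such a test is to transcribe a sample where you know which terms were spoken, then check which keywords are absent from the returned text. The sketch below assumes you already have the transcript as a string (for example `result["text"]` from the response); `missing_keywords` is a hypothetical helper, and the case-insensitive whole-word match is an assumption chosen for this check.

```python
import re


def missing_keywords(transcript, keywords):
    """Return the keywords that do not appear as whole words in the transcript."""
    missing = []
    for keyword in keywords:
        # Match the keyword only when it is not part of a longer word
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if not re.search(pattern, transcript, flags=re.IGNORECASE):
            missing.append(keyword)
    return missing


# Made-up transcript in which the brand name was misheard
text = "Welcome to Your Voice. Our API sends a webhook."
print(missing_keywords(text, ["YourVoic", "API", "webhook"]))  # ['YourVoic']
```

A non-empty result only tells you that a keyword is absent from the text; listen to the sample to confirm the term was actually spoken before adjusting the list.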