Sarvam is a high-performance AI provider optimized for Indic (Indian) languages and real-time voice applications.
Agent Voice Response (AVR) integrates Sarvam as a full conversational stack, covering speech-to-text (ASR), large language models (LLM), and text-to-speech (TTS).
This makes Sarvam a strong choice for Indian telephony, IVR replacement, and voicebot deployments.
| Capability | Status |
|---|---|
| Speech-to-Text (ASR) | ✅ |
| Large Language Models (LLM) | ✅ |
| Text-to-Speech (TTS) | ✅ |
| Streaming / Real-time | ✅ |
| Telephony (8kHz PCM) | ✅ |
| Indic Languages | ✅ |
The Sarvam ASR integration uses Sarvam's WebSocket API to provide real-time transcription optimized for telephony audio.
POST /speech-to-text-stream
| Parameter | Value |
|---|---|
| Encoding | s16le (PCM 16-bit LE) |
| Sample Rate | 8000 Hz |
| Channels | Mono |
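The format above fixes the byte rate of the stream: 8000 samples/s × 2 bytes × 1 channel = 16,000 bytes per second. The following is a minimal sketch of slicing raw PCM into fixed-duration frames before streaming. The 20 ms frame duration and the helper names `frame_size_bytes` and `iter_frames` are assumptions made for illustration; the integration is not documented to require a particular frame size.

```python
from typing import Iterator

SAMPLE_RATE_HZ = 8000    # from the audio format table
BYTES_PER_SAMPLE = 2     # s16le = signed 16-bit little-endian
CHANNELS = 1             # mono


def frame_size_bytes(frame_ms: int = 20) -> int:
    """Bytes needed to hold frame_ms milliseconds of audio (20 ms is an assumption)."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * frame_ms // 1000


def iter_frames(pcm: bytes, frame_ms: int = 20) -> Iterator[bytes]:
    """Yield fixed-size frames, padding the final frame with silence (zero bytes)."""
    size = frame_size_bytes(frame_ms)
    for start in range(0, len(pcm), size):
        frame = pcm[start:start + size]
        if len(frame) < size:
            frame += b"\x00" * (size - len(frame))
        yield frame
```

With these numbers a 20 ms frame is 320 bytes, so one second of audio splits into 50 frames.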
SARVAM_API_KEY=sk_...
SARVAM_WEBSOCKET_URL=wss://api.sarvam.ai/speech-to-text/ws
SARVAM_SPEECH_RECOGNITION_MODEL=saarika:v2.5
SARVAM_SPEECH_RECOGNITION_LANGUAGE=en-IN
PORT=6050
The Sarvam LLM integration connects AVR to Sarvam Chat Completions, providing conversational reasoning optimized for Indic languages.
POST /prompt-stream
{
  "messages": [
    { "role": "user", "content": "Hello, how can you help me?" }
  ]
}
{
  "type": "text",
  "content": "..."
}
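As a rough sketch of how a client might handle the request and response shapes above, the code below builds the request body and joins the `content` of every `text` event into one reply. It assumes the stream arrives as one JSON object per line; that framing, and the helper names `build_request` and `collect_text`, are assumptions rather than documented behavior.

```python
import json
from typing import Iterable


def build_request(user_text: str) -> str:
    """Serialize a single-turn request body in the shape shown above."""
    return json.dumps({"messages": [{"role": "user", "content": user_text}]})


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the content of every {"type": "text"} event.

    Assumes one JSON object per line (an assumption, not confirmed by the source).
    Blank lines and events of any other type are skipped.
    """
    parts = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "text":
            parts.append(event.get("content", ""))
    return "".join(parts)
```

Keeping the parsing separate from the HTTP call means the same function can be fed from any streaming client that yields lines.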
SARVAM_API_KEY=sk_...
SARVAM_MODEL=sarvam-m
SYSTEM_PROMPT="You are a helpful assistant."
PORT=6051
The Sarvam TTS integration generates real-time streamed audio, suitable for telephony and IVR systems.
POST /text-to-speech-stream
{
  "text": "Hello, how can I assist you today?"
}
| Parameter | Value |
|---|---|
| Container | WAV |
| Sample Rate | 8000 Hz |
| Encoding | PCM 16-bit |
| Channels | Mono |
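To confirm that returned audio matches the table above before handing it to a telephony channel, a check along these lines may help. It is a sketch that uses Python's standard `wave` module and assumes a complete WAV payload held in memory; the helper names `check_telephony_wav` and `make_silence_wav` are hypothetical, and the second one exists only to produce test input.

```python
import io
import wave
from typing import List


def check_telephony_wav(data: bytes) -> List[str]:
    """Return a list of mismatches against 8000 Hz, 16-bit, mono (empty if none)."""
    problems = []
    with wave.open(io.BytesIO(data), "rb") as wav:
        if wav.getframerate() != 8000:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 8000")
        if wav.getsampwidth() != 2:
            problems.append(f"sample width is {wav.getsampwidth()} bytes, expected 2")
        if wav.getnchannels() != 1:
            problems.append(f"{wav.getnchannels()} channels, expected 1")
    return problems


def make_silence_wav(rate: int = 8000, ms: int = 100) -> bytes:
    """Build a short silent 16-bit mono WAV in memory, for testing the check."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * (rate * ms // 1000))
    return buf.getvalue()
```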
SARVAM_API_KEY=sk_...
SARVAM_TTS_LANGUAGE=en-IN
SARVAM_TTS_SPEAKER=aditya
SARVAM_TTS_MODEL=bulbul:v3
SARVAM_TTS_TEMPERATURE=0.6
PORT=6052
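Because the ASR, LLM, and TTS services each read their own set of variables, a quick pre-flight check can catch a missing key before the containers start. The sketch below parses `.env`-style text and reports which variables are absent per service. The variable names come from the environment blocks in this document; treating all of them as required, and the helper names `parse_env` and `missing_keys`, are assumptions.

```python
from typing import Dict, List

# Variable names taken from the ASR, LLM, and TTS environment blocks.
# Treating each of them as required is an assumption.
REQUIRED: Dict[str, List[str]] = {
    "asr": [
        "SARVAM_API_KEY",
        "SARVAM_WEBSOCKET_URL",
        "SARVAM_SPEECH_RECOGNITION_MODEL",
        "SARVAM_SPEECH_RECOGNITION_LANGUAGE",
    ],
    "llm": ["SARVAM_API_KEY", "SARVAM_MODEL"],
    "tts": [
        "SARVAM_API_KEY",
        "SARVAM_TTS_LANGUAGE",
        "SARVAM_TTS_SPEAKER",
        "SARVAM_TTS_MODEL",
    ],
}


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env


def missing_keys(env: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each service to the variables that are unset or empty."""
    return {svc: [k for k in keys if not env.get(k)] for svc, keys in REQUIRED.items()}
```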
GitHub: https://github.com/agentvoiceresponse/avr-infra/blob/main/docker-compose-sarvam.yml
A complete Sarvam stack for Agent Voice Response:
version: "3.9"
services:
  avr-asr-sarvam:
    image: agentvoiceresponse/avr-asr-sarvam
    container_name: avr-asr-sarvam
    env_file: .env
    ports:
      - "6050:6050"
    restart: unless-stopped
  avr-llm-sarvam:
    image: agentvoiceresponse/avr-llm-sarvam
    container_name: avr-llm-sarvam
    env_file: .env
    ports:
      - "6051:6051"
    restart: unless-stopped
  avr-tts-sarvam:
    image: agentvoiceresponse/avr-tts-sarvam
    container_name: avr-tts-sarvam
    env_file: .env
    ports:
      - "6052:6052"
    restart: unless-stopped
Start the stack:
docker compose up -d
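Once the containers are up, one simple way to confirm that each service is accepting connections is a TCP probe of the published ports. This is a sketch only: it checks that a port is open, not that the service behind it is healthy. The helper names `port_open` and `stack_status` are hypothetical, and the default host assumes the stack runs on the local machine.

```python
import socket
from typing import Dict

# Container names and published ports from the compose file.
SERVICE_PORTS: Dict[str, int] = {
    "avr-asr-sarvam": 6050,
    "avr-llm-sarvam": 6051,
    "avr-tts-sarvam": 6052,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def stack_status(host: str = "127.0.0.1", ports: Dict[str, int] = SERVICE_PORTS) -> Dict[str, bool]:
    """Probe every service port and report which ones accept connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_up in stack_status().items():
        print(f"{name}: {'up' if is_up else 'down'}")
```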
Sarvam is recommended when: