The xAI Grok Voice Agent STS integration connects Agent Voice Response to xAI's real-time Voice Agent API. AVR telephony audio is bridged over WebSocket to wss://api.x.ai/v1/realtime, with 8 kHz ↔ 24 kHz PCM resampling handled inside the connector.
This mode provides low-latency speech-to-speech conversations using Grok voice models, with support for AVR function tools and optional built-in xAI tools.
For AVR tool wiring, see AVR Function Calls.
- Models: grok-voice-latest, grok-voice-think-fast-1.0, and versioned model names
- Voices: eve, ara, rex, sal, leo
- Interruption events forwarded to AVR core
- Built-in xAI tools: web_search, x_search, file_search, enabled via environment variables
- Set XAI_API_KEY in your connector environment.

Official API reference: https://docs.x.ai/developers/model-capabilities/audio/voice-agent
| Variable | Description | Default | Required |
|---|---|---|---|
| XAI_API_KEY | xAI API key | — | Yes |
| PORT | AVR-facing WebSocket port | 6041 | Optional |
| XAI_MODEL | Voice model query parameter | grok-voice-latest | Optional |
| XAI_VOICE | Agent voice (eve, ara, rex, sal, leo) | eve | Optional |
| XAI_TURN_DETECTION | Turn detection mode | server_vad | Optional |
| XAI_TURN_DETECTION_THRESHOLD | VAD threshold (when numeric) | — | Optional |
| XAI_TURN_DETECTION_SILENCE_MS | Silence duration for end-of-turn | — | Optional |
| XAI_TURN_DETECTION_PREFIX_PADDING_MS | Prefix padding for VAD | — | Optional |
| XAI_TRANSCRIPTION_MODEL | Input audio transcription model | whisper-1 | Optional |
| XAI_LANGUAGE | Transcription language hint (e.g. it) | — | Optional |
| XAI_CONNECT_TIMEOUT_MS | Upstream xAI WebSocket connect timeout | 15000 | Optional |
| XAI_BUILTIN_TOOLS | Comma-separated: web_search, x_search, file_search | — | Optional |
| XAI_COLLECTION_IDS | Collection IDs for file_search | — | Optional |
| AMI_URL | AVR AMI service for transfer/hangup tools | http://127.0.0.1:6006 | Optional |
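To make the defaults in the table concrete, here is a minimal Python sketch of reading these variables. It is an illustration only, not the connector's actual code: the `XaiConfig` dataclass and the `load_config` helper are hypothetical names, and treating a non-numeric threshold as "unset" is an assumption based on the "when numeric" note.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class XaiConfig:
    """Hypothetical container for a subset of the documented variables."""
    api_key: str
    port: int
    model: str
    voice: str
    turn_detection: str
    turn_detection_threshold: Optional[float]
    transcription_model: str
    language: Optional[str]
    connect_timeout_ms: int
    ami_url: str


def _optional_float(raw: Optional[str]) -> Optional[float]:
    """Return a float when the value is numeric, otherwise None (assumed behavior)."""
    if raw is None or raw.strip() == "":
        return None
    try:
        return float(raw)
    except ValueError:
        return None


def load_config(env: Mapping[str, str]) -> XaiConfig:
    """Read the documented variables, applying the defaults from the table."""
    api_key = env.get("XAI_API_KEY", "").strip()
    if not api_key:
        # XAI_API_KEY is the only variable marked as required.
        raise ValueError("XAI_API_KEY is required")
    return XaiConfig(
        api_key=api_key,
        port=int(env.get("PORT", "6041")),
        model=env.get("XAI_MODEL", "grok-voice-latest"),
        voice=env.get("XAI_VOICE", "eve"),
        turn_detection=env.get("XAI_TURN_DETECTION", "server_vad"),
        turn_detection_threshold=_optional_float(env.get("XAI_TURN_DETECTION_THRESHOLD")),
        transcription_model=env.get("XAI_TRANSCRIPTION_MODEL", "whisper-1"),
        language=env.get("XAI_LANGUAGE") or None,
        connect_timeout_ms=int(env.get("XAI_CONNECT_TIMEOUT_MS", "15000")),
        ami_url=env.get("AMI_URL", "http://127.0.0.1:6006"),
    )
```

Calling `load_config(os.environ)` at startup would surface a missing key before any call arrives.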
| Variable | Description |
|---|---|
| XAI_INSTRUCTIONS | Inline system prompt (highest priority) |
| XAI_URL_INSTRUCTIONS | HTTP endpoint returning { "system": "..." } (receives X-AVR-UUID) |
| XAI_FILE_INSTRUCTIONS | Local file path with plain-text instructions |
| (fallback) | Built-in default assistant prompt |
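The lookup order in the table can be sketched as a small Python function. This is a sketch under stated assumptions rather than the connector's implementation: `resolve_instructions`, `fetch_system_prompt`, and `DEFAULT_PROMPT` are hypothetical names, the URL is assumed to be fetched with a GET request, and URL is assumed to take priority over file because it is listed first.

```python
import json
import urllib.request
from typing import Callable, Mapping

# Placeholder text; the connector's real built-in prompt is not documented here.
DEFAULT_PROMPT = "You are a helpful assistant."


def fetch_system_prompt(url: str, session_uuid: str, timeout: float = 5.0) -> str:
    """Request the URL with an X-AVR-UUID header and return the "system" field."""
    request = urllib.request.Request(url, headers={"X-AVR-UUID": session_uuid})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    system = payload.get("system") if isinstance(payload, dict) else None
    if not isinstance(system, str) or not system.strip():
        raise ValueError('instruction payload must include { "system": "..." }')
    return system


def resolve_instructions(
    env: Mapping[str, str],
    session_uuid: str,
    fetch: Callable[[str, str], str] = fetch_system_prompt,
) -> str:
    """Pick the system prompt: inline, then URL, then file, then the fallback."""
    inline = env.get("XAI_INSTRUCTIONS", "").strip()
    if inline:
        return inline
    url = env.get("XAI_URL_INSTRUCTIONS", "").strip()
    if url:
        return fetch(url, session_uuid)
    path = env.get("XAI_FILE_INSTRUCTIONS", "").strip()
    if path:
        # The file is expected to hold plain-text instructions.
        with open(path, encoding="utf-8") as handle:
            return handle.read().strip()
    return DEFAULT_PROMPT
```

Passing `fetch` as a parameter keeps the priority logic testable without a live HTTP endpoint.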
.env

XAI_API_KEY=your_xai_api_key
PORT=6041
XAI_MODEL=grok-voice-latest
XAI_VOICE=eve
XAI_INSTRUCTIONS="You are a helpful assistant that can answer questions and help with tasks."
XAI_TURN_DETECTION=server_vad
XAI_TRANSCRIPTION_MODEL=whisper-1
XAI_LANGUAGE=it
XAI_CONNECT_TIMEOUT_MS=15000
# XAI_BUILTIN_TOOLS=web_search,x_search
# XAI_COLLECTION_IDS=collection-id-1,collection-id-2
AMI_URL=http://avr-ami:6006
avr-sts-xai:
  image: agentvoiceresponse/avr-sts-xai:1.0.0
  platform: linux/x86_64
  container_name: avr-sts-xai
  restart: always
  environment:
    - PORT=6041
    - XAI_API_KEY=${XAI_API_KEY}
    - XAI_MODEL=${XAI_MODEL:-grok-voice-latest}
    - XAI_VOICE=${XAI_VOICE:-eve}
    - XAI_INSTRUCTIONS=${XAI_INSTRUCTIONS:-You are a helpful assistant that can answer questions and help with tasks.}
    - XAI_TURN_DETECTION=${XAI_TURN_DETECTION:-server_vad}
    - AMI_URL=${AMI_URL:-http://avr-ami:6006}
  networks:
    - avr
A full stack example is available in avr-infra docker-compose-xai.yml.
Point avr-core at the xAI STS connector WebSocket endpoint:
avr-core:
  image: agentvoiceresponse/avr-core
  platform: linux/x86_64
  container_name: avr-core
  restart: always
  environment:
    - PORT=5001
    - STS_URL=ws://avr-sts-xai:6041
  ports:
    - 5001:5001
  networks:
    - avr
STS_URL uses ws:// with no path suffix (unlike HTTP-based ASR/LLM/TTS providers).
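A quick pre-flight check for this rule might look like the following Python sketch. The `check_sts_url` helper is hypothetical and only encodes what the note above states (a ws:// scheme, a host, and no path suffix); it does not test whether the connector is reachable.

```python
from typing import List
from urllib.parse import urlsplit


def check_sts_url(sts_url: str) -> List[str]:
    """Return a list of problems found in an STS_URL value (empty if none)."""
    problems: List[str] = []
    parts = urlsplit(sts_url)
    if parts.scheme != "ws":
        problems.append("scheme should be ws://, got %r" % parts.scheme)
    if not parts.hostname:
        problems.append("missing host")
    if parts.path not in ("", "/"):
        problems.append("unexpected path suffix %r" % parts.path)
    return problems
```

For example, `check_sts_url("ws://avr-sts-xai:6041")` reports no problems, while an HTTP-style value such as `http://avr-sts-xai:6041/speech` is flagged for both its scheme and its path.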
Client → connector

{"type":"init","uuid":"<session-uuid>"}
{"type":"audio","audio":"<base64 pcm16 8kHz>"}

Connector → client

{"type":"audio","audio":"<base64 pcm16 8kHz 20ms frame>"}
{"type":"transcript","role":"user|agent","text":"..."}
{"type":"interruption"}
{"type":"error","message":"..."}

See AVR STS Integration Implementation for the full protocol reference.
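The message shapes above can be exercised with a few Python helpers. This is a sketch for experimentation, not part of the connector: the function names are hypothetical, and the only derived figure is the frame size, since 20 ms of 16-bit mono audio at 8 kHz is 160 samples, or 320 bytes.

```python
import base64
import json
from typing import List, Tuple

SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
BYTES_PER_SAMPLE = 2  # pcm16
FRAME_BYTES = (SAMPLE_RATE_HZ * FRAME_MS // 1000) * BYTES_PER_SAMPLE  # 320 bytes


def init_message(session_uuid: str) -> str:
    """Build the client -> connector init message."""
    return json.dumps({"type": "init", "uuid": session_uuid})


def audio_message(pcm: bytes) -> str:
    """Wrap raw PCM16 bytes in an audio message (base64-encoded)."""
    return json.dumps({"type": "audio", "audio": base64.b64encode(pcm).decode("ascii")})


def split_frames(pcm: bytes) -> Tuple[List[bytes], bytes]:
    """Split PCM into complete 20 ms frames and return any leftover bytes."""
    usable = len(pcm) - len(pcm) % FRAME_BYTES
    frames = [pcm[i:i + FRAME_BYTES] for i in range(0, usable, FRAME_BYTES)]
    return frames, pcm[usable:]


def parse_connector_message(raw: str) -> Tuple[str, object]:
    """Decode a connector -> client message into (type, payload)."""
    message = json.loads(raw)
    kind = message.get("type")
    if kind == "audio":
        return kind, base64.b64decode(message["audio"])
    if kind == "transcript":
        return kind, (message["role"], message["text"])
    if kind == "interruption":
        return kind, None
    if kind == "error":
        return kind, message["message"]
    raise ValueError("unknown message type: %r" % kind)
```

A test client would send `init_message(...)` once, then stream `audio_message(frame)` for each frame while dispatching on the type returned by `parse_connector_message`.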
The connector ships with built-in AVR tools:
- avr_transfer: transfer the call via AMI
- avr_hangup: hang up the call via AMI

Custom tools can be added under tools/ and avr_tools/. Ensure AMI_URL points at your AVR AMI service when using transfer/hangup.
Enable provider-side tools with a comma-separated list:
XAI_BUILTIN_TOOLS=web_search,x_search,file_search
When using file_search, set collection IDs:
XAI_COLLECTION_IDS=collection-id-1,collection-id-2
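As a rough illustration of how the two comma-separated values relate, the Python sketch below parses and cross-checks them. The `parse_builtin_tools` helper is hypothetical, and rejecting unknown tool names or a `file_search` entry without collection IDs is an assumption about sensible validation, not documented connector behavior.

```python
from typing import List, Mapping, Tuple

# Tool names listed in this document.
KNOWN_BUILTIN_TOOLS = ("web_search", "x_search", "file_search")


def split_csv(raw: str) -> List[str]:
    """Split a comma-separated value, trimming whitespace and dropping empties."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def parse_builtin_tools(env: Mapping[str, str]) -> Tuple[List[str], List[str]]:
    """Return (enabled tools, collection IDs) from the environment."""
    tools = split_csv(env.get("XAI_BUILTIN_TOOLS", ""))
    unknown = [tool for tool in tools if tool not in KNOWN_BUILTIN_TOOLS]
    if unknown:
        raise ValueError("unknown built-in tools: " + ", ".join(unknown))
    collection_ids = split_csv(env.get("XAI_COLLECTION_IDS", ""))
    if "file_search" in tools and not collection_ids:
        # Assumed check: file_search is documented as needing collection IDs.
        raise ValueError("file_search requires XAI_COLLECTION_IDS")
    return tools, collection_ids
```

With both variables unset the function returns two empty lists, which matches the documented default of no built-in tools.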
Each AVR client connection uses dedicated audio resamplers and a dedicated xAI WebSocket session. Multiple concurrent calls are supported on a single connector instance.
| Symptom | Check |
|---|---|
| Connector fails to start | XAI_API_KEY is set and valid |
| No audio from agent | STS_URL=ws://avr-sts-xai:6041 on avr-core; connector reachable on the Docker network |
| Transfer/hangup not working | AMI_URL points to avr-ami; AMI credentials match Asterisk |
| Upstream timeout | Increase XAI_CONNECT_TIMEOUT_MS or verify outbound access to api.x.ai |
| Invalid instructions from URL/file | Payload must include { "system": "..." } for URL mode; file must be readable plain text |