Speechmatics provides enterprise-grade speech recognition and synthesis.
With the Speech-to-Speech (STS) integration, Agent Voice Response (AVR) can handle real-time conversations in which the caller's speech is turned directly into a spoken response, without requiring separate ASR and TTS services.
This approach reduces latency and simplifies the overall architecture.
End-to-End Speech Conversations
Direct speech-in → speech-out processing with no intermediate text pipeline.
Low Latency Streaming
Optimized real-time audio streaming via WebSocket.
Production-Grade Quality
High-accuracy ASR combined with natural-sounding TTS.
Flexible AI Control
Uses OpenAI models for response generation while Speechmatics handles speech processing.
Native AVR Integration
Fully compatible with AVR Core via the standard STS interface.
Configure the Speechmatics STS container using the following environment variables:
# Required
SPEECHMATICS_API_KEY=your_speechmatics_api_key
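A missing or placeholder API key is an easy mistake to make before starting the container. The following is a minimal sketch of a pre-flight check, not part of AVR itself: the helper name `missing_required` is hypothetical, and treating the sample value `your_speechmatics_api_key` as "unset" is an assumption made for illustration.

```python
from typing import Mapping

# Only SPEECHMATICS_API_KEY is documented as required.
REQUIRED_VARS = ("SPEECHMATICS_API_KEY",)

# Assumption: the sample value from the documentation should count as "not configured".
PLACEHOLDERS = {"your_speechmatics_api_key"}


def missing_required(env: Mapping[str, str]) -> list[str]:
    """Return the names of required variables that are absent, empty, or placeholders."""
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        if not value or value in PLACEHOLDERS:
            missing.append(name)
    return missing
```

Calling `missing_required(os.environ)` (after `import os`) before `docker compose up` returns an empty list when the key is set, and a list containing `SPEECHMATICS_API_KEY` otherwise.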
Add the Speechmatics STS service to your docker-compose.yml:
avr-sts-speechmatics:
  image: agentvoiceresponse/avr-sts-speechmatics
  platform: linux/x86_64
  container_name: avr-sts-speechmatics
  restart: always
  environment:
    - SPEECHMATICS_API_KEY=${SPEECHMATICS_API_KEY}
    - PORT=6040
  networks:
    - avr
Point avr-core to the Speechmatics STS service:
avr-core:
  image: agentvoiceresponse/avr-core
  platform: linux/x86_64
  container_name: avr-core
  restart: always
  environment:
    - PORT=5001
    - STS_URL=ws://avr-sts-speechmatics:6040
  ports:
    - 5001:5001
  networks:
    - avr
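The host and port in `STS_URL` must match the STS service's container name and its `PORT` value (here `avr-sts-speechmatics` and `6040`). As a rough sketch for debugging a mismatch, the snippet below parses the URL and attempts a plain TCP connection. The function names `parse_sts_url` and `is_reachable` are hypothetical helpers, not part of AVR. The probe only confirms that something is listening on the port. It does not perform a WebSocket handshake.

```python
import socket
from urllib.parse import urlparse


def parse_sts_url(sts_url: str) -> tuple[str, int]:
    """Split a ws:// or wss:// URL into (host, port), applying default ports."""
    parsed = urlparse(sts_url)
    if parsed.scheme not in ("ws", "wss"):
        raise ValueError(f"expected a ws:// or wss:// URL, got: {sts_url!r}")
    if parsed.hostname is None:
        raise ValueError(f"no host found in: {sts_url!r}")
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "wss" else 80)
    return parsed.hostname, port


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `is_reachable(*parse_sts_url("ws://avr-sts-speechmatics:6040"))` can be run from a container attached to the `avr` network. The service name only resolves inside that Docker network, so the same call from the host is expected to return `False`.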
ℹ️ When using STS providers, AVR Core bypasses ASR + LLM + TTS and delegates speech handling entirely to the STS engine.