Alberto Gonzalez: From SIP to Tokens. Deterministic Telephony Meets Real-Time Voice AI
This session explores how to safely run real-time Voice AI inside deterministic telephony systems. We present an open-source, production-oriented architecture built on a SIP edge, FreeSWITCH, and a streaming AI pipeline. FreeSWITCH serves as the deterministic media and call-control layer, enforcing routing rules, timeouts, fallback paths, and session state. Live audio is streamed into an AI pipeline for STT → inference/translation → TTS, and the streamed responses are injected back into the call. We will cover:
- Bridging RTP audio into a streaming AI pipeline
- Defining conversational SLOs
- Instrumenting stage-level metrics and visualizing them
- Isolating AI failures and enforcing deterministic fallbacks in real-time telephony
The session concludes with a live demo of a real-time voice solution running on FreeSWITCH, complete with signaling visibility, media metrics, and AI pipeline timing. This talk is aimed at engineers building measurable, resilient Voice AI systems on open telephony infrastructure.