If you’re building voice experiences that go beyond basic IVRs or traditional prompts, OpenAI’s Realtime API offers a major step forward. It lets you stream live audio to an AI model and get intelligent, spoken responses in real time—ideal for virtual agents, intelligent call routing, or any conversational automation.
As a Bandwidth customer, you can connect to OpenAI’s Realtime API in two different ways, depending on your technical stack and control needs:
- Via Bandwidth Programmable Voice using bi‑directional media streaming (WebSockets)
- Via SIP, connecting your Bandwidth SIP trunks directly to OpenAI
Below we’ll look at how each option works, what control you get, and how to decide between them.
Option 1: Programmable Voice + bi-directional media streaming (WebSockets)
Bandwidth’s Programmable Voice API supports real‑time media streaming, which makes it a natural fit for connecting with OpenAI’s Realtime API. In this setup, your application uses BXML to open a WebSocket stream, sending call audio to OpenAI and receiving AI‑generated audio back as it happens.
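To make that concrete, here is a minimal sketch of the webhook side: a small Flask app that answers an inbound call and returns BXML telling Bandwidth to open a bidirectional stream toward a WebSocket bridge you host. The `<StartStream>` verb, its attributes, and the destination URL are assumptions to check against the current BXML reference rather than copy verbatim.

```python
# Sketch: answer an inbound Programmable Voice call and start a bidirectional
# media stream toward a WebSocket bridge that will talk to OpenAI.
# The BXML verb/attribute names below are assumptions to verify against the
# current BXML reference; the destination URL is a placeholder for your bridge.
from flask import Flask, request, Response

app = Flask(__name__)

STREAM_DESTINATION = "wss://your-bridge.example.com/media"  # placeholder

@app.route("/callbacks/answer", methods=["POST"])
def answer_call():
    event = request.get_json(force=True)
    print(f"Answered call {event.get('callId')}")

    bxml = f"""<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <SpeakSentence>Connecting you to our assistant.</SpeakSentence>
    <StartStream destination="{STREAM_DESTINATION}" mode="bidirectional" name="openai-bridge"/>
    <Pause duration="300"/>
</Response>"""
    return Response(bxml, mimetype="application/xml")

if __name__ == "__main__":
    app.run(port=8080)
```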
This architecture keeps your app at the center of call control—you get full visibility into events, call states, and the ability to inject logic, data lookups, or other integrations on the fly.
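The bridge your BXML points at is where that control lives. Below is a hedged sketch of one: it relays caller audio from the Bandwidth media stream to OpenAI's Realtime API and sends the model's audio back down the same socket. The Bandwidth frame field names (`eventType`, `payload`, `playAudio`), the model name, and the audio-format settings are assumptions to confirm against the respective docs.

```python
# Sketch of a relay between a Bandwidth bidirectional media stream and the
# OpenAI Realtime API. Assumptions to verify against current docs:
#   * Bandwidth's stream messages carry base64 audio in a "media"-style JSON
#     frame (field names below are illustrative).
#   * OpenAI's Realtime WebSocket accepts "input_audio_buffer.append" client
#     events and emits "response.audio.delta" server events.
import asyncio
import json
import os

import websockets  # pip install websockets

OPENAI_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
OPENAI_HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def handle_bandwidth_stream(bw_ws):
    """Bridge one Bandwidth media stream to one OpenAI Realtime session."""
    # NOTE: the header keyword is version-dependent ("extra_headers" in older
    # releases of the websockets library, "additional_headers" in newer ones).
    async with websockets.connect(OPENAI_URL, extra_headers=OPENAI_HEADERS) as ai_ws:
        # Ask the model for a telephony-friendly codec (assumed session option).
        await ai_ws.send(json.dumps({
            "type": "session.update",
            "session": {"input_audio_format": "g711_ulaw",
                        "output_audio_format": "g711_ulaw"},
        }))

        async def caller_to_openai():
            async for message in bw_ws:
                frame = json.loads(message)
                if frame.get("eventType") == "media":  # illustrative field name
                    await ai_ws.send(json.dumps({
                        "type": "input_audio_buffer.append",
                        "audio": frame["payload"],  # base64 audio from the caller
                    }))

        async def openai_to_caller():
            async for message in ai_ws:
                event = json.loads(message)
                if event.get("type") == "response.audio.delta":
                    # Send AI audio back down the Bandwidth stream (illustrative framing).
                    await bw_ws.send(json.dumps({
                        "eventType": "playAudio",
                        "payload": event["delta"],
                    }))

        await asyncio.gather(caller_to_openai(), openai_to_caller())

async def main():
    # Single-argument handler assumes a recent version of the websockets library.
    async with websockets.serve(handle_bandwidth_stream, "0.0.0.0", 3000):
        await asyncio.Future()  # run forever

if __name__ == "__main__":
    asyncio.run(main())
```

Because the audio passes through your process, this is also the natural place to log transcripts, trigger data lookups, or hand the call back to a human agent.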
When to use it
- You want fine‑grained control over call behavior (start/stop streaming, connect/disconnect logic).
- You already use Bandwidth’s Programmable Voice platform for call flows.
- You’re experimenting with AI agents or creating custom voice bots where application logic matters.
Option 2: SIP integration with OpenAI (in beta)
If your environment is already SIP‑based (PBX, SBC, or SIP trunking), you can take a more direct path by routing calls to OpenAI via SIP. Bandwidth can provision a SIP endpoint that connects to OpenAI’s Realtime API, so OpenAI essentially acts as the destination of the SIP call.
Your SIP softswitch or PBX can stay mostly unchanged—no need to handle WebSockets or a custom app layer if you just want to route a call to an AI agent.
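For a quick test of this path from Programmable Voice, the handoff can be as small as a `<Transfer>` to the SIP endpoint; in a pure PBX or SBC environment the equivalent is simply a dial-plan route pointing at the same URI. The sketch below is illustrative, and the SIP address shown is a made-up placeholder for whatever endpoint is provisioned for your account.

```python
# Illustrative handoff: answer an inbound Programmable Voice call and transfer
# it to a SIP endpoint that terminates at OpenAI's Realtime API. The SIP URI
# below is a placeholder; use the endpoint provisioned for your account.
from flask import Flask, request, Response

app = Flask(__name__)

OPENAI_SIP_URI = "sip:proj_XXXXXXXX@sip.api.openai.com;transport=tls"  # placeholder

@app.route("/callbacks/answer", methods=["POST"])
def route_to_ai_agent():
    event = request.get_json(force=True)
    print(f"Routing call {event.get('callId')} to the AI agent over SIP")

    bxml = f"""<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Transfer>
        <SipUri>{OPENAI_SIP_URI}</SipUri>
    </Transfer>
</Response>"""
    return Response(bxml, mimetype="application/xml")

if __name__ == "__main__":
    app.run(port=8080)
```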
When to use it
- You run a SIP‑based contact center or enterprise voice system.
- You prefer standard SIP routing instead of building a custom WebSocket bridge.
- You want a simple and direct way to connect live calls to OpenAI for testing or production workloads.
Compare OpenAI Realtime API integration options at a glance
| Feature | Programmable Voice (WebSocket) | SIP Integration |
|---|---|---|
| Connection type | WebSocket media streaming | SIP signaling + RTP |
| Control level | High—your app manages logic | Moderate—handled through SIP routing |
| Ideal for | Developers building fully custom call flows | Teams integrating AI with SIP infrastructure |
| Setup complexity | Requires WebSocket app, BXML logic | Standard SIP configuration |
| Use cases | AI voice assistants, dynamic IVR | Contact center routing, quick AI trials |
Next steps toward your OpenAI integration
Both integration paths deliver real‑time conversational experiences with OpenAI’s models through Bandwidth’s reliable voice network. Which one you choose depends on how much control and customization you need:
- If you want programmability and logic flexibility → go with the Programmable Voice WebSocket option.
- If you want straightforward SIP connectivity → start with the SIP integration.
You can explore full setup guides and code examples in the Bandwidth Developer Docs.