Build a browser-based voice agent that streams audio to the Inworld Realtime API using WebRTC. Audio is handled natively by the browser — no manual PCM encoding or base64 conversion needed.
WebRTC is well suited to browser voice apps that need low latency. For server-side integrations, see the WebSocket Quickstart.
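To illustrate what "handled natively" means in practice, here is a minimal browser-side sketch: the microphone is captured with `getUserMedia` and attached to an `RTCPeerConnection`, and no PCM or base64 handling appears anywhere. The function name `connectMicrophone`, its option names, and the injectable `mediaDevices` and `PeerConnection` arguments are assumptions made for illustration. How the SDP offer is exchanged with the Inworld WebRTC proxy is not shown.

```javascript
// Browser-side sketch. connectMicrophone and its options are hypothetical names,
// not part of any Inworld SDK. Signaling with the proxy is deliberately left out.
async function connectMicrophone({
  iceServers,
  audioElement,
  mediaDevices = navigator.mediaDevices, // injectable so the sketch can run outside a browser
  PeerConnection = RTCPeerConnection,
}) {
  const pc = new PeerConnection({ iceServers });

  // Play whatever audio track the remote side sends back.
  pc.ontrack = (event) => {
    audioElement.srcObject = event.streams[0];
  };

  // The browser captures and encodes microphone audio itself.
  const stream = await mediaDevices.getUserMedia({ audio: true });
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }

  // Create the local SDP offer; sending it to the server is up to the caller.
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  return { pc, offer };
}
```

In a page, this might be called as `connectMicrophone({ iceServers, audioElement: document.querySelector("audio") })`, with `iceServers` obtained from your own server.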
Create an Inworld account.
In Inworld Portal, generate an API key by going to Settings > API Keys. Copy the Base64 credentials.
Create a .env file:
.env
INWORLD_API_KEY=your-base64-api-key-here
2
Create the server
Create server.js. It serves the page and provides a /api/config endpoint that fetches ICE servers from the WebRTC proxy while keeping the API key server-side.
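The shape of such an endpoint can be sketched as follows. This is not the quickstart's actual server.js: the upstream URL in `ICE_SERVERS_URL`, the `Basic` authorization scheme, and the helper names `fetchIceServers` and `createConfigHandler` are all assumptions for illustration. The point it shows is that the key is only used in the server-to-proxy request, and the browser receives ICE servers only.

```javascript
// Sketch of the /api/config idea. ICE_SERVERS_URL and the "Basic" auth scheme are
// placeholders/assumptions, not confirmed Inworld values.
const ICE_SERVERS_URL =
  process.env.ICE_SERVERS_URL || "https://example.invalid/ice-servers";

// Ask the upstream proxy for ICE servers, authenticating with the server-side key.
async function fetchIceServers(apiKey, fetchImpl = fetch) {
  const res = await fetchImpl(ICE_SERVERS_URL, {
    headers: { Authorization: `Basic ${apiKey}` },
  });
  if (!res.ok) throw new Error(`ICE server request failed: ${res.status}`);
  return res.json();
}

// Build a Node http request handler; only ICE servers are sent to the browser.
function createConfigHandler(apiKey, getIceServers = fetchIceServers) {
  return async (req, res) => {
    if (req.url !== "/api/config") {
      res.writeHead(404);
      res.end();
      return;
    }
    try {
      const iceServers = await getIceServers(apiKey);
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ iceServers }));
    } catch (err) {
      // Upstream failure: report it without leaking the key or upstream details.
      res.writeHead(502, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "Could not fetch ICE servers" }));
    }
  };
}

// Usage (not executed here):
//   require("node:http")
//     .createServer(createConfigHandler(process.env.INWORLD_API_KEY))
//     .listen(3000);
```

With Node 20.6 or later, `node --env-file=.env server.js` loads `INWORLD_API_KEY` from `.env` without an extra dependency.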
If you’re building a more advanced voice agent with features like agent handoffs, tool calling, and guardrails, you can use the OpenAI Agents SDK with Inworld’s WebRTC proxy. We provide a ready-to-run playground based on OpenAI’s realtime agents demo.
If you are unable to access this repository, please contact support@inworld.ai for access.
2
Configure the API key
Open .env and set OPENAI_API_KEY to your Inworld API key (the same Base64 credentials from Inworld Portal):
.env
OPENAI_API_KEY=your-inworld-base64-api-key-here
Despite the variable name OPENAI_API_KEY, this must be your Inworld API key — not an OpenAI key. The SDK uses this variable name by convention, but the playground routes all traffic through the Inworld WebRTC proxy.
3
Run
npm run dev
Open http://localhost:3000. Select a scenario from the Scenario dropdown and start talking.
The playground includes two agentic patterns:
Chat-Supervisor — A realtime chat agent handles basic conversation while a more capable text model (e.g. gpt-4.1) handles tool calls and complex responses.
Sequential Handoff — Specialized agents transfer the user between them to handle specific intents (e.g. authentication → returns → sales).
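The Sequential Handoff idea can be illustrated with a small, library-free sketch. The playground itself is built on the OpenAI Agents SDK, which this does not use; the `agents` table, the intent names, and the `route` function are hypothetical and only show how a turn might pass from one specialized agent to the next.

```javascript
// Library-free illustration of sequential handoff. Agent names, intents, and the
// routing rule are hypothetical; the playground uses the OpenAI Agents SDK instead.
const agents = {
  authentication: { handles: ["login"], handoffs: ["returns", "sales"] },
  returns: { handles: ["refund"], handoffs: ["sales"] },
  sales: { handles: ["purchase"], handoffs: [] },
};

// Return the name of the agent that should own the next turn for a given intent.
function route(current, intent) {
  // The current agent keeps the turn if it can handle the intent itself.
  if (agents[current].handles.includes(intent)) return current;
  // Otherwise hand off to the first allowed target that handles the intent.
  const target = agents[current].handoffs.find((name) =>
    agents[name].handles.includes(intent)
  );
  return target ?? current; // no suitable handoff: stay with the current agent
}
```

For example, `route("authentication", "refund")` returns `"returns"`, mirroring the authentication → returns step described above.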
For full details on customizing agents, see the playground’s README.