The TTS Playground is the easiest way to experiment with Inworld’s Text-to-Speech models: try out different voices, adjust parameters, and preview instant voice clones. Once you’re ready to go beyond testing and build into a real-time application, the API gives you full access to advanced features and integration options. In this quickstart, we’ll focus on the Text-to-Speech API, guiding you through your first request to generate high-quality, ultra-realistic speech from text.
Make your first streaming TTS API request
This quickstart walks through making your first streaming API request, which we recommend for realtime, low-latency applications. For batch audio generation, pre-rendered content, and anywhere latency isn’t critical, see Make a non-streaming request below.
Create an API key
Create an Inworld account.
In Inworld Portal, generate an API key by going to Settings > API Keys. Copy the Base64 credentials.
Set your API key as the INWORLD_API_KEY environment variable.
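Before sending any request, it can help to confirm from Python that the key is actually visible to your process. The helper below is a minimal sketch: `auth_header` is an illustrative name, and sending the Base64 credentials with the HTTP `Basic` scheme is an assumption to verify against the API reference.

```python
import os


def auth_header(env_var: str = "INWORLD_API_KEY") -> dict:
    """Return an Authorization header built from the key stored in the environment."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of a confusing HTTP error later.
        raise RuntimeError(f"{env_var} is not set; export it before running the quickstart")
    # Assumption: the Base64 credentials copied from Portal are used as Basic credentials.
    return {"Authorization": f"Basic {key}"}
```

Calling `auth_header()` before the first request surfaces a missing key as a clear local error rather than as an authentication failure from the server.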

Prepare your first streaming request
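As a rough sketch of what the Python version of a streaming request can look like, the snippet below posts text to a streaming endpoint, decodes each streamed chunk, and appends the PCM data to a WAV file. Several details are assumptions that this page does not state and that should be checked against the API reference: the endpoint URL, the JSON field names, the newline-delimited response shape, the 48 kHz 16-bit mono format, and the possibility that each chunk carries its own 44-byte WAV header. `decode_chunk`, `write_wav`, and `stream_tts` are illustrative names, and the voice and model IDs are left for you to supply.

```python
import base64
import json
import os
import urllib.request
import wave

# Assumptions (not stated on this page): endpoint URL and sample rate.
STREAM_URL = "https://api.inworld.ai/tts/v1/voice:stream"
SAMPLE_RATE = 48000


def decode_chunk(line: bytes) -> bytes:
    """Extract raw PCM bytes from one line of the streamed response.

    Assumes newline-delimited JSON shaped like {"result": {"audioContent": "<base64>"}}.
    """
    if not line.strip():
        return b""
    payload = json.loads(line)
    audio = base64.b64decode(payload.get("result", {}).get("audioContent", ""))
    # Assumption: a chunk may start with its own 44-byte RIFF/WAV header; drop it.
    if audio[:4] == b"RIFF" and len(audio) >= 44:
        audio = audio[44:]
    return audio


def write_wav(chunks, path, sample_rate=SAMPLE_RATE) -> int:
    """Write an iterable of 16-bit mono PCM chunks to a WAV file; return bytes written."""
    total = 0
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(sample_rate)
        for chunk in chunks:
            wav.writeframes(chunk)
            total += len(chunk)
    return total


def stream_tts(text, voice_id, model_id, out_path):
    """Send one streaming request and save the audio as it arrives."""
    body = json.dumps({
        "text": text,
        "voiceId": voice_id,   # assumed field name
        "modelId": model_id,   # assumed field name
        "audioConfig": {       # assumed field names and values
            "audioEncoding": "LINEAR16",
            "sampleRateHertz": SAMPLE_RATE,
        },
    }).encode("utf-8")
    request = urllib.request.Request(
        STREAM_URL,
        data=body,
        headers={
            "Authorization": f"Basic {os.environ['INWORLD_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        # Iterating the response yields one line at a time as data arrives.
        return write_wav((decode_chunk(line) for line in response), out_path)
```

Keeping `decode_chunk` and `write_wav` separate from the network call means the audio handling can be checked locally with synthetic data before spending any API quota.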
Create a new file called inworld_stream_quickstart.py or inworld_stream_quickstart.js, confirm INWORLD_API_KEY is set in your environment, and copy the corresponding code into the file. This example uses Linear PCM so the streamed chunks can be written directly into a WAV file.
Make a non-streaming request
The synchronous endpoint is the simplest way to try Realtime TTS and works well for batch audio generation, pre-rendered content, and anywhere latency isn’t critical. Assuming you’ve already set up your API key and installed the SDK:
Prepare your first request
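A non-streaming request can be sketched in Python as a single POST followed by one decode step. As with any illustration written without the API reference at hand, the endpoint URL, the request field names, and the `audioContent` response field are assumptions; `build_request`, `save_audio`, and `synthesize` are hypothetical helper names, and the voice ID, model ID, and output path are left for you to supply.

```python
import base64
import json
import os
import urllib.request

# Assumption (not stated on this page): URL of the synchronous endpoint.
SYNC_URL = "https://api.inworld.ai/tts/v1/voice"


def build_request(text, voice_id, model_id, api_key):
    """Build the POST request for one synchronous synthesis call."""
    body = json.dumps({
        "text": text,
        "voiceId": voice_id,   # assumed field name
        "modelId": model_id,   # assumed field name
    }).encode("utf-8")
    return urllib.request.Request(
        SYNC_URL,
        data=body,
        headers={
            "Authorization": f"Basic {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_audio(response_body: bytes, path: str) -> int:
    """Decode the Base64 audio in the JSON response and write it to disk."""
    payload = json.loads(response_body)
    audio = base64.b64decode(payload["audioContent"])  # assumed field name
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)


def synthesize(text, voice_id, model_id, out_path):
    """Send the request, wait for the full response, and save the audio."""
    request = build_request(text, voice_id, model_id, os.environ["INWORLD_API_KEY"])
    with urllib.request.urlopen(request) as response:
        return save_audio(response.read(), out_path)
```

Because the whole response is read before anything is written, this shape suits batch jobs; choose a file extension for `out_path` that matches the audio encoding the endpoint returns.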
For Python or JavaScript, create a new file called inworld_quickstart.py or inworld_quickstart.js. Copy the corresponding code into the file.
Next Steps
Now that you’ve tried out the Realtime TTS API, you can explore more Realtime TTS capabilities.
Realtime TTS
Understand the capabilities of Inworld’s Realtime TTS models.
Voice Cloning
Create a personalized voice clone with just 5 seconds of audio.
Best Practices
Learn tips and tricks for synthesizing high-quality speech.