The TTS Playground is the easiest way to experiment with Inworld’s Text-to-Speech models—try out different voices, adjust parameters, and preview instant voice clones. Once you’re ready to go beyond testing and build into a real-time application, the API gives you full access to advanced features and integration options. In this quickstart, we’ll focus on the Text-to-Speech API, guiding you through your first request to generate high-quality, ultra-realistic speech from text.

Make your first TTS API request

1

Create an API key

Create an Inworld account. In Inworld Portal, generate an API key by going to Settings > API Keys. Copy the Base64 credentials. Set your API key as an environment variable.
export INWORLD_API_KEY='your-base64-api-key-here'
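Before sending a request, it can help to confirm that the variable is actually visible to Python. The sketch below is a hypothetical helper, not part of Inworld's SDK: load_api_key is an illustrative name, and the check assumes the copied credentials are standard padded Base64, which may not hold for every key format.

```python
import base64
import binascii
import os


def load_api_key(env_var: str = "INWORLD_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper: assumes the key is standard, padded Base64.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running.")
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        base64.b64decode(key, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise RuntimeError(f"{env_var} does not look like Base64.") from exc
    return key
```

Calling load_api_key() at the top of a script fails early with a readable message, rather than sending an Authorization header of "Basic None" when the variable is missing.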
2

Prepare your first request

This is the simplest way to try Inworld TTS and works well for many applications: batch audio generation, pre-rendered content, and anywhere latency isn’t critical. If your application requires real-time, low-latency audio delivery, see the streaming example under "Stream your audio output" below. For Python or JavaScript, create a new file called inworld_quickstart.py or inworld_quickstart.js. Copy the corresponding code into the file. For a curl request, copy the request.
import requests
import base64
import os

# Synchronous endpoint — returns complete audio in a single response.
# For low-latency or real-time use cases, use the streaming endpoint instead.
url = "https://api.inworld.ai/tts/v1/voice"

headers = {
    "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
    "Content-Type": "application/json"
}

payload = {
    "text": "What a wonderful day to be a text-to-speech model!",
    "voiceId": "Ashley",
    "modelId": "inworld-tts-1.5-max"
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()
result = response.json()
audio_content = base64.b64decode(result['audioContent'])

with open("output.mp3", "wb") as f:
    f.write(audio_content)
For Python, you may also have to install requests if not already installed.
pip install requests
3

Run the code

Run the code for Python or JavaScript, or enter the curl command into your terminal.
python inworld_quickstart.py
You should see a saved file called output.mp3. You can play this file with any audio player.
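The decode-and-save part of the script can be factored into a small function, which makes it easier to reuse and to exercise without calling the API. This is a minimal sketch: save_audio is an illustrative name, and the only thing it assumes about the response is the audioContent field used in the example above.

```python
import base64
from pathlib import Path


def save_audio(result: dict, path: str = "output.mp3") -> int:
    """Decode the Base64 `audioContent` field and write it to `path`.

    Returns the number of bytes written.
    """
    encoded = result.get("audioContent")
    if not encoded:
        raise ValueError("response JSON has no 'audioContent' field")
    audio = base64.b64decode(encoded)
    return Path(path).write_bytes(audio)
```

In the quickstart script, this would replace the last three statements with save_audio(response.json()).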

Stream your audio output

Now that you’ve made your first TTS API request, you can try streaming responses as well. Assuming you’ve already followed the instructions above to set up your API key:
1

Prepare your streaming request

First, create a new file called inworld_stream_quickstart.py for Python or inworld_stream_quickstart.js for JavaScript. Next, set your INWORLD_API_KEY as an environment variable. Finally, copy the following code into the file. For this streaming example, we’ll use Linear PCM format (instead of MP3), which we specify in the audio_config. We also include a Connection: keep-alive header to reuse the TCP+TLS connection across requests.
The first request to the API may be slower due to the initial TCP and TLS handshake. Subsequent requests on the same connection will be faster. Use Connection: keep-alive (and a persistent session in Python) to take advantage of connection reuse. See the low-latency examples in our API examples repo for more advanced techniques.
import requests
import base64
import os
import json
import wave
import io
import time

url = "https://api.inworld.ai/tts/v1/voice:stream"

payload = {
    "text": "What a wonderful day to be a text-to-speech model! I'm super excited to show you how streaming works.",
    "voice_id": "Ashley",
    "model_id": "inworld-tts-1.5-max",
    "audio_config": {
        "audio_encoding": "LINEAR16",
        "sample_rate_hertz": 48000,
    },
}

# Use a persistent session for connection reuse (TCP+TLS keep-alive)
session = requests.Session()
session.headers.update({
    "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
    "Content-Type": "application/json",
    "Connection": "keep-alive",
})

start_time = time.time()
ttfb = None
raw_audio_data = io.BytesIO()

with session.post(url, json=payload, stream=True) as response:
    response.raise_for_status()

    for line in response.iter_lines(decode_unicode=True):
        if line.strip():
            try:
                chunk = json.loads(line)
                result = chunk.get("result")
                if result and "audioContent" in result:
                    audio_chunk = base64.b64decode(result["audioContent"])
                    if ttfb is None:
                        ttfb = time.time() - start_time
                    # Skip WAV header (first 44 bytes) from each chunk
                    if len(audio_chunk) > 44:
                        raw_audio_data.write(audio_chunk[44:])
                        print(f"Received {len(audio_chunk)} bytes")
            except json.JSONDecodeError:
                continue

total_time = time.time() - start_time

with wave.open("output_stream.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(payload["audio_config"]["sample_rate_hertz"])
    wf.writeframes(raw_audio_data.getvalue())

print("Audio saved to output_stream.wav")
print(f"Time to first chunk: {ttfb:.3f}s" if ttfb else "No chunks received")
print(f"Total time: {total_time:.3f}s")

session.close()
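The parsing loop in the script above can also be written as a standalone generator, which separates the stream format from the network call. This is a sketch under the same assumptions the script makes: each non-empty line is a JSON object, audio sits in result.audioContent as Base64, and every chunk starts with a 44-byte WAV header. The name iter_pcm is illustrative.

```python
import base64
import json
from typing import Iterable, Iterator

WAV_HEADER_BYTES = 44  # header size the quickstart skips on every chunk


def iter_pcm(lines: Iterable[str]) -> Iterator[bytes]:
    """Yield raw PCM bytes from newline-delimited JSON stream lines."""
    for line in lines:
        if not line or not line.strip():
            continue  # skip blank lines between chunks
        try:
            chunk = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON
        result = chunk.get("result") if isinstance(chunk, dict) else None
        if not result or "audioContent" not in result:
            continue  # no audio in this message
        audio = base64.b64decode(result["audioContent"])
        if len(audio) > WAV_HEADER_BYTES:
            yield audio[WAV_HEADER_BYTES:]
```

With this helper, the body of the loop reduces to iterating over iter_pcm(response.iter_lines(decode_unicode=True)) and writing each yielded block to raw_audio_data.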
2

Run the code

Run the code for Python or JavaScript. The console prints the size of each audio chunk as it arrives; the audio file is written once the stream ends.
python inworld_stream_quickstart.py
You should see a saved file called output_stream.wav. You can play this file with any audio player.
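To check the result without an audio player, the file can be inspected with Python's standard wave module. The following is a minimal sketch; describe_wav is an illustrative name, and the values to expect (mono, 16-bit, 48000 Hz) simply mirror what the streaming script writes.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic format information and duration for a WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }
```

Running describe_wav("output_stream.wav") after the script finishes should report one channel and a 2-byte sample width; a duration of zero suggests that no audio chunks were received.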

Next Steps

Now that you’ve tried out Inworld’s TTS API, you can explore more of Inworld’s TTS capabilities.