The TTS Playground is the easiest way to experiment with Inworld’s Text-to-Speech models: try out different voices, adjust parameters, and preview instant voice clones. Once you’re ready to go beyond testing and build a real-time application, the API gives you full access to advanced features and integration options. This quickstart focuses on the Text-to-Speech API and guides you through your first request to generate high-quality, ultra-realistic speech from text.

Make your first streaming TTS API request

This quickstart walks through making your first streaming API request, which we recommend for realtime, low-latency applications. For batch audio generation, pre-rendered content, and anywhere latency isn’t critical, see Make a non-streaming request below.

1. Create an API key

Create an Inworld account.
In Inworld Portal, generate an API key by going to Settings > API Keys. Copy the Base64 credentials.
Set your API key as an environment variable:
export INWORLD_API_KEY='your-base64-api-key-here'
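
Before moving on, it can help to confirm that the variable is visible to Node.js and at least looks like Base64. The check below is a minimal sketch: the helper name `checkApiKey` and the regex test are illustrative assumptions, not part of the Inworld SDK, and a passing check says nothing about whether the key is actually valid.

```javascript
// Hypothetical helper (not part of the Inworld SDK): confirms a key is
// present and plausibly Base64. It cannot tell whether the key is valid.
function checkApiKey(key) {
  if (!key || key.trim() === '') {
    return { ok: false, reason: 'INWORLD_API_KEY is not set' };
  }
  if (key === 'your-base64-api-key-here') {
    return { ok: false, reason: 'INWORLD_API_KEY still holds the placeholder value' };
  }
  // Standard Base64 alphabet with optional trailing padding.
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(key)) {
    return { ok: false, reason: 'INWORLD_API_KEY does not look like Base64' };
  }
  return { ok: true, reason: 'INWORLD_API_KEY is set' };
}

const result = checkApiKey(process.env.INWORLD_API_KEY);
console.log(result.reason);
```

Run it in the same shell where you exported the variable; a new terminal session will not see the value unless you export it again.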

2. Install the SDK

Install the Realtime TTS SDK in the language of your choice. For JavaScript:
npm install @inworld/tts

3. Prepare your first streaming request

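Create a new file called inworld_stream_quickstart.py (Python) or inworld_stream_quickstart.js (JavaScript), confirm INWORLD_API_KEY is set in your environment, and copy the code below into the file. This example uses Linear PCM so the streamed chunks can be written directly into a WAV file. The JavaScript version uses import and top-level await, so Node.js must load the file as an ES module; if you see "Cannot use import statement outside a module", add "type": "module" to your package.json.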
import { InworldTTS } from '@inworld/tts';
import fs from 'fs';

const tts = InworldTTS();
const chunks = [];

for await (const chunk of tts.stream({
    text: "What a wonderful day to be a text-to-speech model! I'm super excited to show you how streaming works.",
    voice: 'Ashley',
    encoding: 'LINEAR16',
    sampleRate: 48000,
})) {
    chunks.push(chunk);
    console.log(`Received ${chunk.length} bytes`);
}

fs.writeFileSync('output_stream.wav', Buffer.concat(chunks));
console.log('Audio saved to output_stream.wav');

4. Run the code

Run the code. The console prints the size of each chunk as it arrives, and the audio file is written once the stream ends. For JavaScript:
node inworld_stream_quickstart.js
You should see a saved file called output_stream.wav. You can play this file with any audio player.
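
If a player rejects output_stream.wav, one thing worth checking is whether the file starts with a RIFF/WAVE header. This page does not say whether the streamed LINEAR16 chunks already carry one, so the following is only a sketch for the case where they turn out to be raw PCM. The helper names `hasWavHeader` and `pcmToWav` are hypothetical, and the mono channel count is an assumption; the 48000 Hz rate matches the request above.

```javascript
// Returns true when the buffer starts with a RIFF/WAVE header.
function hasWavHeader(buf) {
  return buf.length >= 12 &&
    buf.toString('ascii', 0, 4) === 'RIFF' &&
    buf.toString('ascii', 8, 12) === 'WAVE';
}

// Wraps raw 16-bit PCM in a canonical 44-byte WAV header.
// Assumption: mono audio unless `channels` says otherwise.
function pcmToWav(pcm, sampleRate = 48000, channels = 1) {
  const bitsPerSample = 16;
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);  // file size minus the first 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);              // size of the fmt chunk
  header.writeUInt16LE(1, 20);               // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);      // size of the audio payload
  return Buffer.concat([header, pcm]);
}

// One second of 16-bit mono silence at 48 kHz stands in for streamed audio.
const pcm = Buffer.alloc(96000);
const wav = hasWavHeader(pcm) ? pcm : pcmToWav(pcm);
console.log(wav.length); // 96044
```

In the streaming script, the same check could be applied to `Buffer.concat(chunks)` before the call to `fs.writeFileSync`.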

Make a non-streaming request

The synchronous endpoint is the simplest way to try Realtime TTS and works well for batch audio generation, pre-rendered content, and anywhere latency isn’t critical. Assuming you’ve already set up your API key and installed the SDK:

1. Prepare your first request

For Python or JavaScript, create a new file called inworld_quickstart.py or inworld_quickstart.js. Copy the corresponding code into the file.
import { InworldTTS } from '@inworld/tts';
import fs from 'fs';

const tts = InworldTTS();

const audio = await tts.generate({
    text: 'What a wonderful day to be a text-to-speech model!',
    voice: 'Ashley',
});

fs.writeFileSync('output.mp3', audio);
console.log('Audio saved to output.mp3');

2. Run the code

Run the code. For JavaScript:
node inworld_quickstart.js
You should see a saved file called output.mp3. You can play this file with any audio player.
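
For the batch use case mentioned above, the same call can be repeated over a list of lines. The sketch below is one possible shape, not an SDK feature: `renderBatch` and the `line_001.mp3` naming scheme are hypothetical, and the `generate` and `save` functions are passed in so the loop makes no assumptions beyond the `{ text, voice }` request shown in the example. Requests run one at a time, which is a conservative choice since rate limits are not covered here.

```javascript
// Hypothetical helper: renders each line in order with the supplied
// `generate` function and hands the resulting audio to `save`.
async function renderBatch(lines, generate, save, voice = 'Ashley') {
  const written = [];
  for (const [i, text] of lines.entries()) {
    // One request at a time; each call mirrors the single-request example.
    const audio = await generate({ text, voice });
    const file = `line_${String(i + 1).padStart(3, '0')}.mp3`;
    await save(file, audio);
    written.push(file);
  }
  return written; // file names in the order they were saved
}
```

With the client from the example above, this might be called as `await renderBatch(lines, (req) => tts.generate(req), fs.writeFileSync)`.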

Next Steps

Now that you’ve tried out the Realtime TTS API, you can explore more Realtime TTS capabilities.

Realtime TTS

Understand the capabilities of Inworld’s Realtime TTS models.

Voice Cloning

Create a personalized voice clone with just 5 seconds of audio.

Best Practices

Learn tips and tricks for synthesizing high-quality speech.