Inworld’s Realtime TTS models offer ultra-realistic, context-aware speech synthesis and precise voice cloning with zero data retention, so developers can build natural, engaging experiences with human-like speech quality at an accessible price point. The models are available through the API (streaming and non-streaming) or the TTS Playground.
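As a rough illustration of the non-streaming API path, the sketch below builds a synthesis request and decodes the returned audio. The endpoint URL, the `text`, `voiceId`, and `modelId` request fields, the `audioContent` response field, and the Basic authorization scheme are all assumptions made for illustration; none of them is stated on this page. The helper names (`build_request`, `decode_audio`, `synthesize`) are hypothetical. Check the API reference for the actual contract and the Models page for real model IDs.

```python
import base64
import json
import urllib.request

# ASSUMPTION: endpoint URL is illustrative; confirm it in the API reference.
TTS_URL = "https://api.inworld.ai/tts/v1/voice"


def build_request(text, voice_id, model_id, api_key):
    """Build a POST request for non-streaming synthesis (hypothetical helper)."""
    # ASSUMPTION: field names "text", "voiceId", "modelId" are not confirmed here.
    payload = {"text": text, "voiceId": voice_id, "modelId": model_id}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # ASSUMPTION: Basic auth with a pre-encoded API key.
            "Authorization": f"Basic {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def decode_audio(response_body):
    """Extract raw audio bytes from a JSON response body."""
    # ASSUMPTION: audio is returned base64-encoded under "audioContent".
    return base64.b64decode(json.loads(response_body)["audioContent"])


def synthesize(text, voice_id, model_id, api_key):
    """Send the request and return audio bytes (performs a network call)."""
    request = build_request(text, voice_id, model_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return decode_audio(response.read())
```

A call such as `synthesize("Hello!", "<voice-id>", "<model-id>", "<api-key>")` would return audio bytes that can be written to a file. Keeping request construction and response decoding in separate functions makes both easy to check without network access.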

Developer quickstart

Learn how to make your first API call with a guided tutorial.

TTS Playground

Try different TTS models and voice cloning in TTS Playground.

Code Examples

Browse ready-to-use GitHub samples for common use cases.
Using AI to code? Paste https://docs.inworld.ai/llms.txt into your assistant so it knows every page on this site. Want live search? Add the MCP server.

Models

Realtime TTS-2

Our flagship, top-ranked model — the best quality plus steerability

  • Natural language steering for more contextually aware speech
  • Support for 200+ languages and locales
  • Optimized for real-time use
  • High quality instant voice cloning
  • Enhanced timestamps with phonetic details and visemes

Realtime TTS 1.5 Max

Rich, expressive speech with maximum stability

  • Support for 15 languages
  • Optimized for real-time use (<200 ms median latency)
  • High quality instant voice cloning

Realtime TTS 1.5 Mini

Our ultra-fast model — for when latency is the top priority

  • Ultra-low latency (~120 ms median latency)
  • Support for 15 languages
  • High quality instant voice cloning
See the Models page for model IDs and full details.

Features

| Feature | Realtime TTS-2 | Realtime TTS 1.5 Max | Realtime TTS 1.5 Mini |
| --- | --- | --- | --- |
| Quality | Top-ranked flagship: best quality and steerability | High quality, maximum stability | Great quality at ultra-low latency |
| P50 latency | 200 ms | 200 ms | 120 ms |
| Instant voice cloning | | | |
| Professional voice cloning | | | |
| Custom pronunciation | | | |
| Multilingual | 200+ languages | 15 languages | 15 languages |
| Steering | | | |
| Pause controls | | | |
| Timestamp alignment | | | |
| On-premises deployments | | | |
| Zero data retention | | | |
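
The latency and language figures in the table can drive a simple model choice. The sketch below is a hypothetical helper, not part of any Inworld SDK: it encodes only the P50 latency and language counts listed above and returns the first model, in the order shown on this page, that fits a latency budget and a minimum language count. Display names are used because model IDs are listed on the Models page, not here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str             # display name from this page, not an API model ID
    p50_latency_ms: int   # P50 latency from the feature table
    languages: int        # language count from the table (lower bound for "200+")


# Values copied from the feature table; order matches the page (flagship first).
MODELS = [
    Model("Realtime TTS-2", 200, 200),
    Model("Realtime TTS 1.5 Max", 200, 15),
    Model("Realtime TTS 1.5 Mini", 120, 15),
]


def pick_model(max_p50_ms: int, min_languages: int = 1) -> Optional[Model]:
    """Return the first listed model within the latency budget, or None."""
    for model in MODELS:
        if model.p50_latency_ms <= max_p50_ms and model.languages >= min_languages:
            return model
    return None
```

With these figures, a 200 ms budget selects Realtime TTS-2, a 150 ms budget selects Realtime TTS 1.5 Mini, and a 150 ms budget combined with a need for 100 or more languages matches nothing, which signals that one of the two requirements has to be relaxed.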