As a more capable model, Realtime TTS 2.0 (inworld-tts-2) is better suited for multilingual applications, offering better pronunciation, more accurate intonation, and more natural-sounding speech. Please share any feedback with us in Discord.
Supported languages
inworld-tts-2, inworld-tts-1.5-max, and inworld-tts-1.5-mini support the following languages:
Arabic (ar) | Chinese (zh) | Dutch (nl) | English (en) |
French (fr) | German (de) | Hebrew (he) | Hindi (hi) |
Italian (it) | Japanese (ja) | Korean (ko) | Polish (pl) |
Portuguese (pt) | Russian (ru) | Spanish (es) |
inworld-tts-2 also offers experimental support for the following languages:
Afrikaans (af) | Albanian (sq) | Amharic (am) | Antankarana Malagasy |
Armenian (hy) | Assamese (as) | Azerbaijani (az) | Bagirmi Fulfulde |
Basque (eu) | Belarusian (be) | Bengali (bn) | Bulgarian (bg) |
Burmese (my) | Cantonese (yue) | Catalan (ca) | Cebuano (ceb) |
Croatian (hr) | Czech (cs) | Danish (da) | Eastern Bolivian Guaraní |
Eastern Yiddish (yih) | Egyptian Arabic (arz) | Estonian (et) | Filipino (fil) |
Finnish (fi) | Galician (gl) | Garhwali (gbm) | Georgian (ka) |
Greek (el) | Gujarati (gu) | Gulf Arabic (afb) | Haitian Creole (ht) |
Hijazi Arabic (acw) | Hungarian (hu) | Icelandic (is) | Indonesian (id) |
Javanese (jv) | Kannada (kn) | Kazakh (kk) | Konkani (kok) |
Lao (lo) | Latvian (lv) | Libyan Arabic (ayl) | Lithuanian (lt) |
Luxembourgish (lb) | Macedonian (mk) | Maithili (mai) | Malagasy (mg) |
Malay (ms) | Malayalam (ml) | Marathi (mr) | Masikoro Malagasy |
Mongolian (mn) | Najdi Arabic (ars) | Nepali (ne) | Northern Betsimisaraka Malagasy |
Northern Uzbek (uzn) | Norwegian Bokmål (nb) | Norwegian Nynorsk (nn) | Odia (or) |
Omani Arabic (acx) | Orma | Pashto (ps) | Persian (fa) |
Piedmontese (pms) | Plateau Malagasy | Pular | Pulaar |
Punjabi (pa) | Romanian (ro) | Serbian (sr) | Sindhi (sd) |
Sinhala (si) | Slovak (sk) | Slovenian (sl) | Southern Betsimisaraka Malagasy |
Swahili (sw) | Swedish (sv) | Tamil (ta) | Telugu (te) |
Tesaka Malagasy | Thai (th) | Tunisian Arabic (aeb) | Turkish (tr) |
Ukrainian (uk) | Urdu (ur) | Uzbek (uz) | Vietnamese (vi) |
Võro (vro) | Welsh (cy) |
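As a rough illustration of how the two tables above might be used in application code, the sketch below classifies a language code for a given model. The model IDs and language codes are copied from the tables; the names `MODELS`, `SUPPORTED`, `EXPERIMENTAL`, and `support_tier` are hypothetical and not part of any Inworld SDK. Falling back from a regional tag such as `pt-BR` to its base code is also an assumption. Experimental languages listed without a code (for example, Orma) cannot be looked up this way.

```python
# Model IDs and language codes are copied from the tables above.
# All names here are hypothetical helpers, not an official Inworld API.
MODELS = {"inworld-tts-2", "inworld-tts-1.5-max", "inworld-tts-1.5-mini"}

# Languages supported by all three models.
SUPPORTED = set("ar zh nl en fr de he hi it ja ko pl pt ru es".split())

# Languages with experimental support on inworld-tts-2 only
# (only the entries that the table lists with a code).
EXPERIMENTAL = set(
    """
    af sq am hy as az eu be bn bg my yue ca ceb hr cs da yih arz et
    fil fi gl gbm ka el gu afb ht acw hu is id jv kn kk kok lo lv ayl
    lt lb mk mai mg ms ml mr mn ars ne uzn nb nn or acx ps fa pms pa
    ro sr sd si sk sl sw sv ta te th aeb tr uk ur uz vi vro cy
    """.split()
)


def support_tier(model: str, language_code: str) -> str:
    """Return "supported", "experimental", or "unlisted" for a model and code."""
    # Assumption: regional tags such as "pt-BR" or "pt_BR" fall back to "pt".
    base = language_code.strip().lower().replace("_", "-").split("-")[0]
    if model in MODELS and base in SUPPORTED:
        return "supported"
    if model == "inworld-tts-2" and base in EXPERIMENTAL:
        return "experimental"
    return "unlisted"
```

A caller could use the result to warn users before synthesizing text in an experimental language, or to block requests for a language the chosen model does not list.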
Cross-lingual support
TTS 2.0 supports cross-lingual synthesis, where the same voice can be used across multiple languages. Depending on the voice and language, the voice may retain the accent of its original language when speaking a different one (e.g., a French voice may speak English with a French accent).
TTS-1.5 performs best when synthesizing text in the same language as the original voice. While cross-language synthesis is possible with certain voices, you’ll achieve the best quality, pronunciation, and naturalness by matching the voice’s native language to your text content.
To generate more native-sounding speech in a given language, you can also use Voice Localization, which is supported for all models.
Voice Localization
To localize a voice for native-sounding speech in a target language:
Open the Voices page in Portal
In Inworld Portal, go to Voices and select the My voices tab. Hover over an English voice to open its details panel on the right.
Voice localization is currently only supported for voices where the original audio was in English. Support for additional languages is coming soon.
Choose a target language
Pick a target language from the dropdown and click Localize, which will start generating localized prompt candidates. This may take up to 30 seconds.

Review candidates
After generation, you’ll see a few localized prompt candidates. Listen to each and pick the one that sounds most natural and native, then click Save. If none sound right, you can click Regenerate. You may want to change the script before regenerating, since the script influences the generated voice.
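The guidance in the Cross-lingual support and Voice Localization sections can be condensed into a small decision helper. This is only a sketch of one way to encode that guidance: the function name and return labels are hypothetical, and it assumes you track each voice's original language yourself as a lowercase code such as `en` or `fr`.

```python
def cross_lingual_advice(model: str, voice_language: str, text_language: str) -> str:
    """Suggest how to handle a voice whose original language may differ from the text.

    The return labels are illustrative, not values defined by Inworld.
    """
    voice = voice_language.strip().lower()
    text = text_language.strip().lower()

    # Matching the voice's original language to the text gives the best quality.
    if voice == text:
        return "native"

    # Voice Localization is currently limited to voices whose original audio
    # was in English, and it is available for all models.
    if voice == "en":
        return "consider-localization"

    # TTS 2.0 supports cross-lingual synthesis, though the voice may keep
    # the accent of its original language.
    if model == "inworld-tts-2":
        return "cross-lingual"

    # For the other models, a voice in the text's language is the safer choice.
    return "prefer-matching-voice"
```

For example, `cross_lingual_advice("inworld-tts-2", "fr", "en")` yields the `cross-lingual` label, which an application could surface as a note that the output may carry a French accent.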

Next steps
Voice Cloning
Clone a voice once and reuse it across languages.
Custom Pronunciation
Use IPA notation to control pronunciation of proper nouns and edge cases.
