Use Realtime TTS in your multilingual applications, with support for 100+ languages and cross-lingual capabilities that let you reuse a single voice across multiple languages.
Realtime TTS 2.0 (inworld-tts-2) is the more capable model and is better suited to multilingual applications, with improved pronunciation, more accurate intonation, and more natural-sounding speech. Please share any feedback with us in Discord.

Supported languages

inworld-tts-2, inworld-tts-1.5-max, and inworld-tts-1.5-mini support the following languages:
Arabic (ar), Chinese (zh), Dutch (nl), English (en),
French (fr), German (de), Hebrew (he), Hindi (hi),
Italian (it), Japanese (ja), Korean (ko), Polish (pl),
Portuguese (pt), Russian (ru), Spanish (es)
inworld-tts-2 also offers experimental support for the following languages:
Afrikaans (af), Albanian (sq), Amharic (am), Antankarana Malagasy,
Armenian (hy), Assamese (as), Azerbaijani (az), Bagirmi Fulfulde,
Basque (eu), Belarusian (be), Bengali (bn), Bulgarian (bg),
Burmese (my), Cantonese (yue), Catalan (ca), Cebuano (ceb),
Croatian (hr), Czech (cs), Danish (da), Eastern Bolivian Guaraní,
Eastern Yiddish (yih), Egyptian Arabic (arz), Estonian (et), Filipino (fil),
Finnish (fi), Galician (gl), Garhwali (gbm), Georgian (ka),
Greek (el), Gujarati (gu), Gulf Arabic (afb), Haitian Creole (ht),
Hijazi Arabic (acw), Hungarian (hu), Icelandic (is), Indonesian (id),
Javanese (jv), Kannada (kn), Kazakh (kk), Konkani (kok),
Lao (lo), Latvian (lv), Libyan Arabic (ayl), Lithuanian (lt),
Luxembourgish (lb), Macedonian (mk), Maithili (mai), Malagasy (mg),
Malay (ms), Malayalam (ml), Marathi (mr), Masikoro Malagasy,
Mongolian (mn), Najdi Arabic (ars), Nepali (ne), Northern Betsimisaraka Malagasy,
Northern Uzbek (uzn), Norwegian Bokmål (nb), Norwegian Nynorsk (nn), Odia (or),
Omani Arabic (acx), Orma, Pashto (ps), Persian (fa),
Piedmontese (pms), Plateau Malagasy, Pular, Pulaar,
Punjabi (pa), Romanian (ro), Serbian (sr), Sindhi (sd),
Sinhala (si), Slovak (sk), Slovenian (sl), Southern Betsimisaraka Malagasy,
Swahili (sw), Swedish (sv), Tamil (ta), Telugu (te),
Tesaka Malagasy, Thai (th), Tunisian Arabic (aeb), Turkish (tr),
Ukrainian (uk), Urdu (ur), Uzbek (uz), Vietnamese (vi),
Võro (vro), Welsh (cy)
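
If you want to check a language before sending a request, the two lists above can be encoded as a small lookup. The following is a minimal sketch, not part of any Inworld SDK: the function name `support_level` and the three return labels are hypothetical, and languages listed above without a code (for example Orma or Pular) are left out because there is no code to key them on.

```python
# Language codes copied from the lists above. Entries that appear in the
# documentation without a code are intentionally omitted.
CORE_LANGS = frozenset(
    "ar zh nl en fr de he hi it ja ko pl pt ru es".split()
)

EXPERIMENTAL_LANGS = frozenset(
    """
    af sq am hy as az eu be bn bg my yue ca ceb hr cs da yih arz et fil
    fi gl gbm ka el gu afb ht acw hu is id jv kn kk kok lo lv ayl lt
    lb mk mai mg ms ml mr mn ars ne uzn nb nn or acx ps fa pms
    pa ro sr sd si sk sl sw sv ta te th aeb tr uk ur uz vi vro cy
    """.split()
)

MODELS = frozenset(
    {"inworld-tts-2", "inworld-tts-1.5-max", "inworld-tts-1.5-mini"}
)


def support_level(model_id: str, lang_code: str) -> str:
    """Return 'supported', 'experimental', or 'unsupported' (hypothetical labels)."""
    if model_id not in MODELS:
        raise ValueError(f"unknown model: {model_id!r}")
    code = lang_code.strip().lower()
    if code in CORE_LANGS:
        return "supported"
    # Experimental languages are listed only for inworld-tts-2.
    if model_id == "inworld-tts-2" and code in EXPERIMENTAL_LANGS:
        return "experimental"
    return "unsupported"
```

For example, `support_level("inworld-tts-2", "th")` yields `"experimental"`, while the same code with `inworld-tts-1.5-mini` yields `"unsupported"`.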

Cross-lingual support

TTS 2.0 (inworld-tts-2) supports cross-lingual synthesis, where the same voice can be used across multiple languages. Depending on the voice and language, the voice may retain the accent of its original language when speaking a different one (e.g., a French voice may speak English with a French accent).

TTS 1.5 (inworld-tts-1.5-max and inworld-tts-1.5-mini) performs best when synthesizing text in the same language as the original voice. Cross-lingual synthesis is possible with certain voices, but you’ll get the best quality, pronunciation, and naturalness by matching the voice’s native language to your text content.

To generate more native-sounding speech in a given language, you can also use Voice Localization, which is supported for all models.
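
One way to act on this guidance in application code is to pick the model per request based on whether the text language matches the voice's original language. The sketch below shows one possible policy, not an official recommendation: the `Voice` class, `choose_model`, `build_payload`, and the voice ID are hypothetical, and the payload field names (`text`, `voiceId`, `modelId`) are assumptions that should be checked against the API reference.

```python
from dataclasses import dataclass

TTS_2 = "inworld-tts-2"
TTS_1_5_MAX = "inworld-tts-1.5-max"


@dataclass(frozen=True)
class Voice:
    voice_id: str     # hypothetical voice ID
    native_lang: str  # language code of the voice's original audio


def choose_model(voice: Voice, text_lang: str, same_lang_model: str = TTS_1_5_MAX) -> str:
    """Use TTS 2.0 for cross-lingual text, otherwise the caller's preferred model."""
    if text_lang != voice.native_lang:
        return TTS_2
    return same_lang_model


def build_payload(voice: Voice, text: str, text_lang: str) -> dict:
    """Build a request body; the field names are assumptions, not confirmed API."""
    return {
        "text": text,
        "voiceId": voice.voice_id,
        "modelId": choose_model(voice, text_lang),
    }


# The same French voice reused for French and English text.
french_voice = Voice(voice_id="example-french-voice", native_lang="fr")
payloads = [
    build_payload(french_voice, "Bonjour tout le monde", "fr"),
    build_payload(french_voice, "Hello everyone", "en"),
]
```

Here the French request keeps the 1.5 model and the English request switches to inworld-tts-2, while the voice ID is identical in both payloads.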

Voice Localization

To localize a voice for native-sounding speech in a target language:
1. Open the Voices page in Portal

In Inworld Portal, go to Voices and select the My voices tab. Hover over an English voice to open its details panel on the right.
Voice localization is currently only supported for voices where the original audio was in English. Support for additional languages is coming soon.
2. Click Localize Voice

With a voice selected, click Localize Voice in the right panel.

[Screenshot: Voices page with Localize Voice action]
3. Choose a target language

Pick a target language from the dropdown and click Localize to start generating localized prompt candidates. This may take up to 30 seconds.

[Screenshot: Localize Voice screen with target language dropdown]
4. Review candidates

After generation, you’ll see a few localized prompt candidates. Listen to each and pick the one that sounds most natural and native, then click Save.

If none sound right, you can click Regenerate. You may want to change the script before regenerating, since the script influences the generated voice.

[Screenshot: Review candidates screen with multiple localized voice options]
5. Use the localized voice

Once saved, the voice will automatically use your localized prompt when synthesizing text in that language. Try it out in TTS Playground or via API (the voice ID remains the same).
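
Because the voice ID does not change after localization, client code that already calls the API should not need a new identifier; only the text language differs between requests. The sketch below illustrates that, along with decoding a response. It is written under assumptions: the request fields (`text`, `voiceId`, `modelId`) and the base64 response field `audioContent` are assumed and should be verified against the API reference, the voice ID is made up, and no network call is made here.

```python
import base64
import json

VOICE_ID = "my-localized-voice"  # hypothetical; use your own voice ID


def request_body(text: str, voice_id: str = VOICE_ID, model_id: str = "inworld-tts-2") -> bytes:
    """Serialize a synthesis request (field names are assumptions)."""
    payload = {"text": text, "voiceId": voice_id, "modelId": model_id}
    return json.dumps(payload).encode("utf-8")


def extract_audio(response_body: bytes) -> bytes:
    """Decode audio from a JSON response with an assumed base64 'audioContent' field."""
    data = json.loads(response_body)
    encoded = data.get("audioContent")
    if not encoded:
        raise ValueError("response has no 'audioContent' field")
    return base64.b64decode(encoded)


# Same voice ID before and after localization; only the text language changes.
english_request = request_body("Good morning")
spanish_request = request_body("Buenos días")
```

Sending `english_request` and `spanish_request` with your HTTP client of choice and passing each response body to `extract_audio` would give you raw audio bytes to write to a file, assuming the response shape described above.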

Next steps

Voice Cloning

Clone a voice once and reuse it across languages.

Custom Pronunciation

Use IPA notation to control pronunciation of proper nouns and edge cases.