- Instant Voice Cloning - Clone a voice in minutes, with only 5-15 seconds of audio. Also known as zero-shot cloning. Available to all users through Portal.
- Voice Cloning via API - Instant voice cloning via API. Useful for workflow automation or enabling your users to clone their own voices.
- Professional Voice Cloning - For the highest quality, fine-tune a model with 30+ minutes of audio.
Professional voice cloning is currently not publicly available. To get access, please reach out to our sales team.
Don’t have audio samples? Use Voice Design to create a voice from a text description instead.
Instant Voice Cloning
Go to Inworld Portal
In Portal, select TTS Playground from the left-hand side panel. In the TTS Playground, click Create Voice and select Clone.
Upload or record audio samples

- Upload: Drag and drop or browse to upload up to 3 audio files. Accepted formats: wav, mp3, webm. Maximum total size is 16MB. Audio samples longer than 15 seconds will be automatically trimmed to 15 seconds.
- Record: Click “Record audio” and record your audio. You can use the suggested scripts to help guide your recording, or use your own script. For best results, record in a quiet place to minimize background noise, avoid mic noise, and speak with a variety of emotions to capture the full range of the voice.
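The upload limits above can be checked locally before you open Portal. The following is a minimal sketch, not Portal's actual validation logic: the function name `validate_samples` and the constants are hypothetical, and it assumes "16MB" means 16 * 1024 * 1024 bytes. It does not check audio duration, since longer samples are trimmed rather than rejected.

```python
from pathlib import Path

# Limits taken from the upload instructions above.
ACCEPTED_EXTENSIONS = {".wav", ".mp3", ".webm"}
MAX_FILES = 3
MAX_TOTAL_BYTES = 16 * 1024 * 1024  # assumption: 16MB interpreted as 16 MiB


def validate_samples(files):
    """Check (filename, size_in_bytes) pairs against the documented limits.

    Returns a list of human-readable problems; an empty list means the
    samples pass this local pre-check.
    """
    problems = []
    if not files:
        problems.append("no audio files provided")
    if len(files) > MAX_FILES:
        problems.append(f"too many files: {len(files)} (maximum {MAX_FILES})")
    for name, _size in files:
        ext = Path(name).suffix.lower()
        if ext not in ACCEPTED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {ext or '(none)'}")
    total = sum(size for _name, size in files)
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds {MAX_TOTAL_BYTES}")
    return problems
```

To check real files, build the input with something like `[(p.name, p.stat().st_size) for p in paths]`, where `paths` is a list of `pathlib.Path` objects.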
Check out our Voice Cloning Best Practices for helpful tips and tricks to improve the quality of your voice clones.
Test your cloned voice

Use your cloned voice via API

Use the cloned voice's ID as the voiceId when making an API call. See our Quickstart to learn how to make your first API call.
Voice Cloning API Reference And Examples
If you want to automate voice cloning (for example, to support creator onboarding at scale), use the Voice Cloning API.
- API reference: Clone a voice
- Python example: example_voice_clone.py
- JavaScript example: example_voice_clone.js
Voice cloning has lower rate limits than regular speech synthesis. For details, see Rate limits.
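Because cloning requests are rate limited more tightly, automated workflows may want to retry with exponential backoff. The sketch below is generic and makes no claim about how the Voice Cloning API reports rate limiting: `RateLimitError` and `call_with_backoff` are hypothetical names, and you would raise `RateLimitError` yourself from whatever wrapper detects a rate-limit response (commonly HTTP 429, which is an assumption here).

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception: raise it when a request is rate limited."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponential backoff.

    The delay before retry k (0-based) is base_delay * 2**k plus random
    jitter in [0, base_delay]. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Here `func` would be a zero-argument callable that sends your clone request, for example a `lambda` wrapping your own request helper. The `sleep` parameter exists so the delay can be replaced in tests.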