curl --location 'https://api.inworld.ai/voices/v1/voices:design' \
  --header "Authorization: Basic $INWORLD_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "langCode": "EN_US",
    "designPrompt": "Warm, friendly, conversational voice with a subtle smile; natural pacing; clear articulation.",
    "previewText": "Hey! I am here. What can I help you with today?",
    "voiceDesignConfig": {
      "numberOfSamples": 1
    }
  }'

{
  "langCode": "EN_US",
  "previewVoices": [
    {
      "voiceId": "your_workspace_id__design-voice-38b05df9",
      "previewText": "Hey! I am here. What can I help you with today? I would be happy to assist you with whatever you need. Just let me know how I can be of service.",
      "previewAudio": "<base64-audio>"
    }
  ]
}

Design a voice based on a text description. Returns preview voices that can be published using the Publish Voice endpoint.
This endpoint generates up to three preview voices that can then be published to your voice library using the Publish Voice endpoint. For a guided workflow, see Voice Design in the docs.
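The curl call above can be mirrored in Python using only the standard library. This is a minimal sketch, not an official client: the function names `build_design_request` and `design_voice` and the 60-second timeout are assumptions made for illustration, while the URL, headers, and JSON field names are copied from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example in this reference.
DESIGN_URL = "https://api.inworld.ai/voices/v1/voices:design"


def build_design_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    return urllib.request.Request(
        DESIGN_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Same scheme as the curl example: "Basic" followed by the API key.
            "Authorization": f"Basic {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def design_voice(api_key: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and return the parsed JSON response body.

    The timeout value is an assumption, not a documented limit.
    """
    request = build_design_request(api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `design_voice(os.environ["INWORLD_API_KEY"], payload)` with the JSON body from the curl example should return an object shaped like the sample response, with the generated voices under `previewVoices`.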
previewText must result in generated audio that is 1-15 seconds long (~50-200 characters in English).
Request message for DesignVoice.
designPrompt: Text description of the desired voice. Must be in English and between 30 and 250 characters. For best results, include age, gender, accent, pitch, pace, and tone. See Voice Design Best Practices for more details.
Example: "A middle-aged male voice with a clear British accent speaking at a steady pace and with a neutral tone."
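The length rule for the design prompt is easy to check before sending a request. The helper below is a sketch under stated assumptions: `check_design_prompt` is a hypothetical name, and it assumes the service counts characters the way Python's `len()` does. It does not try to verify that the prompt is in English.

```python
MIN_PROMPT_CHARS = 30   # lower bound stated in this reference
MAX_PROMPT_CHARS = 250  # upper bound stated in this reference


def check_design_prompt(prompt: str) -> list:
    """Return a list of problems; an empty list means the length rule is met."""
    problems = []
    # Assumption: characters are counted as Python code points. The reference
    # does not say how the service treats whitespace or combining characters.
    length = len(prompt)
    if length < MIN_PROMPT_CHARS:
        problems.append(
            f"designPrompt has {length} characters; at least {MIN_PROMPT_CHARS} are required"
        )
    if length > MAX_PROMPT_CHARS:
        problems.append(
            f"designPrompt has {length} characters; at most {MAX_PROMPT_CHARS} are allowed"
        )
    return problems
```

Returning a list rather than raising lets a caller show every problem at once, for example in a form that collects the prompt from a user.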
langCode: Language code for the voice preview. One of EN_US, ZH_CN, KO_KR, JA_JP, RU_RU, AUTO, IT_IT, ES_ES, PT_BR, DE_DE, FR_FR, AR_SA, PL_PL, NL_NL, HI_IN, HE_IL.
previewText: Script for the generated voice to speak. Must result in audio that is 1-15 seconds long.
The script shapes the generated voice, because the model tailors the voice to suit the content it is speaking. See Voice Design Best Practices for more details.
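Because the real limit on the script is audio duration, a client can only approximate it. The sketch below uses the rough 50-200 character range this reference gives for English text; `preview_text_warning` is a hypothetical helper, and its result is best treated as a soft warning rather than a hard validation error.

```python
from typing import Optional

# Rough range quoted in this reference for English text. The actual rule is
# 1-15 seconds of generated audio, which cannot be measured before the call.
APPROX_MIN_CHARS = 50
APPROX_MAX_CHARS = 200


def preview_text_warning(text: str) -> Optional[str]:
    """Return a warning if the script is likely too short or too long, else None."""
    length = len(text.strip())
    if length < APPROX_MIN_CHARS:
        return f"previewText has {length} characters; the audio may be too short"
    if length > APPROX_MAX_CHARS:
        return f"previewText has {length} characters; the audio may exceed 15 seconds"
    return None
```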
voiceDesignConfig: Voice design configuration for generating the preview, for example {"numberOfSamples": 1}. If not provided, defaults to generating 1 sample.
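The request fields described above can be assembled with a small helper that leaves out `voiceDesignConfig` when the default of one sample is wanted. This is an illustrative sketch: `build_design_payload` is a hypothetical name, and the 1-3 range for `numberOfSamples` is an assumption inferred from the statement that up to three voices are generated per call.

```python
# Language codes listed in this reference.
SUPPORTED_LANG_CODES = {
    "EN_US", "ZH_CN", "KO_KR", "JA_JP", "RU_RU", "AUTO", "IT_IT", "ES_ES",
    "PT_BR", "DE_DE", "FR_FR", "AR_SA", "PL_PL", "NL_NL", "HI_IN", "HE_IL",
}


def build_design_payload(design_prompt, preview_text, lang_code="EN_US",
                         number_of_samples=None):
    """Return the JSON body for the design request as a dict."""
    if lang_code not in SUPPORTED_LANG_CODES:
        raise ValueError(f"unsupported langCode: {lang_code!r}")
    payload = {
        "langCode": lang_code,
        "designPrompt": design_prompt,
        "previewText": preview_text,
    }
    if number_of_samples is not None:
        # Assumption: valid values are 1 to 3, inferred from "up to 3 voices".
        if not 1 <= number_of_samples <= 3:
            raise ValueError("numberOfSamples is assumed to be between 1 and 3")
        payload["voiceDesignConfig"] = {"numberOfSamples": number_of_samples}
    # When number_of_samples is None the key is omitted, so the service
    # falls back to its documented default of 1 sample.
    return payload
```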
A successful response.
Response message for DesignVoice.
langCode: The language code of the generated previews. One of EN_US, ZH_CN, KO_KR, JA_JP, RU_RU, AUTO, IT_IT, ES_ES, PT_BR, DE_DE, FR_FR, AR_SA, PL_PL, NL_NL, HI_IN, HE_IL.
previewVoices: Preview voices generated (in DRAFT status). Up to 3 voices are generated each time you call this endpoint. Use Publish Voice to promote one to your library.
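To listen to the previews before publishing one, the `previewAudio` field of each entry can be decoded and written to disk. The sketch below assumes `previewAudio` is standard base64, as the `<base64-audio>` placeholder in the sample response suggests. The audio container format is not stated in this reference, so the files get a neutral `.bin` extension, and `save_previews` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_previews(response: dict, out_dir: str) -> list:
    """Decode each preview's audio and write it to out_dir; return the file paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for voice in response.get("previewVoices", []):
        # Assumption: previewAudio is standard (not URL-safe) base64.
        audio_bytes = base64.b64decode(voice["previewAudio"])
        # Keep only the final path component so a voiceId can never
        # write outside the target directory.
        file_name = Path(voice["voiceId"]).name + ".bin"
        path = target / file_name
        path.write_bytes(audio_bytes)
        paths.append(path)
    return paths
```

Each returned path is named after the preview's `voiceId`, the identifier shown in the sample response, which makes it easier to match a file to the voice you decide to publish.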