Pause controls let you manage pacing in generated speech. Use SSML break tags to insert silences at specific points.

SSML break tags

Use SSML break tags when you need precise control over the duration and position of a silence. The TTS API and Inworld Portal support break tags such as <break time="1s" /> in text input for streaming, non-streaming, and WebSocket requests, in all languages. You can specify the duration in milliseconds or seconds: <break time="1000ms" /> and <break time="1s" /> produce the same result.

Constraints:
  • Use well-formed SSML: include the angle brackets and the closing slash, for example <break time="1s" />.
  • Tag names and attributes are case-insensitive; for example, <BREAK time="2s" /> works.
  • Up to 20 break tags are supported per request. Any tags after the first 20 are ignored.
  • Each break can be at most 10 seconds long, that is, time="10s" or time="10000ms".
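The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the TTS API: the function name check_breaks and the regular expressions are hypothetical, and the sketch assumes that fractional values such as "1.5s" and uppercase units are acceptable, which the constraints do not state.

```python
import re

MAX_BREAKS = 20        # break tags honored per request (from the constraints)
MAX_BREAK_MS = 10_000  # longest allowed break: 10 seconds

# A well-formed, self-closing break tag. Matching is case-insensitive.
# Assumption: fractional values such as "1.5s" are accepted.
BREAK_RE = re.compile(
    r'<break\s+time\s*=\s*"(\d+(?:\.\d+)?)(ms|s)"\s*/>',
    re.IGNORECASE,
)
# Anything that looks like a break tag, well-formed or not.
ANY_BREAK_RE = re.compile(r'<\s*break\b[^>]*>', re.IGNORECASE)


def check_breaks(text: str) -> list[str]:
    """Return a list of problems found in text; an empty list means none."""
    problems = []
    candidates = ANY_BREAK_RE.findall(text)
    for i, tag in enumerate(candidates, start=1):
        match = BREAK_RE.fullmatch(tag)
        if match is None:
            problems.append(f"tag {i} is not well-formed: {tag}")
            continue
        value, unit = float(match.group(1)), match.group(2).lower()
        duration_ms = value * 1000 if unit == "s" else value
        if duration_ms > MAX_BREAK_MS:
            problems.append(f"tag {i} exceeds 10 seconds: {tag}")
    if len(candidates) > MAX_BREAKS:
        problems.append(
            f"{len(candidates)} break tags found; "
            f"tags after the first {MAX_BREAKS} are ignored"
        )
    return problems
```

Calling check_breaks on text that satisfies every constraint returns an empty list. A tag without the closing slash, or one longer than 10 seconds, yields one message per offending tag.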
Example:
One second pause <break time="1s" /> two seconds pause <BREAK time="2s" /> this is the end.<break time="500ms" />
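Text like the example above can also be assembled programmatically. This is a rough sketch under stated assumptions: break_tag and with_pauses are hypothetical helper names, durations are always written in milliseconds, and rejecting zero or negative durations is a choice of this sketch rather than a documented rule.

```python
def break_tag(ms: int) -> str:
    """Build a well-formed break tag for a pause of ms milliseconds."""
    # Upper bound comes from the 10-second limit; the lower bound is
    # an assumption of this sketch.
    if not 0 < ms <= 10_000:
        raise ValueError("break duration must be between 1 and 10000 ms")
    return f'<break time="{ms}ms" />'


def with_pauses(segments: list[tuple[str, int]]) -> str:
    """Join (text, pause_ms) pairs, placing a break tag after each text."""
    if len(segments) > 20:
        # Tags after the first 20 would be ignored, so fail early instead.
        raise ValueError("at most 20 break tags are supported per request")
    return " ".join(f"{text} {break_tag(ms)}" for text, ms in segments)


text = with_pauses([
    ("One second pause", 1000),
    ("two seconds pause", 2000),
    ("this is the end.", 500),
])
print(text)
```

The resulting string carries the same three pauses as the example, with 1000ms and 2000ms written in place of 1s and 2s.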