The node-tts template illustrates how to convert text to speech using the TTS node.
Architecture
- Backend: Inworld Runtime
- Frontend: N/A (CLI example)
Run the Template
- Download and extract the Inworld Templates.
- Install the Runtime SDK inside the `cli` directory.
- Set up your Base64 Runtime API key by copying the `.env-sample` file into a `.env` file in the `cli` folder and adding your API key (a sketch of the file follows this list).
- Try a different model or voice! You can specify the model using the `--modelId` parameter and a voice using the `--voiceName` parameter.
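For reference, the `.env` file only needs the API key. A minimal sketch, assuming a variable name of `INWORLD_API_KEY`; check `.env-sample` for the exact name the template expects:

```
# Hypothetical variable name; copy the exact key name from .env-sample.
INWORLD_API_KEY=<your Base64 Runtime API key>
```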
Understanding the Template
The main functionality of the template is contained in the `run` function, which demonstrates how to use the Inworld Runtime to convert text to speech using the TTS node. Let's break the template down in more detail:
1) Node Initialization
We start by creating the TTS node with the following configuration (a sketch follows the list below):
- `id`: A unique identifier for the node
- `speakerId`: The voice to use for synthesis (see available voices)
- `modelId`: The TTS model to use for synthesis
- `sampleRate`: Audio output sample rate
- `temperature`: Controls randomness in synthesis
- `speakingRate`: Controls the speed of speech (1.0 is the voice's natural speed)
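A minimal sketch of this step, not a verbatim copy of the template: the class name (`RemoteTTSNode` here), import path, and exact option shape are assumptions based on the parameter list above, and the voice and model names are placeholders.

```typescript
// Sketch only: class name, import path, and option shape are assumptions.
import { RemoteTTSNode } from '@inworld/runtime/graphs';

const ttsNode = new RemoteTTSNode({
  id: 'tts_node',            // unique identifier for the node
  speakerId: 'Ashley',       // voice to use for synthesis (placeholder voice name)
  modelId: 'inworld-tts-1',  // TTS model to use (placeholder model id)
  sampleRate: 48000,         // audio output sample rate in Hz
  temperature: 0.8,          // controls randomness in synthesis
  speakingRate: 1.0,         // 1.0 keeps the voice's natural speed
});
```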
2) Graph initialization
Next, we create the graph using the `GraphBuilder`, adding the TTS node and setting it as both the start and end node (a sketch follows the list below):
- `id`: A unique identifier for the graph
- `apiKey`: Your Inworld API key for authentication
- `enableRemoteConfig`: Whether to enable remote configuration (set to `false` for local execution)
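Continuing the sketch above. The constructor options follow the list just given; the builder method names (`addNode`, `setStartNode`, `setEndNode`, `build`) and the `INWORLD_API_KEY` environment variable are assumptions.

```typescript
import { GraphBuilder } from '@inworld/runtime/graphs';

const graph = new GraphBuilder({
  id: 'node_tts_graph',                  // unique identifier for the graph
  apiKey: process.env.INWORLD_API_KEY!,  // Inworld API key for authentication
  enableRemoteConfig: false,             // local execution, no remote configuration
})
  .addNode(ttsNode)       // register the TTS node created above
  .setStartNode(ttsNode)  // text input enters through the TTS node...
  .setEndNode(ttsNode)    // ...and synthesized audio leaves through it
  .build();
```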
3) Graph execution
Now we execute the graph with the text input directly.
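A sketch of this call, assuming the graph exposes a `start` method that accepts a plain string and returns an output stream; the actual entry point may differ in the template.

```typescript
// Hypothetical method name; the template may use a different entry point.
const outputStream = graph.start('Hello from the node-tts template!');
```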
4) Response handling
The audio generation results are handled using the `processResponse` method, which supports streaming audio responses (a sketch follows the list below):
- `TTSOutputStream`: Streaming audio responses containing both text and audio data
- `chunk.text`: The text being synthesized
- `chunk.audio.data`: The audio data as `Float32Array` samples
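A sketch of consuming the stream, assuming `outputStream.next()` yields a result whose `processResponse` takes per-type handlers; the handler signature and the sample buffering below are illustrative, not the template's exact code.

```typescript
const result = await outputStream.next();

await result.processResponse({
  // Handler invoked when the result is a streaming TTS response.
  TTSOutputStream: async (ttsStream: any) => {
    const samples: number[] = [];
    for await (const chunk of ttsStream) {
      console.log(`Synthesizing: "${chunk.text}"`); // text being synthesized
      samples.push(...chunk.audio.data);            // Float32Array audio samples
    }
    // `samples` now holds the full utterance at the configured sample rate,
    // ready to be written to a WAV file or played back.
  },
});
```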