This template project demonstrates how to use the Inworld Runtime in a full-stack Node.js app. Using the app, a person learning Spanish can have natural, real-time conversations with a fluent Spanish speaker who gently corrects their mistakes. Throughout the conversation, flashcards are generated for relevant vocabulary words. These flashcards can then be exported to an Anki deck for further spaced-repetition study.

Key concepts demonstrated:
  • Voice Activity Detection (VAD) - for parsing speech activity out of open mic audio
  • Speech-to-text (STT) - for understanding speech inputs
  • Jinja prompt templating - for passing app state into formatted context and instructions for an LLM
  • LLM - for generating the agent text response
  • Text-to-speech (TTS) - for generating agent speech audio
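
The concepts above chain together into one conversational turn: VAD isolates speech, STT transcribes it, a template renders the prompt, the LLM replies, and TTS voices the reply. The following is a minimal sketch of that flow, not the template's actual code. The `Stages` interface, `collectSpeech`, `renderPrompt`, and `runTurn` are hypothetical names invented for illustration and are not Inworld Runtime APIs; `renderPrompt` is a simplified stand-in that only substitutes `{{ name }}` placeholders, not a real Jinja engine.

```typescript
// Hypothetical types and stage signatures; none of these are Inworld Runtime APIs.
type AudioChunk = { samples: Float32Array; isSpeech: boolean };

interface Stages {
  stt: (speech: Float32Array) => Promise<string>; // speech audio -> transcript
  llm: (prompt: string) => Promise<string>;       // prompt -> agent text
  tts: (text: string) => Promise<Uint8Array>;     // agent text -> audio bytes
}

// VAD step (assumed to have already flagged each chunk): keep only the
// chunks marked as speech and join them into one utterance.
function collectSpeech(chunks: AudioChunk[]): Float32Array {
  const voiced = chunks.filter((c) => c.isSpeech);
  const total = voiced.reduce((n, c) => n + c.samples.length, 0);
  const out = new Float32Array(total);
  let offset = 0;
  for (const c of voiced) {
    out.set(c.samples, offset);
    offset += c.samples.length;
  }
  return out;
}

// Simplified stand-in for Jinja rendering: replaces {{ name }} placeholders
// with values from app state; unknown names render as an empty string.
function renderPrompt(template: string, state: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, key: string) => state[key] ?? "");
}

// One conversational turn: VAD -> STT -> prompt template -> LLM -> TTS.
async function runTurn(
  chunks: AudioChunk[],
  stages: Stages,
  template: string,
  history: string,
): Promise<{ userText: string; replyText: string; replyAudio: Uint8Array }> {
  const speech = collectSpeech(chunks);
  const userText = await stages.stt(speech);
  const prompt = renderPrompt(template, { history, user_input: userText });
  const replyText = await stages.llm(prompt);
  const replyAudio = await stages.tts(replyText);
  return { userText, replyText, replyAudio };
}
```

Keeping each stage behind a plain async function makes it straightforward to swap in stub implementations when testing the turn logic without a microphone or network access.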
Architecture
  • Backend: Inworld Runtime + Express.js
  • Frontend: Vanilla HTML/CSS/JavaScript
  • Communication: WebSocket
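
Because the frontend and backend communicate over a WebSocket, both sides need to agree on a message format. The sketch below shows one possible JSON envelope, assuming text frames and base64-encoded audio. The message types and field names (`audio`, `transcript`, `agent_text`, `agent_audio`, `flashcard`) are hypothetical and may not match what the template actually sends.

```typescript
// Hypothetical message shapes; the real template may use different names and fields.
type ClientMessage =
  | { type: "audio"; data: string } // base64-encoded mic audio (assumed encoding)
  | { type: "end_of_turn" };

type ServerMessage =
  | { type: "transcript"; text: string }   // STT result for the learner's speech
  | { type: "agent_text"; text: string }   // LLM reply text
  | { type: "agent_audio"; data: string }  // base64-encoded TTS audio
  | { type: "flashcard"; front: string; back: string };

// Serialize any message to a WebSocket text frame.
function encode(msg: ClientMessage | ServerMessage): string {
  return JSON.stringify(msg);
}

// Parse and validate a frame from the server; returns null for anything malformed.
function decodeServerMessage(raw: string): ServerMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const { type, text, data, front, back } = parsed as Record<string, unknown>;

  if (type === "transcript" && typeof text === "string") return { type: "transcript", text };
  if (type === "agent_text" && typeof text === "string") return { type: "agent_text", text };
  if (type === "agent_audio" && typeof data === "string") return { type: "agent_audio", data };
  if (type === "flashcard" && typeof front === "string" && typeof back === "string") {
    return { type: "flashcard", front, back };
  }
  return null;
}
```

Validating every incoming frame before acting on it keeps a vanilla JavaScript frontend from rendering partial or unexpected payloads.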

Understanding the Template

Depending on your learning style, you may want to:
  • Watch the tutorial videos walking through how the functionality is implemented using the Inworld Runtime
  • Clone the open-source GitHub repo to investigate the full code context (or add new features!)

App Design and Inworld Basics

Prompting and LLM Calls

Text-to-Speech

Jinja Templating for LLM Prompting

Adding a Second Graph for Flashcard Generation