Inworld’s platform provides access to a wide variety of state-of-the-art models. These models offer diverse capabilities, performance levels, price points, and deployment options, enabling you to select and customize the models that best match your use case and application needs.

Documentation Index
Fetch the complete documentation index at: https://dev.docs.inworld.ai/llms.txt
Use this file to discover all available pages before exploring further.
Overview of Model Offerings
This section provides high-level context on Inworld’s model offerings and how they can be used in your application.
- TTS: Text-to-Speech models generate high-quality audio for your application, such as powering a character’s voice.
- LLM: Large Language Models are powerful models that can intake inputs (typically text, but certain models may also support other modalities) and generate text outputs. These models can be used to determine in-game actions, power conversations, generate dynamic narratives, and more.
- Embeddings: Embeddings models convert text into high-dimensional vectors, which can be used to power intent detection, text similarity comparison, and retrieval-augmented generation (RAG).
TTS
Inworld’s Agent Runtime and API offer access to Inworld’s family of state-of-the-art TTS models, optimized for different use cases, quality levels, and performance requirements.

Realtime TTS 2.0
Our most powerful and expressive model, available in Research Preview
- Natural language steering for more contextually aware speech
- Support for 100+ languages
- Optimized for real-time use
- High quality instant voice cloning
- Enhanced timestamps with phonetic details and visemes
Realtime TTS 1.5 Max
Our flagship model, delivering the best balance of quality and speed
- Rich, expressive, contextually aware speech
- Support for 15 languages
- Optimized for real-time use (<200ms median latency)
- High quality instant voice cloning
- Enhanced timestamps with phonetic details and visemes
Realtime TTS 1.5 Mini
Our ultra-fast, most cost-efficient model, for when latency is the top priority.
- Ultra-low latency (~120ms median latency)
- Support for 15 languages
- Radically affordable pricing
- High quality instant voice cloning
- Enhanced timestamps with phonetic details and visemes
Models overview
| Name | Model ID | Description | Supported languages |
|---|---|---|---|
| Realtime TTS 2.0 | inworld-tts-2 | Our newest, most powerful model with natural language steering and stronger multilingual capabilities | 100+ languages (see Languages) |
| Realtime TTS 1.5 Max | inworld-tts-1.5-max | #1 ranked model, best balance of quality and speed, with enhanced timestamps | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Realtime TTS 1.5 Mini | inworld-tts-1.5-mini | Ultra-fast, most cost-efficient model, with enhanced timestamps | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Realtime TTS Max (deprecated) | inworld-tts-1-max | Our most powerful previous-generation model, with basic timestamps support | en, de, es, fr, it, ja, ko, nl, pl, pt, ru, zh, hi |
| Realtime TTS (deprecated) | inworld-tts-1 | Our fastest previous-generation model, with basic timestamps support | en, de, es, fr, it, ja, ko, nl, pl, pt, ru, zh, hi |
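As a concrete sketch, a synthesis request for one of the models above might be assembled as follows. The field names, voice value, and request shape here are illustrative assumptions, not the exact API schema; consult the TTS API reference for the real contract.

```python
# Sketch: build a JSON payload for a hypothetical TTS synthesis endpoint.
# Only the model IDs come from the table above; every other field name
# ("text", "voice", "language") is an assumption for illustration.

def build_tts_request(text: str, model_id: str = "inworld-tts-1.5-max",
                      voice: str = "Ashley", language: str = "en") -> dict:
    """Assemble a synthesis request body for the given TTS model."""
    supported = {
        "inworld-tts-2", "inworld-tts-1.5-max", "inworld-tts-1.5-mini",
        "inworld-tts-1-max", "inworld-tts-1",  # last two are deprecated
    }
    if model_id not in supported:
        raise ValueError(f"unknown model ID: {model_id}")
    return {
        "model": model_id,
        "text": text,
        "voice": voice,        # hypothetical voice name
        "language": language,  # one of the model's supported language codes
    }

payload = build_tts_request("Welcome back, traveler.")
```

Validating the model ID client-side (as above) gives an immediate error instead of a round trip when a deprecated or mistyped ID slips in.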
inworld-tts-1 and inworld-tts-1-max are deprecated and will be retired. We will give users advance notice of the exact retirement date once it is finalized to ensure a smooth transition. We recommend migrating to inworld-tts-1.5-mini or inworld-tts-1.5-max as soon as possible to avoid disruptions.

LLM
Chat Completion
Inworld provides access to hundreds of LLMs from various providers through a unified Chat Completions API.
- Available models: See the List Models API or the Models page in the Inworld Portal for the full list of supported models and providers
- Pricing: Visit inworld.ai/pricing for model pricing details
- Specific model: Call a model directly using the `provider/model` format (e.g., `openai/gpt-5`)
- Auto-select: Set `model` to `"auto"` to automatically pick the best model based on price, latency, or performance
- Router: Create a router for conditional routing, A/B testing, and reusable configurations, then reference it via the `model` field (e.g., `my-router`)
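The three addressing modes above differ only in the `model` field of the request. A minimal sketch, assuming the common OpenAI-style Chat Completions request shape (treat any field beyond `model` and `messages` as unverified):

```python
# Sketch: the three ways to address a model in a Chat Completions request.
# The body shape follows the widely used OpenAI-style schema; the router
# name "my-router" is a placeholder for a router you have created.

def chat_request(model: str, user_text: str) -> dict:
    """Build a minimal chat completion request body."""
    return {
        "model": model,  # "provider/model", "auto", or a router name
        "messages": [{"role": "user", "content": user_text}],
    }

direct = chat_request("openai/gpt-5", "Plan the next quest step.")  # specific model
auto = chat_request("auto", "Plan the next quest step.")            # auto-select
routed = chat_request("my-router", "Plan the next quest step.")     # router reference
```

Because only the `model` string changes, switching an application from a pinned model to auto-selection or a router is a one-line configuration change.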
Embeddings
| Provider | Service Provider | Model ID | Description |
|---|---|---|---|
| Inworld | SERVICE_PROVIDER_INWORLD | BAAI/bge-large-en-v1.5 | Great for English text |
| Inworld | SERVICE_PROVIDER_INWORLD | sentence-transformers/paraphrase-multilingual-mpnet-base-v2 | Great for multi-lingual text |
| OpenAI | SERVICE_PROVIDER_OPENAI | text-embedding-3-small | General purpose, efficient embedder with support for English and multi-lingual text. |
Terms of Service
You must comply with the terms of service and policies of any third-party model providers you use through Inworld’s platform; violations may result in deactivation of your account.
- Anthropic: https://www.anthropic.com/legal/commercial-terms
- Cerebras: https://www.cerebras.ai/terms-of-service
- DeepInfra: https://deepinfra.com/terms
- Fireworks: https://fireworks.ai/terms-of-service
- Google Vertex: https://cloud.google.com/terms/
- Groq: https://groq.com/terms-of-use
- Mistral: https://mistral.ai/terms/#terms-of-use
- OpenAI: https://openai.com/policies/row-terms-of-use/
- Tenstorrent: https://tenstorrent.com/terms
- XAI: https://x.ai/legal/terms