Models - Inworld AI Documentation

Inworld’s platform provides access to a wide variety of state-of-the-art models. These models offer diverse capabilities, performance levels, price points, and deployment options, enabling users to select and customize models that best match their specific use cases and application needs.

Overview of Model Offerings

This section provides some high-level context on Inworld’s model offerings, and how they can be used in your application.

TTS: Text-to-Speech models can be used to generate high-quality audio for your application, such as powering a character’s voice.
LLM: Large Language Models are powerful models that can intake inputs (typically text, but certain models may also support other modalities) and generate text outputs. These models can be used to determine in-game actions, power conversations, generate dynamic narratives, and more.
STT: Speech-to-Text models can be used to transcribe text from audio, powering voice-driven interactions and real-time transcription features in your application.
Embeddings: Embeddings models convert text into high-dimensional vectors, which can be used to power intent detection, text similarity comparison, and retrieval-augmented generation (RAG).

TTS

Inworld’s Runtime and API offers access to Inworld’s family of state-of-the-art TTS models, optimized for different use cases, quality levels, and performance requirements.

Support for additional TTS providers in Runtime is coming soon!

Inworld TTS 1.5 Max

Our flagship model, delivering the best balance of quality and speed

Rich, expressive, contextually aware speech
Support for 15 languages
Optimized for real-time use (<200ms median latency)
High quality instant voice cloning
Enhanced timestamps with phonetic details and visemes

Inworld TTS 1.5 Mini

Our ultra-fast, most cost-efficient model. For when latency is the top priority.

Ultra-low latency (~120ms median latency)
Support for 15 languages
Radically affordable pricing
High quality instant voice cloning
Enhanced timestamps with phonetic details and visemes

Models overview

Name	Model ID	Description	Supported languages
Llama Inworld TTS 1.5 Max	`inworld-tts-1.5-max`	Flagship model, best balance of quality and speed, with enhanced timestamps	`en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar`
Llama Inworld TTS 1.5 Mini	`inworld-tts-1.5-mini`	Ultra-fast, most cost-efficient model, with enhanced timestamps	`en`, `zh`, `ja`, `ko`, `ru`, `it`, `es`, `pt`, `fr`, `de`, `pl`, `nl`, `hi`, `he`, `ar`
Llama Inworld TTS Max	`inworld-tts-1-max`	Our most powerful previous generation model, with basic timestamps support	`en`, `de`, `es`, `fr`, `it`, `ja`, `ko`, `nl`, `pl`, `pt`, `ru`, `zh`, `hi`
Llama Inworld TTS	`inworld-tts-1`	Our fastest previous generation model, with basic timestamps support	`en`, `de`, `es`, `fr`, `it`, `ja`, `ko`, `nl`, `pl`, `pt`, `ru`, `zh`, `hi`

LLM

Inworld’s SDKs and LLM API offers access to cloud-hosted LLMs via the Chat Completion endpoint. To call a model, you’ll need to specify both the model name and the service provider, which is the provider hosting the model. Below is an overview of the available service providers and models.

Chat Completion

When specifying a model name (e.g., “gpt-5”, “claude-opus-4-1”), use the exact model identifier (with the same capitalization) as listed in the provider’s official documentation.

	Provider	Model
Anthropic	anthropic	Any Anthropic LLMs, such as: claude-opus-4-5 claude-haiku-4-5 claude-sonnet-4-5 claude-opus-4-1 claude-opus-4-0 claude-sonnet-4-0 claude-3-7-sonnet-20250219 claude-3-5-haiku-20241022 claude-3-5-haiku-latest
Cerebras	cerebras	Any Cerebras LLMs, such as: llama3.1-8b llama-3.3-70b gpt-oss-120b qwen-3-32b
DeepInfra	deepinfra	Any DeepInfra LLMs, such as: Qwen/Qwen2.5-72B-Instruct mistralai/Mixtral-8x7B-Instruct-v0.1
Fireworks	fireworks	Any Fireworks LLMs, such as: accounts/fireworks/models/gpt-oss-120b accounts/fireworks/models/gpt-oss-20b accounts/fireworks/models/deepseek-v3-0324
Google (Gemini)	google	Any Gemini LLMs, such as: gemini-3-pro-preview gemini-3-flash-preview gemini-2.5-pro gemini-2.5-flash gemini-2.5-flash-lite
Groq	groq	Any Groq LLMs, such as: llama-3.3-70b-versatile llama-3.1-8b-instant openai/gpt-oss-20b
Mistral	mistral	mistral-large-latest mistral-medium-latest mistral-small-latest mistral-tiny-latest pixtral-12b-2409 ministral-8b-latest
OpenAI	openai	Any OpenAI LLMs, such as: gpt-5.2 gpt-4o gpt-4o-2024-11-20 gpt-4o-mini gpt-4-turbo gpt-4.1 gpt-3.5-turbo
Tenstorrent	tenstorrent	tenstorrent/Llama-3.3-70B-Instruct
XAI	xai	Any XAI Grok LLMs, such as: grok-4-0709 grok-4 grok-3 grok-3-mini

Embeddings

	Provider	Model ID	Description
Inworld	inworld	`BAAI/bge-large-en-v1.5`	Great for English text
Inworld	inworld	`sentence-transformers/paraphrase-multilingual-mpnet-base-v2`	Great for multi-lingual text

Terms of Service

You may not violate the terms of service or policies of third-party model providers using Inworld’s platform or your account will be subject to deactivation.

Anthropic: https://www.anthropic.com/legal/commercial-terms
Cerebras: https://www.cerebras.ai/terms-of-service
DeepInfra: https://deepinfra.com/terms
Fireworks: https://fireworks.ai/terms-of-service
Google Vertex: https://cloud.google.com/terms/
Groq: https://groq.com/terms-of-use
Mistral: https://mistral.ai/terms/#terms-of-use
OpenAI: https://openai.com/policies/row-terms-of-use/
Tenstorrent: https://tenstorrent.com/terms
XAI: https://x.ai/legal/terms

​Overview of Model Offerings

​TTS

Inworld TTS 1.5 Max

​Our flagship model, delivering the best balance of quality and speed

Inworld TTS 1.5 Mini

​Our ultra-fast, most cost-efficient model. For when latency is the top priority.

​Models overview

​LLM

​Chat Completion

​Embeddings

​Terms of Service

Overview of Model Offerings

TTS

Our flagship model, delivering the best balance of quality and speed

Our ultra-fast, most cost-efficient model. For when latency is the top priority.

Models overview

LLM

Chat Completion

Embeddings

Terms of Service