Provider Routing - Inworld AI Documentation

You can specify a model without a provider prefix (e.g., gpt-oss-120b instead of groq/gpt-oss-120b), and the API will automatically select a provider for you. Optionally, use the model_selection.provider field in your router config to control how providers are selected. By default, the provider with the lowest latency is selected, and if it fails, the next best provider is tried automatically.

See all models via the List Models endpoint. Currently supported providers include OpenAI, Anthropic, Google Vertex AI, Google AI Studio, Cerebras, DeepInfra, Fireworks, Mistral, Groq, and xAI. Contact us to request support for an additional provider or model.

Provider configuration

Field	Type	Default	Description
`order`	`string[]`	-	Explicit list of providers to try, in order. When specified, providers are tried in this exact order.
`allow_fallbacks`	`boolean`	`true`	Whether to fall back to the next provider if the first one fails.
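
As a rough illustration of how these two fields fit together, the sketch below builds a `model_selection.provider` fragment in Python and prints it as JSON. The helper name `provider_config` is hypothetical, the lowercase provider identifier `cerebras` is an assumption (only `groq` appears as a prefix in this page), and where the fragment sits inside a full router config is not shown here.

```python
import json


def provider_config(order=None, allow_fallbacks=True):
    """Build a model_selection.provider fragment (hypothetical helper).

    `order` and `allow_fallbacks` are the fields from the table above;
    `order` is omitted when not given, since it has no default.
    """
    provider = {"allow_fallbacks": allow_fallbacks}
    if order is not None:
        provider["order"] = list(order)
    return {"model_selection": {"provider": provider}}


# Try groq first, then cerebras (assumed identifier), with no fallback
# past the first provider.
config = provider_config(order=["groq", "cerebras"], allow_fallbacks=False)
print(json.dumps(config, indent=2))
```

Leaving `order` out keeps the default behavior described above, where the provider is chosen automatically.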

How provider selection works

When no provider.order is specified, the sort criteria determine the order in which providers are tried. If no sort is specified either, providers are ordered by latency (fastest first). When provider.order is specified, providers are tried in the exact order listed. In that case, sort does not affect the provider order, but it still applies to model fallbacks if any are specified. The ignore field applies to providers regardless of whether order is specified.
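
The rules above can be modeled with a small Python function. This is only a sketch of the described behavior, not the service's implementation: the function name, the `metrics` layout, the `price` sort key, and all numbers are made up for illustration, and it assumes `ignore` is a list of provider names to exclude.

```python
def resolve_provider_order(metrics, order=None, sort="latency", ignore=()):
    """Return providers in the order they would be tried (illustrative).

    metrics: {provider: {metric_name: value}}, lower values are better.
    order:   explicit provider list; when given, `sort` is not applied.
    ignore:  providers to exclude (assumed semantics), applied either way.
    """
    ignored = set(ignore)
    if order is not None:
        # Explicit order wins; keep only providers we know about.
        candidates = [p for p in order if p in metrics]
    else:
        # No explicit order: sort by the chosen criterion (default latency).
        candidates = sorted(metrics, key=lambda p: metrics[p][sort])
    return [p for p in candidates if p not in ignored]


# Made-up measurements for three providers.
metrics = {
    "groq": {"latency": 120, "price": 0.30},
    "cerebras": {"latency": 90, "price": 0.50},
    "fireworks": {"latency": 200, "price": 0.20},
}

print(resolve_provider_order(metrics))
# ['cerebras', 'groq', 'fireworks']
print(resolve_provider_order(metrics, order=["groq", "fireworks"], sort="price"))
# ['groq', 'fireworks']
print(resolve_provider_order(metrics, order=["groq", "fireworks"], ignore=["groq"]))
# ['fireworks']
```

The second call shows that an explicit order overrides the sort criterion, and the third shows `ignore` still filtering providers when an order is given.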

Examples

// Automatically selects the lowest-latency provider for gpt-oss-120b
// Falls back to next-best provider if it fails
{
  "variant_id": "auto-provider",
  "model_id": "gpt-oss-120b"
}

Execution order

When using provider routing with model fallbacks, the full execution order is:

1. Try providers for the primary model in provider.order order (if specified) or sorted by sort criteria (default: latency).
2. If all providers fail and models is specified, fall back to the models list, sorted by sort criteria.
3. If all models fail, return an error.
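
The three steps can be sketched as a simple loop in Python. This is a simplified model under stated assumptions, not the actual routing code: `execute`, `AllRoutesFailed`, `fake_call`, and the model name `backup-model` are hypothetical, both lists are assumed to be already ordered, and each fallback model is treated as a single attempt.

```python
class AllRoutesFailed(Exception):
    """Raised when every provider and every fallback model has failed."""


def execute(primary_model, providers, fallback_models, call):
    """Walk the execution order described above (illustrative only).

    providers:       providers for the primary model, already ordered.
    fallback_models: fallback models, already ordered.
    call:            function (model, provider) that returns a response
                     or raises an exception on failure.
    """
    errors = []
    # Step 1: try each provider for the primary model, in order.
    for provider in providers:
        try:
            return call(primary_model, provider)
        except Exception as exc:
            errors.append((primary_model, provider, exc))
    # Step 2: all providers failed, so fall back to the models list.
    for model in fallback_models:
        try:
            return call(model, None)
        except Exception as exc:
            errors.append((model, None, exc))
    # Step 3: nothing succeeded, so surface an error.
    raise AllRoutesFailed(errors)


def fake_call(model, provider):
    # Stand-in for a real request: the primary model always fails here.
    if model == "gpt-oss-120b":
        raise RuntimeError(f"{provider} unavailable")
    return f"response from {model}"


result = execute("gpt-oss-120b", ["groq", "cerebras"], ["backup-model"], fake_call)
print(result)  # response from backup-model
```

Collecting the per-attempt errors before raising makes it easier to see which providers and models were tried when a request ultimately fails.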

Use Cases

Reliability: Ensure your application continues working even if a specific provider is down
Cost Optimization: Route to cheaper providers or fall back to cheaper models
Performance: Prefer low-latency providers, or fall back to faster models for time-sensitive requests
Provider Control: Lock to specific providers for compliance or consistency
