Overview
When you specify an exact model like"model": "openai/gpt-5", you can optionally use extra_body to add fallback models and optimization preferences.
You can also specify a model without a provider (e.g., "model": "gpt-oss-120b") and let the API automatically select the best provider, or control provider selection with the provider configuration.
Request Structure
Example
How It Works
- The API first attempts to use the specified model (
openai/gpt-5) - If the primary model fails or is unavailable, it automatically falls back to the models listed in
extra_body.models - Fallback models are sorted according to the
sortcriteria - Models or providers listed in
ignoreare excluded from fallbacks
Provider Routing
You can specify a model without a provider prefix (e.g.,gpt-oss-120b instead of groq/gpt-oss-120b), and the API will automatically select a provider for you. Use the provider parameter in extra_body to control how providers are selected.
By default, the provider with the lowest latency is selected, and if it fails, the next best provider is tried automatically.
To see which models support provider routing (i.e., are available from multiple providers), use the List Models endpoint.
Provider configuration
| Field | Type | Default | Description |
|---|---|---|---|
order | string[] | — | Explicit list of providers to try, in order. When specified, providers are tried in this exact order. |
allow_fallbacks | boolean | true | Whether to fall back to the next provider if the current one fails. |
How provider selection works
When noprovider.order is specified, the sort criteria determines the order providers are tried. If no sort is specified either, providers are ordered by latency (fastest first).
When provider.order is specified, providers are tried in the exact order listed — sort does not apply to the provider order (but still applies to models fallbacks if specified).
The ignore field applies to providers regardless of whether order is specified.
Examples
Execution order
When using provider routing with model fallbacks, the full execution order is:- Try providers for the primary model — in
provider.orderorder (if specified) or sorted bysortcriteria (default: latency) - If all providers fail and
modelsis specified — fall back to the models list, sorted bysortcriteria - If all models fail — return an error
Use Cases
- Reliability: Ensure your application continues working even if a specific model or provider is down
- Cost Optimization: Route to cheaper providers or fall back to cheaper models
- Performance: Prefer low-latency providers, or fall back to faster models for time-sensitive requests
- Provider Control: Lock to specific providers for compliance or consistency