gpt-oss-120b instead of groq/gpt-oss-120b), and the API will automatically select a provider for you. Optionally, use the model_selection.provider field in your router config to control how providers are selected.
By default, the provider with the lowest latency is selected, and if it fails, the next best provider is tried automatically.
To see which models are available from multiple providers, use the List Models endpoint.
Provider configuration
| Field | Type | Default | Description |
|---|---|---|---|
| order | string[] | - | Explicit list of providers to try, in order. When specified, providers are tried in this exact order. |
| allow_fallbacks | boolean | true | Whether to fall back to the next provider if the first one fails. |
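
As a rough illustration of the two fields above, the sketch below writes a router config as a Python dict and shows how order and allow_fallbacks could combine into a list of attempts. Only the model_selection.provider nesting and the two field names come from this page; the provider name "provider-b" and the helper providers_to_try are hypothetical and are not part of the documented API.

```python
# Illustrative router config. Only model_selection.provider, order, and
# allow_fallbacks are taken from the documentation; "provider-b" is a
# placeholder provider name.
router_config = {
    "model_selection": {
        "provider": {
            "order": ["groq", "provider-b"],
            "allow_fallbacks": True,
        }
    }
}


def providers_to_try(provider_cfg: dict) -> list[str]:
    """Return the providers that would be attempted, in order (hypothetical helper)."""
    order = list(provider_cfg.get("order", []))
    # allow_fallbacks defaults to true, per the table above.
    if provider_cfg.get("allow_fallbacks", True):
        return order
    # With fallbacks disabled, only the first provider is attempted.
    return order[:1]
```

Under these assumptions, setting allow_fallbacks to false effectively pins a request to the first entry in order, which is one way to read the "Provider Control" use case.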
How provider selection works
When no provider.order is specified, the sort criteria determine the order in which providers are tried. If no sort is specified either, providers are ordered by latency (fastest first).
When provider.order is specified, providers are tried in the exact order listed. In that case, sort does not affect the provider order, but it still applies to model fallbacks if they are specified.
The ignore field applies to providers regardless of whether order is specified.
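
The three rules above can be sketched as a small ordering function. This is a minimal model of the described behavior, not the service's implementation: the function name resolve_provider_order, the stats mapping, and the "price" metric are assumptions made for illustration, and the sketch assumes that lower values are better for every metric.

```python
from typing import Optional, Sequence


def resolve_provider_order(
    available: Sequence[str],
    stats: dict[str, dict[str, float]],
    order: Optional[Sequence[str]] = None,
    sort: Optional[str] = None,
    ignore: Sequence[str] = (),
) -> list[str]:
    """Return providers in the order they would be tried (illustrative only)."""
    ignored = set(ignore)
    if order:
        # An explicit order wins: sort is not applied to providers,
        # but ignore still filters the list.
        return [p for p in order if p not in ignored]
    # No explicit order: sort by the requested metric, defaulting to latency.
    metric = sort or "latency"
    candidates = [p for p in available if p not in ignored]
    # Assumption: lower is better for every metric used here.
    return sorted(candidates, key=lambda p: stats[p][metric])
```

For example, with three providers whose measured latencies differ, calling the function without order or sort returns them fastest first, while passing order returns exactly that list minus any ignored entries.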
Examples
Execution order
When using provider routing with model fallbacks, the full execution order is:
- Try providers for the primary model, in provider.order order (if specified) or sorted by sort criteria (default: latency)
- If all providers fail and models is specified, fall back to the models list, sorted by sort criteria
- If all models fail, return an error
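
The execution order above can be modeled as two loops followed by an error. The sketch below is a simplified simulation under stated assumptions: run_with_fallbacks, ProviderError, AllRoutesFailed, the call callback, and the model_metric mapping are all hypothetical names, the provider list is assumed to be already ordered, and fallback models are called with no explicit provider.

```python
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Hypothetical error raised when a single attempt fails."""


class AllRoutesFailed(Exception):
    """Raised when every provider and every fallback model has failed."""


def run_with_fallbacks(
    primary_model: str,
    ordered_providers: Sequence[str],
    fallback_models: Sequence[str],
    model_metric: dict[str, float],
    call: Callable[[str, Optional[str]], str],
) -> str:
    # Step 1: try each provider for the primary model, in the given order.
    for provider in ordered_providers:
        try:
            return call(primary_model, provider)
        except ProviderError:
            continue
    # Step 2: fall back to the models list, sorted by the sort metric
    # (assumption: lower is better).
    for model in sorted(fallback_models, key=lambda m: model_metric[m]):
        try:
            return call(model, None)
        except ProviderError:
            continue
    # Step 3: nothing succeeded, so surface an error to the caller.
    raise AllRoutesFailed(f"no provider or fallback model succeeded for {primary_model}")
```

Passing the request function in as call keeps the sketch independent of any particular HTTP client, so the ordering logic can be exercised with a fake that fails on demand.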
Use Cases
- Reliability: Ensure your application continues working even if a specific provider is down
- Cost Optimization: Route to cheaper providers or fall back to cheaper models
- Performance: Prefer low-latency providers, or fall back to faster models for time-sensitive requests
- Provider Control: Lock to specific providers for compliance or consistency