Overview

API outages are the nightmare of “AI-first” products. A Router acts as a high-availability load balancer, automatically routing requests to backup providers when your primary provider experiences issues.

The Problem

When your application depends on a single AI provider, any outage becomes your outage:
  • 429 Rate Limit errors → Your application stops working
  • 5xx Server Errors → Users see failures
  • Provider downtime → Complete service disruption
For AI-first products, this means lost revenue, frustrated users, and damaged reputation.

The Solution

A failover router detects failures and automatically re-routes requests to backup providers. Your application stays online even if a major AI provider goes dark, as long as at least one configured fallback is healthy, which helps you target 99.9% uptime for your AI features.

How It Works

  1. Request arrives at your Inworld Router endpoint
  2. Inworld Router attempts to call Provider A (e.g., OpenAI)
  3. If Provider A fails (429, 5xx, or timeout), Inworld Router automatically tries Provider B
  4. If Provider B also fails, Inworld Router tries Provider C
  5. Response is returned from the first available provider
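
The steps above can be sketched as a small loop. This is only an illustration of the ordered-failover idea, not Inworld Router's implementation: `ProviderError`, `is_retryable`, and `call_with_failover` are hypothetical names, and surfacing non-retryable errors immediately (rather than failing over) is an assumption.

```python
from typing import Callable, Optional, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error type: `status` is an HTTP status, or None for a timeout."""

    def __init__(self, status: Optional[int]) -> None:
        super().__init__(f"provider failed (status={status})")
        self.status = status


def is_retryable(status: Optional[int]) -> bool:
    # The failure classes named in step 3: 429, any 5xx, or a timeout.
    return status is None or status == 429 or 500 <= status <= 599


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[], str]]],
) -> Tuple[str, str]:
    """Try providers in order; return (name, response) from the first success."""
    last_error: Optional[ProviderError] = None
    for name, call in providers:
        try:
            return name, call()
        except ProviderError as err:
            if not is_retryable(err.status):
                raise  # assumption: non-retryable errors are not failed over
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```

Passing a list such as `[("openai", call_a), ("anthropic", call_b)]` returns the name and response of the first provider whose call succeeds.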

Implementation

Step 1: Create a Failover Router

Create a router with a primary model and fallbacks. Since failover applies to all requests unconditionally, use defaultRoute directly — no conditional routes or CEL expressions needed:
curl --request POST \
  --url https://api.inworld.ai/router/v1/routers \
  --header 'Authorization: Bearer <your-api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "name": "routers/failover-system",
    "defaults": {
      "text_generation_config": {
        "max_new_tokens": 1024,
        "temperature": 0.7
      }
    },
    "defaultRoute": {
      "route_id": "failover",
      "variants": [
        {
          "variant": {
            "variant_id": "primary-with-fallbacks",
            "model_id": "openai/gpt-5",
            "model_selection": {
              "models": [
                "anthropic/claude-opus-4-6",
                "google-ai-studio/gemini-2.5-flash"
              ]
            }
          },
          "weight": 100
        }
      ]
    }
  }'
This router uses defaultRoute with model_selection to specify fallback models. If the primary model (openai/gpt-5) fails, Inworld Router automatically tries the fallback models in order: anthropic/claude-opus-4-6, then google-ai-studio/gemini-2.5-flash.
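
If you generate router configurations from code, a helper along these lines may be convenient. It is a sketch that only reproduces the JSON shape from the curl example above; `build_failover_router` is a hypothetical helper, not part of any Inworld SDK, and it omits the optional `defaults` block for brevity.

```python
import json
from typing import Any, Dict, List


def build_failover_router(name: str, primary: str, fallbacks: List[str]) -> Dict[str, Any]:
    """Build a request body shaped like the curl example above."""
    return {
        "name": name,
        "defaultRoute": {
            "route_id": "failover",
            "variants": [
                {
                    "variant": {
                        "variant_id": "primary-with-fallbacks",
                        "model_id": primary,
                        # Fallbacks are tried in list order after the primary fails.
                        "model_selection": {"models": list(fallbacks)},
                    },
                    "weight": 100,
                }
            ],
        },
    }


body = build_failover_router(
    "routers/failover-system",
    "openai/gpt-5",
    ["anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-flash"],
)
payload = json.dumps(body, indent=2)  # usable as the --data payload
```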

Step 2: Configure Automatic Failover

When you call a specific model directly, you can list fallback models in the request itself, and Inworld Router handles failover automatically:
curl --location 'https://api.inworld.ai/v1/chat/completions' \
--header 'Authorization: Bearer <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
  "model": "openai/gpt-5",
  "messages": [
    {"role": "user", "content": "What is the capital of France?"}
  ],
  "extra_body": {
    "models": [
      "anthropic/claude-opus-4-6",
      "google-ai-studio/gemini-2.5-flash"
    ]
  }
}'
If OpenAI returns a 429 or 5xx error, Inworld Router automatically retries with Claude, then Gemini if needed.
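
The same request can be assembled in Python with only the standard library. This sketch mirrors the curl body above field for field (including the `extra_body` key exactly as shown there); `build_chat_request` is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import List

CHAT_URL = "https://api.inworld.ai/v1/chat/completions"


def build_chat_request(
    api_key: str, model: str, fallbacks: List[str], prompt: str
) -> urllib.request.Request:
    """Build (but do not send) a chat completion request with fallback models."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Same field layout as the curl body above.
        "extra_body": {"models": list(fallbacks)},
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(req, timeout=30)` and decode the JSON response; the 30-second timeout is an arbitrary illustrative value.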

Step 3: Use Router-Based Failover

For more control, reference the failover router from Step 1 in the model field:
curl --location 'https://api.inworld.ai/v1/chat/completions' \
--header 'Authorization: Bearer <your-api-key>' \
--header 'Content-Type: application/json' \
--data '{
  "model": "inworld/failover-system",
  "messages": [
    {"role": "user", "content": "Explain quantum computing in simple terms"}
  ]
}'

Failover Scenarios

Scenario 1: Rate Limit (429)

When you hit rate limits, Inworld Router automatically routes to the next provider:
Request → OpenAI (429 Rate Limit) → Anthropic (Success) ✅

Scenario 2: Server Error (5xx)

When a provider experiences server errors, failover kicks in:
Request → OpenAI (503 Service Unavailable) → Anthropic (503 Service Unavailable) → Google (Success) ✅

Scenario 3: Timeout

If a provider doesn’t respond in time, Inworld Router moves to the next:
Request → OpenAI (Timeout) → Anthropic (Success) ✅

Scenario 4: Complete Provider Outage

Even if an entire provider goes down, your application continues working:
Request → OpenAI (Complete Outage) → Anthropic (Success) ✅

Multi-Provider Architecture

For maximum resilience, configure failover across 3+ providers:
{
  "defaultRoute": {
    "route_id": "failover",
    "variants": [
      {
        "variant": {
          "variant_id": "primary-with-fallbacks",
          "model_id": "openai/gpt-5",
          "model_selection": {
            "models": [
              "anthropic/claude-opus-4-6",
              "google-ai-studio/gemini-2.5-flash",
              "groq/llama-3.3-70b-versatile"
            ]
          }
        },
        "weight": 100
      }
    ]
  }
}

Monitoring Failover Events

Track failover events in the response metadata:
{
  "id": "chatcmpl-...",
  "model": "anthropic/claude-opus-4-6",
  "choices": [...],
  "metadata": {
    "attempts": [
      {
        "provider": "openai",
        "model": "gpt-5",
        "status": "failed",
        "error": "429 Rate Limit"
      },
      {
        "provider": "anthropic",
        "model": "claude-opus-4-6",
        "status": "success"
      }
    ]
  }
}
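
A response shaped like the one above can be reduced to a compact summary for logging. This is a minimal sketch that assumes the `metadata.attempts` layout shown in the example; `summarize_attempts` is a hypothetical helper name.

```python
from typing import Any, Dict


def summarize_attempts(response: Dict[str, Any]) -> Dict[str, Any]:
    """Report which providers failed and which one served the response."""
    attempts = response.get("metadata", {}).get("attempts", [])
    failed = [a for a in attempts if a.get("status") == "failed"]
    served = next((a for a in attempts if a.get("status") == "success"), None)
    return {
        # True only when at least one attempt failed before a success.
        "failed_over": bool(failed) and served is not None,
        "failed_providers": [a.get("provider") for a in failed],
        "errors": [a.get("error") for a in failed],
        "served_by": served.get("provider") if served else None,
    }
```

For the example response, the summary reports `openai` as the failed provider and `anthropic` as the provider that served the request.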

Best Practices

  1. Diversify providers: Choose fallbacks that don’t share underlying infrastructure with your primary (e.g., avoid two providers hosted on the same cloud)
  2. Monitor failover rates: High failover rates may indicate you need to adjust rate limits or add capacity
  3. Test regularly: Periodically test failover scenarios to ensure they work as expected
  4. Set timeouts: Configure appropriate timeout values to avoid long waits before failover
  5. Log everything: Track all failover events for debugging and optimization
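
As one way to act on the "monitor failover rates" practice, the sketch below tracks the share of recent requests that needed a fallback. The class name, window size, and alert threshold are all illustrative assumptions; tune them to your own traffic.

```python
from collections import deque


class FailoverRateMonitor:
    """Rolling failover rate over the most recent `window` requests."""

    def __init__(self, window: int = 100, threshold: float = 0.2) -> None:
        # deque(maxlen=...) silently drops the oldest entry once full.
        self._events = deque(maxlen=window)
        self.threshold = threshold

    def record(self, failed_over: bool) -> None:
        self._events.append(bool(failed_over))

    def rate(self) -> float:
        # Fraction of recorded requests that were served by a fallback.
        return sum(self._events) / len(self._events) if self._events else 0.0

    def should_alert(self) -> bool:
        return self.rate() > self.threshold
```

Call `record(...)` once per response, for example with a flag derived from the response metadata, and check `should_alert()` on whatever schedule your alerting uses.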

Cost Considerations

Failover routing can help with cost optimization:
  • Primary: Use your preferred model (e.g., GPT-5)
  • Fallback: Use cost-effective alternatives (e.g., Gemini 2.5 Flash, Claude 3.5 Haiku)
  • Emergency: Use self-hosted models for critical paths

Advanced: Health Checks

In production, Inworld Router also routes away from unhealthy providers proactively. The router configuration is a standard failover route:
{
  "defaultRoute": {
    "route_id": "failover-with-health",
    "variants": [
      {
        "variant": {
          "variant_id": "primary-openai",
          "model_id": "openai/gpt-5",
          "model_selection": {
            "models": [
              "anthropic/claude-opus-4-6",
              "google-ai-studio/gemini-2.5-flash"
            ]
          }
        },
        "weight": 100
      }
    ]
  }
}
Health checks are handled automatically by Inworld Router: it continuously monitors provider health and routes away from unhealthy providers, so you don’t need to configure health checks in the router configuration.
