> ## Documentation Index
> Fetch the complete documentation index at: https://dev.docs.inworld.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# BigQuery

> Export router data to BigQuery and import your product events

export const BigQuerySetupGenerator = () => {
  const [projectId, setProjectId] = useState("");
  const [datasetId, setDatasetId] = useState("inworld_router_dataset");
  const [serviceAccountName, setServiceAccountName] = useState("inworld-sync");
  const [location, setLocation] = useState("US");
  const [mode, setMode] = useState("both");
  const [copiedSection, setCopiedSection] = useState("");
  const trimmedProjectId = projectId.trim();
  const trimmedDatasetId = datasetId.trim();
  const trimmedServiceAccountName = serviceAccountName.trim() || "inworld-sync";
  const trimmedLocation = location.trim() || "US";
  const serviceAccountEmail = trimmedProjectId ? `${trimmedServiceAccountName}@${trimmedProjectId}.iam.gserviceaccount.com` : `${trimmedServiceAccountName}@<project-id>.iam.gserviceaccount.com`;
  const needsReadSessionRole = mode === "import" || mode === "both";
  const needsDatasetEditor = mode === "export" || mode === "both";
  const needsDatasetViewer = mode === "import";
  const canGenerateCommands = trimmedProjectId.length > 0;
  const sections = [];
  if (trimmedDatasetId) {
    sections.push({
      id: "dataset",
      title: "1. Create the dataset",
      body: `bq --location=${trimmedLocation} mk --dataset ${trimmedProjectId}:${trimmedDatasetId}`
    });
  }
  sections.push({
    id: "service-account",
    title: trimmedDatasetId ? "2. Create the service account" : "1. Create the service account",
    body: `gcloud iam service-accounts create ${trimmedServiceAccountName} --project=${trimmedProjectId}`
  });
  const projectRoleCommands = [`gcloud projects add-iam-policy-binding ${trimmedProjectId} --member="serviceAccount:${serviceAccountEmail}" --role="roles/bigquery.jobUser"`];
  if (needsReadSessionRole) {
    projectRoleCommands.push(`gcloud projects add-iam-policy-binding ${trimmedProjectId} --member="serviceAccount:${serviceAccountEmail}" --role="roles/bigquery.readSessionUser"`);
  }
  sections.push({
    id: "project-roles",
    title: trimmedDatasetId ? "3. Grant project-level roles" : "2. Grant project-level roles",
    body: projectRoleCommands.join("\n")
  });
  sections.push({
    id: "impersonation",
    title: trimmedDatasetId ? "4. Grant Inworld impersonation access" : "3. Grant Inworld impersonation access",
    body: `gcloud iam service-accounts add-iam-policy-binding ${serviceAccountEmail} --project=${trimmedProjectId} --member="serviceAccount:inworld-sync@inworld-ai-prod.iam.gserviceaccount.com" --role="roles/iam.serviceAccountTokenCreator"`
  });
  if (trimmedDatasetId) {
    const datasetGrantCommands = [];
    if (needsDatasetEditor) {
      datasetGrantCommands.push(`bq query --project_id=${trimmedProjectId} --use_legacy_sql=false 'GRANT \`roles/bigquery.dataEditor\` ON SCHEMA \`${trimmedProjectId}.${trimmedDatasetId}\` TO "serviceAccount:${serviceAccountEmail}"'`);
    }
    if (needsDatasetViewer) {
      datasetGrantCommands.push(`bq query --project_id=${trimmedProjectId} --use_legacy_sql=false 'GRANT \`roles/bigquery.dataViewer\` ON SCHEMA \`${trimmedProjectId}.${trimmedDatasetId}\` TO "serviceAccount:${serviceAccountEmail}"'`);
    }
    sections.push({
      id: "dataset-permissions",
      title: "5. Grant dataset permissions",
      body: datasetGrantCommands.join("\n")
    });
  }
  const copyToClipboard = async (sectionId, value) => {
    try {
      await navigator.clipboard.writeText(value);
      setCopiedSection(sectionId);
      window.setTimeout(() => {
        setCopiedSection(current => current === sectionId ? "" : current);
      }, 1500);
    } catch (error) {
      console.error("Failed to copy setup commands", error);
    }
  };
  return <div className="not-prose rounded-2xl border border-zinc-950/10 bg-gradient-to-b from-zinc-50 to-white p-5 dark:border-white/10 dark:from-zinc-900 dark:to-zinc-950">
      <div className="mb-6">
        <p className="mb-2 text-sm font-semibold uppercase tracking-[0.2em] text-zinc-500 dark:text-zinc-400">
          Bash setup generator
        </p>
        <h3 className="mb-2 text-xl font-semibold text-zinc-950 dark:text-white">
          Generate your BigQuery setup commands
        </h3>
        <p className="text-sm leading-6 text-zinc-600 dark:text-zinc-300">
          Fill in your project details to generate the exact commands for export,
          import, or both. The defaults use dataset{" "}
          <code className="rounded bg-zinc-950/5 px-1 py-0.5 text-[0.85em] dark:bg-white/10">
            inworld_router_dataset
          </code>{" "}
          in location{" "}
          <code className="rounded bg-zinc-950/5 px-1 py-0.5 text-[0.85em] dark:bg-white/10">
            US
          </code>
          . BigQuery dataset IDs must use letters, numbers, and underscores.
        </p>
      </div>

      <div className="grid gap-4 md:grid-cols-2">
        <label className="block">
          <span className="mb-2 block text-sm font-medium text-zinc-900 dark:text-zinc-100">
            GCP project ID
          </span>
          <input type="text" value={projectId} onChange={event => setProjectId(event.target.value)} placeholder="my-gcp-project" className="w-full rounded-xl border border-zinc-300 bg-white px-3 py-2 text-sm text-zinc-950 outline-none transition focus:border-zinc-500 dark:border-zinc-700 dark:bg-zinc-900 dark:text-white dark:focus:border-zinc-400" />
        </label>

        <label className="block">
          <span className="mb-2 block text-sm font-medium text-zinc-900 dark:text-zinc-100">
            Setup mode
          </span>
          <select value={mode} onChange={event => setMode(event.target.value)} className="w-full rounded-xl border border-zinc-300 bg-white px-3 py-2 text-sm text-zinc-950 outline-none transition focus:border-zinc-500 dark:border-zinc-700 dark:bg-zinc-900 dark:text-white dark:focus:border-zinc-400">
            <option value="export">Export</option>
            <option value="import">Import</option>
            <option value="both">Both</option>
          </select>
        </label>

        <label className="block">
          <span className="mb-2 block text-sm font-medium text-zinc-900 dark:text-zinc-100">
            Dataset ID
          </span>
          <input type="text" value={datasetId} onChange={event => setDatasetId(event.target.value)} placeholder="inworld_router_dataset" className="w-full rounded-xl border border-zinc-300 bg-white px-3 py-2 text-sm text-zinc-950 outline-none transition focus:border-zinc-500 dark:border-zinc-700 dark:bg-zinc-900 dark:text-white dark:focus:border-zinc-400" />
        </label>

        <label className="block">
          <span className="mb-2 block text-sm font-medium text-zinc-900 dark:text-zinc-100">
            BigQuery location
          </span>
          <input type="text" value={location} onChange={event => setLocation(event.target.value)} placeholder="US" className="w-full rounded-xl border border-zinc-300 bg-white px-3 py-2 text-sm text-zinc-950 outline-none transition focus:border-zinc-500 dark:border-zinc-700 dark:bg-zinc-900 dark:text-white dark:focus:border-zinc-400" />
        </label>

        <label className="block md:col-span-2">
          <span className="mb-2 block text-sm font-medium text-zinc-900 dark:text-zinc-100">
            Service account name
          </span>
          <input type="text" value={serviceAccountName} onChange={event => setServiceAccountName(event.target.value)} placeholder="inworld-sync" className="w-full rounded-xl border border-zinc-300 bg-white px-3 py-2 text-sm text-zinc-950 outline-none transition focus:border-zinc-500 dark:border-zinc-700 dark:bg-zinc-900 dark:text-white dark:focus:border-zinc-400" />
        </label>
      </div>

      <div className="mt-5 rounded-xl border border-zinc-950/10 bg-white/80 p-4 dark:border-white/10 dark:bg-zinc-950/70">
        <div className="flex items-center justify-between gap-3">
          <p className="text-sm font-medium text-zinc-900 dark:text-zinc-100">
            Derived service account email
          </p>
          <button type="button" onClick={() => copyToClipboard("service-account-email", serviceAccountEmail)} className="rounded-lg border border-zinc-300 px-3 py-1.5 text-xs font-medium text-zinc-700 transition hover:bg-zinc-100 dark:border-zinc-700 dark:text-zinc-200 dark:hover:bg-zinc-800">
            {copiedSection === "service-account-email" ? "Copied" : "Copy"}
          </button>
        </div>
        <code className="mt-2 block text-sm text-zinc-700 dark:text-zinc-300">
          {serviceAccountEmail}
        </code>
      </div>

      {!trimmedDatasetId ? <div className="mt-4 rounded-xl border border-amber-300/60 bg-amber-50 px-4 py-3 text-sm leading-6 text-amber-950 dark:border-amber-700/50 dark:bg-amber-950/30 dark:text-amber-100">
          Dataset creation and dataset-level permission commands are hidden because
          the dataset field is empty. Leave it blank only if your dataset already
          exists and you plan to configure dataset access separately.
        </div> : null}

      {!canGenerateCommands ? <div className="mt-4 rounded-xl border border-zinc-950/10 bg-zinc-100 px-4 py-3 text-sm leading-6 text-zinc-700 dark:border-white/10 dark:bg-zinc-900 dark:text-zinc-300">
          Enter your GCP project ID to generate runnable commands.
        </div> : <div className="mt-6 space-y-4">
          {sections.map(section => <div key={section.id} className="overflow-hidden rounded-xl border border-zinc-950/10 bg-zinc-950 dark:border-white/10">
              <div className="flex items-center justify-between gap-3 border-b border-white/10 px-4 py-3">
                <p className="text-sm font-medium text-white">{section.title}</p>
                <button type="button" onClick={() => copyToClipboard(section.id, section.body)} className="rounded-lg border border-white/15 px-3 py-1.5 text-xs font-medium text-white transition hover:bg-white/10">
                  {copiedSection === section.id ? "Copied" : "Copy"}
                </button>
              </div>
              <pre className="overflow-x-auto p-4 text-sm leading-6 text-zinc-100">
                <code>{section.body}</code>
              </pre>
            </div>)}
        </div>}
    </div>;
};

Connect your [BigQuery](https://cloud.google.com/bigquery) project to export router exposure data for analysis and import your product events to define custom metrics in Portal.

## Prerequisites

### CLI setup generator

Use the generator below to produce the `gcloud` and `bq` commands for your setup. It defaults to dataset ID `inworld_router_dataset`, location `US`, service account name `inworld-sync`, and generates only the roles required for the mode you select.

<Note>
  Install and authenticate the Google Cloud CLI and `bq` before running these commands. If you prefer the Google Cloud console, use the manual flow in the next section.
</Note>

<Note>
  BigQuery dataset IDs can contain only letters, numbers, and underscores. For dataset location, use a valid BigQuery region or multi-region such as `US`, `EU`, or `us-central1`.
</Note>

<Note>
  IAM changes can take a few minutes to propagate across Google Cloud. If Portal fails its initial validation right after you grant permissions, wait a few minutes and try again before assuming the setup is incorrect.
</Note>

<BigQuerySetupGenerator />

### Console setup (alternative)

<Steps titleSize="h3">
  <Step title="Create a BigQuery dataset">
    Create a BigQuery dataset in your GCP project to store router data. The CLI generator above defaults to dataset ID `inworld_router_dataset` in location `US`. If you only plan to import events and already have a source dataset, you can skip creating a new dataset.
  </Step>

  <Step title="Create a service account">
    Go to the [Service Accounts page](https://console.cloud.google.com/iam-admin/serviceaccounts) in your GCP project and create a new service account for the Inworld sync job. On the permissions page, grant it the following project-level roles:

    * **BigQuery Job User** (`roles/bigquery.jobUser`) — allows the account to run BigQuery jobs
    * **BigQuery Read Session User** (`roles/bigquery.readSessionUser`) — required for importing events via the BigQuery Storage Read API. You can skip this role if you only plan to export data.
  </Step>

  <Step title="Grant Inworld impersonation access">
    Navigate to your newly created service account and open the **Principals with access** tab. Click **Grant access** and add the following principal:

    ```
    inworld-sync@inworld-ai-prod.iam.gserviceaccount.com
    ```

    Assign it the **Service Account Token Creator** role (`roles/iam.serviceAccountTokenCreator`). This allows the Inworld sync service to impersonate your service account for BigQuery operations.
  </Step>

  <Step title="Grant BigQuery dataset permissions">
    Go to [BigQuery](https://console.cloud.google.com/bigquery) and select the dataset you want to use with Inworld. On the dataset page, go to **Share → Manage Permissions** and click **Add principal**. Enter the email of the service account you created in Step 2 and assign the appropriate role:

    * **BigQuery Data Editor** — required for exporting router exposure data to your dataset
    * **BigQuery Data Viewer** — required for importing events from your source tables

    If you plan to use both export and import on the same dataset, **BigQuery Data Editor** is sufficient (it includes viewer permissions). Repeat this step for each dataset you plan to use. The CLI generator applies the same rule and only emits **BigQuery Data Editor** for the `Both` mode.
  </Step>

  <Step title="Configure the integration in Portal">
    <img src="https://mintcdn.com/inworldaidev/Xfq08e_91GsQXodV/img/router/connect-bigquery.png?fit=max&auto=format&n=Xfq08e_91GsQXodV&q=85&s=5a1718485b40e4349d065a2dffafe6ed" alt="" width="1480" height="1556" data-path="img/router/connect-bigquery.png" />

    In [Portal](https://portal.inworld.ai), go to **Routers** and click **Set Up Metrics** to open the **Data Integrations** page, then click **Add Integration** and select **BigQuery**. Enter the following:

    | Field                     | Value                                                  |
    | :------------------------ | :----------------------------------------------------- |
    | **Project ID**            | Your GCP project ID                                    |
    | **Dataset ID**            | Your BigQuery dataset ID from Step 1 (used for export) |
    | **Service Account Email** | The service account email from Step 2                  |
    | **BigQuery Location**     | The region where your dataset is located               |

    Click **Enable** to save your credentials. You can then toggle export and import independently in the **Data Sync** section.
  </Step>
</Steps>

## Exporting to BigQuery

Toggle **Export to BigQuery** on in the Data Sync section of your integration settings. When enabled, router exposure data is automatically written to your BigQuery dataset.

### Make a request to your router

Call the [Chat Completions API](/api-reference/routerAPI/chat-completions) with your router, populating a unique user ID in the `user` field:

```shell theme={"system"}
curl --request POST \
  --url https://api.inworld.ai/v1/chat/completions \
  --header 'Authorization: Basic <your-api-key>' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "inworld/my-router",
    "messages": [{"role": "user", "content": "Hello"}],
    "user": "<your-user-id>"
  }'
```

The router exposure data will include the `user` in the data export, so you can tie the specific request to the right user.

### Analyze your data

The router exposure data will flow to your BigQuery dataset automatically (data syncs every \~8 hours). You will see a row for each router, route, and variant that a user was exposed to every hour.

Each row will include:

* `alias_id` — Your router's ID
* `route_id` — The route that the user was routed to
* `variant_id` — The variant that the user was exposed to
* `user_id` — The user that was exposed to this router, route, and variant
* `timestamp` — The time of the most recent event in the hourly bucket in which the user was exposed to this router, route, and variant

You can use this data in conjunction with your existing metrics to understand how your metrics trend based on the router, route, or variant that the user was exposed to.

## Importing from BigQuery

Toggle **Import from BigQuery** on in the Data Sync section to start pulling your product events into Portal.

### Define events to import

In the **Events to import** section that appears after enabling import, click **Add Event**. For each event, you map columns from a BigQuery table to the fields Inworld expects:

| Field                     | Description                                           | Example                                  |
| :------------------------ | :---------------------------------------------------- | :--------------------------------------- |
| **Event Name**            | A logical name for this event in Portal.              | `user_donation`                          |
| **Description**           | Optional description of when this event is triggered. | `Fires when a user completes a donation` |
| **Dataset**               | The BigQuery dataset containing the source table.     | `user_data`                              |
| **Table Name**            | The BigQuery table to read events from.               | `donations`                              |
| **Event Field Name**      | The column that identifies the event type.            | `event_name`                             |
| **User ID Field Name**    | The column containing the user identifier.            | `user_id`                                |
| **Timestamp Field Name**  | The column with the event timestamp.                  | `created_at`                             |
| **Event UUID Field Name** | The column with a unique identifier per event row.    | `event_uuid`                             |

<Note>
  Event names can only contain letters, numbers, underscores, and hyphens.
</Note>

You can add multiple events, each pointing to a different table or filtering on a different event type within the same table.

### Schema requirements

Your BigQuery source table must meet these requirements:

| Column           | Type                      | Notes                                                                                                     |
| :--------------- | :------------------------ | :-------------------------------------------------------------------------------------------------------- |
| Timestamp field  | `TIMESTAMP` or `DATETIME` | Used to determine sync boundaries and event ordering                                                      |
| User ID field    | `STRING`                  | Must match the `user` value passed in Chat Completions API requests for correlation with router exposures |
| Event UUID field | `STRING`                  | Must be unique per row to ensure idempotent imports                                                       |

Any additional columns in the table are treated as event properties. Each property must be its own column with a native BigQuery type — for example, `user_message_count INT64`, `session_duration FLOAT64`, or `plan_name STRING`. Properties are automatically classified by type (string, number, boolean) during import so they can be used in metric definitions.

<Warning>
  Storing properties as JSON-encoded maps (e.g., a single `STRING` column containing `{"amount": 29.99, "count": 3}`) is **not supported**. The import pipeline reads each column value as-is, so JSON strings will be imported as plain strings rather than being parsed into individual properties.
</Warning>

The service account configured in [Prerequisites](#prerequisites) must have **BigQuery Data Viewer** access to the specified dataset and table.

### How import works

* Inworld reads from your BigQuery tables using the BigQuery Storage Read API for efficient data transfer.
* Events are synced approximately every **8 hours**. Imports are incremental — only new rows since the last sync are fetched.
* User identity is matched using the User ID field. To correlate imported events with router exposures, ensure the same user identifier is passed as the `user` field in your Chat Completions API requests and in the User ID column of your BigQuery table.
* Once events are imported, their numeric properties are automatically discovered and become available for metric configuration.

### Next steps

Once your events are importing, [create custom metrics](/router/data-integrations/custom-metrics) to track how routing decisions impact your KPIs.
