The Braintrust-hosted gateway (https://gateway.braintrust.dev/v1) is currently in beta and free to use. It is designed for production workloads, and uptime is tracked on the Braintrust status page under AI Gateway. Pricing will be announced before general availability.

Quickstart
Use your Braintrust API key to access all AI providers through one interface, whether you call it from the OpenAI, Anthropic, or Gemini SDK, or directly with cURL.
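As a concrete sketch of a gateway request built with only the Python standard library (the model name is illustrative, and actually sending the request requires a valid BRAINTRUST_API_KEY plus a configured provider key):

```python
import json
import os
import urllib.request

# OpenAI-compatible chat completion request pointed at the gateway.
# We only construct the request here; urllib.request.urlopen(req) would send it.
headers = {
    "Authorization": f"Bearer {os.environ.get('BRAINTRUST_API_KEY', 'sk-placeholder')}",
    "Content-Type": "application/json",
}
payload = {
    "model": "gpt-5-mini",  # illustrative; use any model your providers support
    "messages": [{"role": "user", "content": "Hello from the gateway"}],
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers=headers,
    method="POST",
)
print(req.full_url)
```

The same request works unchanged through an OpenAI-compatible SDK by pointing its base URL at https://gateway.braintrust.dev/v1 and using your Braintrust key as the API key.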
Set the base URL to https://gateway.braintrust.dev (no /v1) for the Gemini SDK.

The gateway also lets you mix SDKs and providers: OpenAI SDK → Claude, Anthropic SDK → Gemini, Gemini SDK → GPT.
Use the OpenAI SDK to call Anthropic’s Claude models.
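A minimal sketch of the idea (the model identifier is illustrative; check your configured providers for exact names): the request stays OpenAI-compatible, and only the model field changes.

```python
import json

# Same OpenAI-style request body as a GPT call, but with an Anthropic model
# name; the gateway routes it to Claude. "claude-sonnet-4-5" is illustrative.
payload = {
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Write a haiku about gateways"}],
}
body = json.dumps(payload)
print(payload["model"])
```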
Configure API keys
Configure two things for gateway requests: a Braintrust auth token to call the gateway, and AI provider keys that the gateway uses to run model requests.

Create a Braintrust auth token

Set BRAINTRUST_API_KEY to a Braintrust auth token and pass it in Authorization: Bearer ... when calling the gateway. This value can be either a user API key (sk-) or a service token (bt-st-).

- Use a user API key (sk-) for personal development workflows tied to your user account. Create one in organization API keys.
- Use a service token (bt-st-) for CI/CD pipelines, backend services, and shared automation. Create one in service tokens.
Add AI provider keys
- Organization-level AI providers Add provider API keys in your organization settings under AI providers. These keys are available across all projects and act as the default credentials for gateway requests.
- Project-level AI providers Configure provider API keys at the project level when a project needs separate billing, usage isolation, or different credentials. Project-level keys override organization defaults for requests made in that project’s context. You can also set up providers inline when running playgrounds or prompts. See Use project-level AI providers for how to specify a project when making gateway requests.
Supported providers
Standard providers include:

- OpenAI (GPT-5.2, GPT-5-mini, GPT-5-nano, etc.)
- Anthropic (Claude 4.6 Sonnet, Claude 4.6 Opus, Claude 4.5, etc.)
- Google (Gemini 3.0 Flash, Gemini 3.1 Pro, etc.)
- AWS Bedrock (Claude, Llama, Mistral models)
- Vertex AI (Gemini, Claude models)
- Azure OpenAI (GPT-5.2, GPT-5-mini, GPT-5-nano, etc.)
- Additional providers (Baseten, Fireworks, Groq, Mistral, etc.)
Custom providers
Add custom models or endpoints to use with the gateway. Custom providers support self-hosted models, fine-tuned models, and proprietary AI services. See Custom providers for setup instructions and configuration options. Once you’ve configured a custom provider, call it immediately through the same OpenAI-compatible client using your custom model name.

Enable logging
Log gateway requests to a Braintrust project by setting x-bt-parent with the value of logger.export() or span.export() to pass trace context and nest gateway calls under a parent span. See distributed tracing for details.
You can also set x-bt-parent directly with a prefix like project_id:, project_name:, or experiment_id: if you don’t need full tracing.
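For illustration, a sketch with the standard library, assuming a project named my-gateway-logs (for full tracing, the header value would instead come from logger.export() or span.export()):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",  # your Braintrust auth token
    # Prefix form for simple logging; pass logger.export()/span.export() output
    # here instead to nest the call under a parent span.
    "x-bt-parent": "project_name:my-gateway-logs",  # illustrative project name
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-parent"])
```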
Enable caching
The gateway automatically caches results and reuses them when possible. This is especially useful when developing and frequently re-running or evaluating the same prompts.

This section describes gateway response caching (controlled by x-bt-use-cache, x-bt-cache-ttl, and Cache-Control). This is separate from provider-side caching features such as prompt caching (for example, Anthropic’s cache_control) or any caching a provider may perform automatically.

- If the gateway serves a response from its cache, it returns the cached response without contacting the provider, so provider-side caching does not apply for that request.
- If the gateway forwards the request to the provider, you can still use provider-specific caching parameters/headers. The gateway does not translate provider caching settings across providers.

To guarantee that every request reaches the provider, set x-bt-use-cache: never.

Cache modes
There are three caching modes: auto (default), always, and never.

- In auto mode, requests are cached if they have temperature=0 or the seed parameter set and they are one of the supported paths.
- In always mode, requests are cached as long as they are one of the supported paths.
- In never mode, the cache is never read or written to.

The supported paths are /auto, /embeddings, /chat/completions, /completions, and /moderations.

Set the cache mode per request by passing the x-bt-use-cache header. The gateway returns x-bt-cached: HIT or MISS to indicate cache status.
Cache TTL
By default, cached results expire after 1 week. Set the TTL for individual requests by passing the x-bt-cache-ttl header. The TTL is specified in seconds and must be between 1 and 604800 (7 days).
Cache control
The gateway supports a limited set of Cache-Control directives:

- To bypass the cache, set the Cache-Control header to no-cache, no-store. This is semantically equivalent to setting the x-bt-use-cache header to never.
- To force a fresh request, set the Cache-Control header to no-cache. Without the no-store directive, the response will be cached for subsequent requests.
- To request a cached response with a maximum age, set the Cache-Control header to max-age=<seconds>. If the cached data is older than the specified age, the cache will be bypassed and a new response will be generated. Combine this with no-store to bypass the cache for a request without overwriting the current cached response.

If you set both Cache-Control directives and the x-bt-use-cache header, the cache control directives take precedence.
The gateway returns the x-bt-cached header in the response with HIT or MISS to indicate whether the response was served from the cache, the Age header to indicate the age of the cached response, and the Cache-Control header with the max-age directive to return the TTL of the cached response.
For example, to set the cache mode to always with a TTL of 2 days:
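A sketch of those headers with the standard library (the endpoint and auth value are placeholders; 2 days is 172800 seconds):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",
    "x-bt-use-cache": "always",               # cache regardless of temperature/seed
    "x-bt-cache-ttl": str(2 * 24 * 60 * 60),  # 2 days = 172800 seconds
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-cache-ttl"])  # 172800
```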
Cache encryption
The gateway uses AES-GCM to encrypt the cache, using a key derived from your API key. Results are cached for 1 week unless otherwise specified in request headers. This design ensures that the cache is only accessible to you. Braintrust cannot see your data and does not store or log API keys.

Because the cache’s encryption key is derived from your API key, cached results are scoped to an individual user. Braintrust customers can opt into sharing cached results across users within their organization.
Specify an organization
If you’re part of multiple organizations, specify which to use with the x-bt-org-name header:
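For example, a sketch with the standard library (the organization name is illustrative):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",
    "x-bt-org-name": "Acme Inc",  # illustrative organization name
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-org-name"])
```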
Use project-level AI providers
When making gateway requests, your Braintrust API key identifies who is making the request. By default, the gateway uses your organization-level AI provider credentials. To use AI provider credentials configured at the project level, specify the project with the x-bt-project-id header. See Configure API keys for where to set organization-level vs project-level provider keys.
To find your project ID, navigate to your project’s configuration page and use the Copy Project ID button at the bottom of the page.
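A sketch with an illustrative project ID (yours comes from the Copy Project ID button):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",
    # Illustrative UUID; use the ID copied from your project's configuration page.
    "x-bt-project-id": "00000000-0000-0000-0000-000000000000",
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-project-id"])
```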
If you’re part of multiple organizations, you may also need to specify the organization using x-bt-org-name (see Specify an organization). You can send both headers in the same request.

Advanced configuration
Configure gateway behavior with these headers:

- x-bt-use-cache: auto | always | never - Control caching behavior
- x-bt-cache-ttl: Seconds (max 604800) - Set cache TTL
- x-bt-org-name: Organization name - Specify organization for multi-org users
- x-bt-project-id: Project ID - Use project-level AI provider credentials
- x-bt-endpoint-name: Endpoint name - Use a specific configured endpoint
- x-bt-parent: Project/experiment/span - Enable logging to Braintrust
- x-bt-span-metadata: JSON string - Attach metadata to logged spans
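As an illustrative combination of these headers (the project name and metadata keys are made up), note that x-bt-span-metadata takes a JSON-encoded string:

```python
import json
import urllib.request

metadata = {"env": "staging", "feature": "chat-assist"}  # illustrative keys
headers = {
    "Authorization": "Bearer sk-placeholder",
    "x-bt-parent": "project_name:my-gateway-logs",
    "x-bt-span-metadata": json.dumps(metadata),  # must be a JSON string
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-span-metadata"])
```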
Monitor gateway usage
Track gateway usage across your organization:

- Create a project for gateway logs.
- Enable logging by setting the x-bt-parent header when calling the gateway (see Enable logging).
- View logs in the Logs page.
- Create dashboards to track usage, costs, and errors.
Gateway responses also include the x-bt-used-endpoint header, which specifies which of your configured providers was used to complete the request.
Migrate from the AI proxy
If you’re currently using the AI proxy, migrating to the gateway requires only a URL change. All headers, features, and SDK patterns are fully compatible.

- Update the base URL from https://api.braintrust.dev/v1/proxy to https://gateway.braintrust.dev/v1.
- No other code changes are required.
Next steps
- Deploy overview for more deployment options
- Deploy prompts to call versioned prompts through the gateway
- Deploy functions to deploy tools, scorers, and workflows
- Monitor deployments to track production performance and errors
- Manage environments to separate dev, staging, and production