The Braintrust-hosted gateway (https://gateway.braintrust.dev/v1) is currently in beta and free to use. It is designed for production workloads, and uptime is tracked on the Braintrust status page under AI Gateway. Pricing will be announced before general availability.

Quickstart
Use your Braintrust API key to access all AI providers through one interface, whether you call it from the OpenAI, Anthropic, or Gemini SDK, or directly with cURL.
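As a concrete sketch of a gateway request built with only the Python standard library (the model name is illustrative, and actually sending the request requires a valid BRAINTRUST_API_KEY plus a configured provider key):

```python
import json
import os
import urllib.request

# OpenAI-compatible chat completion request pointed at the gateway.
# We only construct the request here; urllib.request.urlopen(req) would send it.
headers = {
    "Authorization": f"Bearer {os.environ.get('BRAINTRUST_API_KEY', 'sk-placeholder')}",
    "Content-Type": "application/json",
}
payload = {
    "model": "gpt-5-mini",  # illustrative; use any model your providers support
    "messages": [{"role": "user", "content": "Hello from the gateway"}],
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers=headers,
    method="POST",
)
print(req.full_url)
```

The same request works unchanged through an OpenAI-compatible SDK by pointing its base URL at https://gateway.braintrust.dev/v1 and using your Braintrust key as the API key.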
Set the base URL to https://gateway.braintrust.dev (no /v1) for the Gemini SDK.

The gateway also lets you mix SDKs and providers: OpenAI SDK → Claude, Anthropic SDK → Gemini, Gemini SDK → GPT.
Use the OpenAI SDK to call Anthropic’s Claude models.
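A minimal sketch of the idea (the model identifier is illustrative; check your configured providers for exact names): the request stays OpenAI-compatible, and only the model field changes.

```python
import json

# Same OpenAI-style request body as a GPT call, but with an Anthropic model
# name; the gateway routes it to Claude. "claude-sonnet-4-5" is illustrative.
payload = {
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Write a haiku about gateways"}],
}
body = json.dumps(payload)
print(payload["model"])
```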
Configure API keys
Configure two things for gateway requests: a Braintrust auth token to call the gateway, and AI provider keys that the gateway uses to run model requests.

Create a Braintrust auth token

Set BRAINTRUST_API_KEY to a Braintrust auth token and pass it in Authorization: Bearer ... when calling the gateway. This value can be either a user API key (sk-) or a service token (bt-st-).

- Use a user API key (sk-) for personal development workflows tied to your user account. Create one in organization API keys.
- Use a service token (bt-st-) for CI/CD pipelines, backend services, and shared automation. Create one in service tokens.
Add AI provider keys
- Organization-level AI providers Add provider API keys in your organization settings under AI providers. These keys are available across all projects and act as the default credentials for gateway requests.
- Project-level AI providers Configure provider API keys at the project level when a project needs separate billing, usage isolation, or different credentials. Project-level keys override organization defaults for requests made in that project’s context. You can also set up providers inline when running playgrounds or prompts. See Use project-level AI providers for how to specify a project when making gateway requests.
Supported providers
Standard providers include:

- OpenAI (GPT-5.2, GPT-5-mini, GPT-5-nano, etc.)
- Anthropic (Claude 4.6 Sonnet, Claude 4.6 Opus, Claude 4.5, etc.)
- Google (Gemini 3.0 Flash, Gemini 3.1 Pro, etc.)
- AWS Bedrock (Claude, Llama, Mistral models)
- Vertex AI (Gemini, Claude models)
- Azure OpenAI (GPT-5.2, GPT-5-mini, GPT-5-nano, etc.)
- Additional providers (Baseten, Fireworks, Groq, Mistral, etc.)
Custom providers
Add custom models or endpoints to use with the gateway. Custom providers support self-hosted models, fine-tuned models, and proprietary AI services. See Custom providers for setup instructions and configuration options. Once you’ve configured a custom provider, call it immediately through the same OpenAI-compatible client using your custom model name.

Enable logging
Log gateway requests to a Braintrust project by setting x-bt-parent with the value of logger.export() or span.export() to pass trace context and nest gateway calls under a parent span. See distributed tracing for details.
You can also set x-bt-parent directly with a prefix like project_id:, project_name:, or experiment_id: if you don’t need full tracing.
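For illustration, a sketch with the standard library, assuming a project named my-gateway-logs (for full tracing, the header value would instead come from logger.export() or span.export()):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",  # your Braintrust auth token
    # Prefix form for simple logging; pass logger.export()/span.export() output
    # here instead to nest the call under a parent span.
    "x-bt-parent": "project_name:my-gateway-logs",  # illustrative project name
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-parent"])
```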
Enable caching
The gateway automatically caches results and reuses them when possible. This is especially useful when developing and frequently re-running or evaluating the same prompts.

This section describes gateway response caching (controlled by x-bt-use-cache, x-bt-cache-ttl, and Cache-Control). This is separate from provider-side caching features such as prompt caching (for example, Anthropic’s cache_control) or any caching a provider may perform automatically.

- If the gateway serves a response from its cache, it returns the cached response without contacting the provider, so provider-side caching does not apply for that request.
- If the gateway forwards the request to the provider, you can still use provider-specific caching parameters/headers. The gateway does not translate provider caching settings across providers.

To guarantee that every request reaches the provider, set x-bt-use-cache: never.

Cache modes
There are three caching modes: auto (default), always, and never.

- In auto mode, requests are cached if they have temperature=0 or the seed parameter set and they are one of the supported paths.
- In always mode, requests are cached as long as they are one of the supported paths.
- In never mode, the cache is never read or written to.

The supported paths are /auto, /embeddings, /chat/completions, /completions, and /moderations.

Set the cache mode per request by passing the x-bt-use-cache header. The gateway returns x-bt-cached: HIT or MISS to indicate cache status.
Cache TTL
By default, cached results expire after 1 week. Set the TTL for individual requests by passing the x-bt-cache-ttl header. The TTL is specified in seconds and must be between 1 and 604800 (7 days).
Cache control
The gateway supports a limited set of Cache-Control directives:

- To bypass the cache, set the Cache-Control header to no-cache, no-store. This is semantically equivalent to setting the x-bt-use-cache header to never.
- To force a fresh request, set the Cache-Control header to no-cache. Without the no-store directive, the response will be cached for subsequent requests.
- To request a cached response with a maximum age, set the Cache-Control header to max-age=<seconds>. If the cached data is older than the specified age, the cache will be bypassed and a new response will be generated. Combine this with no-store to bypass the cache for a request without overwriting the current cached response.

If you set both Cache-Control directives and the x-bt-use-cache header, the cache control directives take precedence.
The gateway returns the x-bt-cached header in the response with HIT or MISS to indicate whether the response was served from the cache, the Age header to indicate the age of the cached response, and the Cache-Control header with the max-age directive to return the TTL of the cached response.
For example, to set the cache mode to always with a TTL of 2 days:
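A sketch of those headers with the standard library (the endpoint and auth value are placeholders; 2 days is 172800 seconds):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",
    "x-bt-use-cache": "always",               # cache regardless of temperature/seed
    "x-bt-cache-ttl": str(2 * 24 * 60 * 60),  # 2 days = 172800 seconds
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-cache-ttl"])  # 172800
```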
Cache encryption
The gateway uses AES-GCM to encrypt the cache, using a key derived from your API key. Results are cached for 1 week unless otherwise specified in request headers. This design ensures that the cache is only accessible to you. Braintrust cannot see your data and does not store or log API keys.

Because the cache’s encryption key is derived from your API key, cached results are scoped to an individual user. Braintrust customers can opt into sharing cached results across users within their organization.
Specify an organization
If you’re part of multiple organizations, specify which to use with the x-bt-org-name header:
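For example, a sketch with the standard library (the organization name is illustrative):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",
    "x-bt-org-name": "Acme Inc",  # illustrative organization name
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-org-name"])
```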
Use project-level AI providers
When making gateway requests, your Braintrust API key identifies who is making the request. By default, the gateway uses your organization-level AI provider credentials. To use AI provider credentials configured at the project level, specify the project with the x-bt-project-id header. See Configure API keys for where to set organization-level vs project-level provider keys.
To find your project ID, navigate to your project’s configuration page and use the Copy Project ID button at the bottom of the page.
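A sketch with an illustrative project ID (yours comes from the Copy Project ID button):

```python
import urllib.request

headers = {
    "Authorization": "Bearer sk-placeholder",
    # Illustrative UUID; use the ID copied from your project's configuration page.
    "x-bt-project-id": "00000000-0000-0000-0000-000000000000",
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-project-id"])
```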
If you’re part of multiple organizations, you may also need to specify the organization using x-bt-org-name (see Specify an organization). You can send both headers in the same request.

Advanced configuration
Configure gateway behavior with these headers:

- x-bt-use-cache: auto | always | never - Control caching behavior
- x-bt-cache-ttl: Seconds (max 604800) - Set cache TTL
- x-bt-org-name: Organization name - Specify organization for multi-org users
- x-bt-project-id: Project ID - Use project-level AI provider credentials
- x-bt-endpoint-name: Endpoint name - Use a specific configured endpoint
- x-bt-parent: Project/experiment/span - Enable logging to Braintrust
- x-bt-span-metadata: JSON string - Attach metadata to logged spans
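As an illustrative combination of these headers (the project name and metadata keys are made up), note that x-bt-span-metadata takes a JSON-encoded string:

```python
import json
import urllib.request

metadata = {"env": "staging", "feature": "chat-assist"}  # illustrative keys
headers = {
    "Authorization": "Bearer sk-placeholder",
    "x-bt-parent": "project_name:my-gateway-logs",
    "x-bt-span-metadata": json.dumps(metadata),  # must be a JSON string
}
req = urllib.request.Request(
    "https://gateway.braintrust.dev/v1/chat/completions",
    headers=headers,
    method="POST",
)
print(headers["x-bt-span-metadata"])
```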
Monitor gateway usage
Track gateway usage across your organization:

- Create a project for gateway logs.
- Enable logging by setting the x-bt-parent header when calling the gateway (see Enable logging).
- View logs in the Logs page.
- Create dashboards to track usage, costs, and errors.
Gateway responses also include the x-bt-used-endpoint header, which specifies which of your configured providers was used to complete the request.
Migrate from the AI proxy
If you’re currently using the AI proxy, migrating to the gateway requires only a URL change. All headers, features, and SDK patterns are fully compatible.

- Update the base URL from https://api.braintrust.dev/v1/proxy to https://gateway.braintrust.dev/v1.
- No other code changes are required.
Next steps
- Deploy overview for more deployment options
- Deploy prompts to call versioned prompts through the gateway
- Deploy functions to deploy tools, scorers, and workflows
- Monitor deployments to track production performance and errors
- Manage environments to separate dev, staging, and production