How to make requests to Gemini using the OpenAI SDK

20 March 2026 · Braintrust Team
TL;DR

Braintrust's unified AI Gateway lets you call Google's Gemini models using the OpenAI SDK. Point your OpenAI client at https://gateway.braintrust.dev/v1, authenticate with your Braintrust API key, and pass a Gemini model name such as gemini-2.5-pro in the model parameter. Braintrust translates the OpenAI request format into Gemini's API format and returns the response in OpenAI's standard structure, so your existing code continues to work without SDK-level changes.

Why call Gemini from the OpenAI SDK and how Braintrust helps

Most engineering teams build their LLM stack around the OpenAI SDK because it became the default client library for LLM APIs early on. Many tutorials, developer tools, and application frameworks still assume OpenAI's request format, so many production codebases already depend on that interface.

Gemini models can be a better fit for some workloads: they offer long context windows, multimodal support for inputs such as images, video, and audio, and competitive pricing for high-volume use cases. For tasks such as long-document summarization, codebase analysis, or visual content processing, you can use Gemini without replacing the SDK patterns your application already relies on.

Switching from OpenAI mid-project to access Gemini means introducing a different SDK, changing the client setup, adapting requests to a different API structure, and retesting application logic built around OpenAI's interface. The provider switch creates migration work, especially when the main goal is simply to compare models or route some workloads to Gemini.

Braintrust's AI Gateway removes model-switching integration overhead by keeping the OpenAI SDK configuration and request pattern already in use while routing requests to Gemini through the same client. Instead of rebuilding the application around a second provider-specific implementation, developers can test and adopt Gemini inside the interface that their codebase already supports.

How to make requests to Gemini using the OpenAI SDK through Braintrust

The Braintrust AI Gateway sits between your application and the model provider, converting OpenAI-formatted requests into Gemini's format and returning Gemini's responses in the OpenAI structure your client already expects. The following steps walk through the full setup.

Prerequisites

1. Create a Braintrust account and generate an API key: Sign up at Braintrust, navigate to Settings > Organization > API keys, and click + API key. Enter a name for the key and click Create. Copy the key immediately, as it won't be shown again. The key (prefixed sk-) authenticates your requests to Braintrust's AI Gateway.

For production workloads such as CI/CD pipelines or backend services, create a service token by going to Settings > Organization > Service tokens and clicking + Service token. Service tokens use the bt-st- prefix and work anywhere API keys are accepted, but they are tied to a service account rather than an individual user.
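The two credential types are easy to tell apart in code. As an illustration (the sk- and bt-st- prefixes come from the steps above; the helper itself is hypothetical and not part of any Braintrust SDK), a small function can classify a key before it is used:

```typescript
// Hypothetical helper: classify a Braintrust credential by its prefix.
// "sk-" (personal API key) and "bt-st-" (service token) are the documented
// prefixes; this function is illustrative, not part of any SDK.
function classifyBraintrustKey(key: string): "api-key" | "service-token" | "unknown" {
  if (key.startsWith("bt-st-")) return "service-token";
  if (key.startsWith("sk-")) return "api-key";
  return "unknown";
}
```

Both credential types work anywhere the gateway accepts an apiKey, so the distinction mainly matters for auditing which identity made a request.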

2. Add your Google Gemini API key to Braintrust: Navigate to Settings > Organization > AI providers in Braintrust. Click the Google/Gemini provider section and enter your Gemini API key, then click Save. The gateway uses the stored key to authenticate with Google on your behalf, so your Gemini credentials never need to appear in your application code. Braintrust encrypts all provider API keys using AES-256 with unique keys and nonces rather than storing them as plaintext.

3. Install the OpenAI SDK: Run npm install openai for TypeScript/JavaScript or pip install openai for Python.

Calling Gemini with the OpenAI SDK through Braintrust

Using Gemini through Braintrust requires only a small change to your existing OpenAI client setup. The baseURL points to Braintrust's gateway instead of OpenAI's API, while the apiKey uses your Braintrust API key to authenticate the request and look up your stored Gemini credentials. The model parameter is set to gemini-2.5-pro, and the gateway routes the request to Google accordingly. Because the Gemini response arrives in the standard OpenAI format, response.choices[0].message.content returns Gemini's output, and your existing response parsing logic works without modification.

typescript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gateway.braintrust.dev/v1",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

// Call Google's Gemini using the OpenAI SDK
const response = await client.chat.completions.create({
  model: "gemini-2.5-pro",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

Here is the equivalent in Python:

python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.braintrust.dev/v1",
    api_key=os.environ["BRAINTRUST_API_KEY"],
)

# Call Google's Gemini using the OpenAI SDK
response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)

Enabling logging and caching

Logging

Logging records every Gemini request inside a Braintrust project. To attach a request to a trace, set the x-bt-parent header to the value returned by span.export().

typescript


import OpenAI from "openai";
import { initLogger } from "braintrust";

const logger = initLogger({ projectName: "My Project" });

await logger.traced(async (span) => {
  const client = new OpenAI({
    baseURL: "https://gateway.braintrust.dev/v1",
    apiKey: process.env.BRAINTRUST_API_KEY,
  });

  const response = await client.chat.completions.create(
    {
      model: "gemini-2.5-pro",
      messages: [{ role: "user", content: "Hello!" }],
    },
    {
      headers: {
        "x-bt-parent": await span.export(),
      },
    },
  );

  console.log(response.choices[0].message.content);
});

Each logged request captures latency, token count, cost, and the full input/output pair. These traces appear in your project's logs page in Braintrust, where you can filter them by model, time range, or custom metadata.

Caching

Caching stores responses and reuses them when the same prompt is sent again. Set the x-bt-use-cache header to always to enable caching for all requests.

typescript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gateway.braintrust.dev/v1",
  defaultHeaders: {
    "x-bt-use-cache": "always",
  },
  apiKey: process.env.BRAINTRUST_API_KEY,
});

Cached responses are encrypted with AES-GCM using a key derived from your API key, so only you can access your cached data. The default cache TTL is one week, though you can customize it per request using the x-bt-cache-ttl header. Caching is especially useful during prompt development and evaluation cycles, where you repeatedly test the same inputs and want fast, cost-free responses.
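For a per-request TTL, the x-bt-cache-ttl header can be attached through the SDK's request options. A minimal sketch (the header names appear above; expressing the TTL as a number of seconds is an assumption, so check the gateway docs for the exact unit):

```typescript
// Sketch: build per-request cache headers for the Braintrust gateway.
// "x-bt-use-cache" and "x-bt-cache-ttl" are the documented header names;
// encoding the TTL in seconds is an assumption to verify against the docs.
function cacheHeaders(ttlSeconds: number): Record<string, string> {
  return {
    "x-bt-use-cache": "always",
    "x-bt-cache-ttl": String(ttlSeconds),
  };
}

// Usage (second argument is the OpenAI SDK's per-request options object):
// await client.chat.completions.create(params, { headers: cacheHeaders(24 * 60 * 60) });
```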

Use the Braintrust AI Gateway to standardize on one SDK across providers

The Gemini-through-OpenAI-SDK setup described in this guide is one example of how the Braintrust AI Gateway simplifies multi-provider access. The gateway lets any supported SDK call models from any supported provider, so teams can standardize on one SDK across the codebase while still reaching every available model.

Braintrust supports direct integrations with all major model providers, including:

LLM providers: OpenAI, Anthropic, Gemini, Mistral, Groq, Fireworks, Together, xAI, Perplexity, Replicate, Cerebras, Baseten, and Lepton.

Cloud platform providers: AWS Bedrock, Vertex AI, Azure OpenAI, and Databricks.

If you need models not on the list, add custom providers through Braintrust's custom provider configuration, which supports self-hosted models, fine-tuned models, and proprietary AI endpoints.

Braintrust's unified LLM gateway accepts requests from the OpenAI SDK, the Anthropic SDK, and the Google Gemini SDK. You can keep the SDK that already fits your application, point it to Braintrust's gateway URL, and route requests to any supported provider by changing the model name. All provider API keys live in Braintrust's organization settings rather than in application code, so adding a new provider to the stack takes a few clicks in the Braintrust dashboard.
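Because only the model name changes between providers, the routing decision can live in one small function. A sketch of that pattern (gemini-2.5-pro appears in this guide; any other model identifier below is a placeholder, so substitute the names your Braintrust organization has configured):

```typescript
// Sketch: one set of request params works for any provider behind the
// gateway; only the model string changes. Model names other than
// "gemini-2.5-pro" are placeholder examples.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function chatParamsFor(model: string, prompt: string) {
  return {
    model,
    messages: [{ role: "user", content: prompt }] as ChatMessage[],
  };
}

// Same gateway client, different providers:
// await client.chat.completions.create(chatParamsFor("gemini-2.5-pro", "Hello!"));
// await client.chat.completions.create(chatParamsFor("claude-sonnet-4", "Hello!")); // placeholder name
```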

Want to access multiple model providers without maintaining separate SDK integrations? Start free with Braintrust or schedule a demo.

FAQs

Do I need separate API keys for OpenAI and Gemini when using the Braintrust AI Gateway?

Your application code only needs a Braintrust API key. Provider-specific keys, such as your OpenAI key or Google Gemini key, are stored in Braintrust's organization settings under AI Providers, so they stay out of your codebase while Braintrust handles provider routing at the gateway level.

Can I compare OpenAI and Gemini model outputs using Braintrust?

With logging enabled in Braintrust, you can run the same prompts against both providers and compare the results in Braintrust's experiment comparison view. The view makes it easier to evaluate output quality, latency, and cost side by side without building a separate comparison workflow in your application code.

How do I monitor costs for OpenAI calls vs. Gemini calls?

Enable logging to a Braintrust project, and Braintrust captures token usage and per-request cost data for each model invocation. You can then break the data down by model and provider in Braintrust's dashboards to compare OpenAI and Gemini spend over time.