Google’s Gemini models include Gemini 2.0 Flash, Gemini 2.5 Pro, and other advanced multimodal language models. Braintrust integrates seamlessly with Gemini through direct API access, wrapper functions for automatic tracing, and proxy support.

Setup

To use Gemini models, configure your Gemini API key in Braintrust.
  1. Get a Gemini API key from Google AI Studio.
  2. Add the Gemini API key to your organization’s AI providers.
  3. Set the Gemini API key and your Braintrust API key as environment variables:
.env
GEMINI_API_KEY=<your-gemini-api-key>
BRAINTRUST_API_KEY=<your-braintrust-api-key>

# If you are self-hosting Braintrust, set the URL of your hosted dataplane
# BRAINTRUST_API_URL=<your-braintrust-api-url-here>
API keys are encrypted using 256-bit AES-GCM encryption and are not stored or logged by Braintrust.
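If you keep these values in a .env file, you can load them at startup with a package like dotenv (an assumption; any environment loader works) and fail fast when a key is missing:
import "dotenv/config"; // loads .env into process.env (assumes the dotenv package is installed)

for (const name of ["GEMINI_API_KEY", "BRAINTRUST_API_KEY"]) {
  if (!process.env[name]) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
}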

Trace with Gemini

Trace your Gemini LLM calls for observability and monitoring using either the native Google GenAI SDK or the Braintrust AI proxy.

Trace automatically with native Google GenAI SDK

Braintrust provides wrapper functions that automatically log Google GenAI API calls. Once the client is wrapped, every subsequent API call is traced automatically.
These wrapper functions are convenience functions that integrate the Braintrust logger with the Google GenAI client. For more control, see the manual wrapping section below.
Install the required packages:
pnpm add braintrust @google/genai
Then wrap the Google GenAI client:
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

// Initialize Braintrust tracing
initLogger({ projectName: "My Project" });

// Use wrapGoogleGenAI to wrap the Google GenAI module for automatic tracing
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

// Create a native Google GenAI client
const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// All API calls are automatically logged
const response = await client.models.generateContent({
  model: "gemini-2.0-flash-001",
  contents: "What is machine learning?",
  config: {
    maxOutputTokens: 100,
  },
});
console.log(response.text);

Stream responses with native Google GenAI SDK

The native Google GenAI client supports streaming with automatic tracing of token metrics.
const stream = await client.models.generateContentStream({
  model: "gemini-2.0-flash-001",
  contents: "Count from 1 to 10 slowly.",
  config: {
    maxOutputTokens: 200,
  },
});

// All streaming chunks are automatically logged
for await (const chunk of stream) {
  if (chunk.text) {
    process.stdout.write(chunk.text);
  }
}

Manual wrapping for more control

If you need more control over when tracing is enabled, you can manually wrap the client.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

initLogger({ projectName: "My Project" });

// Wrap only when needed
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

const response = await client.models.generateContent({
  model: "gemini-2.0-flash-001",
  contents: "Hello, world!",
});

console.log(response.text);

Use Gemini with Braintrust AI proxy

The Braintrust AI Proxy allows you to access Gemini models through a unified OpenAI-compatible interface. Install the braintrust and openai packages:
pnpm add braintrust openai
Then, initialize the client and make a request to a Gemini model via the Braintrust AI Proxy.
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://api.braintrust.dev/v1/proxy",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gemini-2.0-flash",
  messages: [{ role: "user", content: "Hello, world!" }],
});

Trace AI proxy calls

When using the Braintrust AI Proxy, wrap the OpenAI client with wrapOpenAI so that API calls are automatically logged to the specified project.
import { OpenAI } from "openai";
import { initLogger, wrapOpenAI } from "braintrust";

initLogger({
  projectName: "My Project",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const client = wrapOpenAI(
  new OpenAI({
    baseURL: "https://api.braintrust.dev/v1/proxy",
    apiKey: process.env.BRAINTRUST_API_KEY,
  }),
);

// All API calls are automatically logged
const result = await client.chat.completions.create({
  model: "gemini-2.0-flash",
  messages: [{ role: "user", content: "What is machine learning?" }],
});

Stream with proxy

Gemini models support streaming through the proxy.
const stream = await client.chat.completions.create({
  model: "gemini-2.0-flash",
  messages: [{ role: "user", content: "Count to 10" }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Evaluate with Gemini

Evaluations distill the non-deterministic outputs of Gemini models into an effective feedback loop that enables you to ship more reliable, higher-quality products. A Braintrust Eval is a simple function composed of a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the Experiments guide.

Evaluate with native SDK

import * as googleGenAI from "@google/genai";
import { Eval, wrapGoogleGenAI, initLogger } from "braintrust";

// Setup tracing
initLogger({ projectName: "Gemini Evaluation" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

Eval("Gemini Native Evaluation", {
  data: () => [
    { input: "What is 2+2?", expected: "4" },
    { input: "What is the capital of France?", expected: "Paris" },
  ],
  task: async (input) => {
    const response = await client.models.generateContent({
      model: "gemini-2.0-flash-001",
      contents: input,
      config: {
        maxOutputTokens: 100,
      },
    });
    return response.text ?? "";
  },
  scores: [
    {
      name: "accuracy",
      scorer: (args) => (args.output === args.expected ? 1 : 0),
    },
  ],
});

Evaluate with proxy

import { Eval } from "braintrust";
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://api.braintrust.dev/v1/proxy",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

Eval("Gemini Evaluation", {
  data: () => [
    { input: "What is 2+2?", expected: "4" },
    { input: "What is the capital of France?", expected: "Paris" },
  ],
  task: async (input) => {
    const response = await client.chat.completions.create({
      model: "gemini-2.0-flash",
      messages: [{ role: "user", content: input }],
    });
    return response.choices[0].message.content ?? "";
  },
  scores: [
    {
      name: "accuracy",
      scorer: (args) => (args.output === args.expected ? 1 : 0),
    },
  ],
});
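
The inline accuracy scorer above uses strict string equality, which penalizes answers like "The capital of France is Paris." For fuzzier matching, the autoevals package (installed separately with pnpm add autoevals) provides ready-made scorers such as Levenshtein that drop into the scores array:
import { Eval } from "braintrust";
import { Levenshtein } from "autoevals";

// Reuses the proxy client defined above
Eval("Gemini Evaluation (fuzzy match)", {
  data: () => [{ input: "What is the capital of France?", expected: "Paris" }],
  task: async (input) => {
    const response = await client.chat.completions.create({
      model: "gemini-2.0-flash",
      messages: [{ role: "user", content: input }],
    });
    return response.choices[0].message.content ?? "";
  },
  // Levenshtein scores the string similarity between output and expected
  scores: [Levenshtein],
});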

Additional features

Reasoning models

Gemini’s reasoning models like gemini-2.0-flash-thinking-exp-1219 provide detailed thought processes before generating responses. The wrapper automatically captures both the reasoning tokens and the final response.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

// Setup automatic tracing
initLogger({ projectName: "My Project" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// Use reasoning model - reasoning tokens are automatically tracked
const response = await client.models.generateContent({
  model: "gemini-2.0-flash-thinking-exp-1219",
  contents: "What is the derivative of x^2 + 3x + 5? Think step by step.",
  config: {
    maxOutputTokens: 1000,
  },
});

// The response includes both the reasoning and final answer
console.log(response.text);

// Metrics automatically include reasoning tokens
// The wrapper captures completion_reasoning_tokens in the metrics
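
If you also want to inspect token counts directly on the response object, the SDK reports them on usageMetadata. A minimal sketch (the field names below reflect my understanding of the current @google/genai SDK; verify them against your installed version):
// Token usage reported by the API (assumed usageMetadata field names)
const usage = response.usageMetadata;
console.log("prompt tokens:", usage?.promptTokenCount);
console.log("output tokens:", usage?.candidatesTokenCount);
console.log("reasoning tokens:", usage?.thoughtsTokenCount);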

Structured outputs

Gemini supports structured JSON outputs using response schemas.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

// Setup automatic tracing
initLogger({ projectName: "My Project" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// Define a schema for the response
interface Person {
  name: string;
  age: number;
  occupation: string;
}

const response = await client.models.generateContent({
  model: "gemini-1.5-flash",
  contents:
    "Extract information about: John Smith is a 30-year-old software engineer.",
  config: {
    responseMimeType: "application/json",
    responseSchema: {
      type: "object",
      properties: {
        name: { type: "string" },
        age: { type: "number" },
        occupation: { type: "string" },
      },
      required: ["name", "age", "occupation"],
    },
    maxOutputTokens: 200,
  },
});

// Parse the JSON response
const personData: Person = JSON.parse(response.text ?? "{}");
console.log(`Name: ${personData.name}, Age: ${personData.age}`);
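
Because responseSchema constrains the model's output but the parsed value is still untyped at runtime, a small type guard can catch malformed responses before you use them. A minimal sketch with no external libraries:
// Runtime check that parsed JSON actually matches the Person shape
function isPerson(value: unknown): value is Person {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Partial<Person>;
  return (
    typeof v.name === "string" &&
    typeof v.age === "number" &&
    typeof v.occupation === "string"
  );
}

const parsed: unknown = JSON.parse(response.text ?? "{}");
if (!isPerson(parsed)) {
  throw new Error("Model output did not match the expected schema");
}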

Function calling and tools

Gemini supports function calling for building AI agents with tools.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

// Setup automatic tracing
initLogger({ projectName: "My Project" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// Define functions for the model to call
function getWeather(location: string, unit: string = "celsius"): string {
  // In a real app, this would call a weather API
  return `22 degrees ${unit} and sunny in ${location}`;
}

function searchWeb(query: string): string {
  return `Search results for: ${query}`;
}

// Define function declarations
const tools = [
  {
    functionDeclarations: [
      {
        name: "get_weather",
        description: "Get the current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA",
            },
            unit: {
              type: "string",
              enum: ["celsius", "fahrenheit"],
              description: "The unit of temperature",
            },
          },
          required: ["location"],
        },
      },
      {
        name: "search_web",
        description: "Search the web for information",
        parameters: {
          type: "object",
          properties: {
            query: {
              type: "string",
              description: "The search query",
            },
          },
          required: ["query"],
        },
      },
    ],
  },
];

// Generate with tools
const response = await client.models.generateContent({
  model: "gemini-1.5-flash",
  contents:
    "What's the weather in Paris and what tourist sites should I visit?",
  config: {
    tools: tools,
    maxOutputTokens: 500,
  },
});

// Handle function calls
const parts = response.candidates?.[0]?.content?.parts ?? [];
for (const part of parts) {
  if (part.functionCall) {
    const fc = part.functionCall;
    console.log(`Function: ${fc.name}`);
    console.log(`Arguments: ${JSON.stringify(fc.args)}`);

    // Execute the function
    if (fc.name === "get_weather") {
      const { location, unit } = fc.args as { location: string; unit?: string };
      const result = getWeather(location, unit);
      // Send the result back to the model for a final response (see below)
    }
  }
}
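
To complete the tool loop, you typically append the model's function-call turn and your function result to the conversation, then call generateContent again so the model can compose a final answer. A minimal sketch of that second call, reusing client, tools, fc, and result from above (the exact content shapes may vary across SDK versions):
// Send the tool result back as a functionResponse part (assumed shape)
const followUp = await client.models.generateContent({
  model: "gemini-1.5-flash",
  contents: [
    {
      role: "user",
      parts: [
        { text: "What's the weather in Paris and what tourist sites should I visit?" },
      ],
    },
    // The model turn that contained the functionCall
    response.candidates![0].content!,
    {
      role: "user",
      parts: [{ functionResponse: { name: fc.name, response: { result } } }],
    },
  ],
  config: { tools, maxOutputTokens: 500 },
});

console.log(followUp.text);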

Multimodal content

Gemini models support multimodal inputs including images, audio, and video.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";
import * as fs from "fs";

// Setup automatic tracing
initLogger({ projectName: "My Project" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// Image analysis
const imageData = fs.readFileSync("image.jpg");

const response = await client.models.generateContent({
  model: "gemini-1.5-flash",
  contents: [
    { text: "What's in this image?" },
    {
      inlineData: {
        mimeType: "image/jpeg",
        data: imageData.toString("base64"),
      },
    },
  ],
});

// Audio transcription
const audioData = fs.readFileSync("audio.mp3");

const audioResponse = await client.models.generateContent({
  model: "gemini-1.5-flash",
  contents: [
    { text: "Transcribe this audio:" },
    {
      inlineData: {
        mimeType: "audio/mp3",
        data: audioData.toString("base64"),
      },
    },
  ],
});

// The wrapper automatically handles binary data serialization
// Binary attachments are converted to Braintrust Attachment objects

Streaming with token metrics

Stream responses with automatic token tracking.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

// Setup automatic tracing
initLogger({ projectName: "My Project" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// Stream responses - automatically tracked
const stream = await client.models.generateContentStream({
  model: "gemini-1.5-flash",
  contents: "Write a story about a robot learning to paint.",
  config: {
    maxOutputTokens: 500,
  },
});

// Streaming automatically tracks:
// - time_to_first_token
// - prompt_tokens, completion_tokens, total_tokens
// - prompt_cached_tokens (if using caching)
for await (const chunk of stream) {
  if (chunk.text) {
    process.stdout.write(chunk.text);
  }
}

Context caching

Gemini supports context caching for efficient reuse of large contexts.
import * as googleGenAI from "@google/genai";
import { wrapGoogleGenAI, initLogger } from "braintrust";

// Setup automatic tracing
initLogger({ projectName: "My Project" });
const { GoogleGenAI } = wrapGoogleGenAI(googleGenAI);

const client = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY || "",
});

// Create a cache for a large document
const documentContent = "... very long document content ...";

// Note: the caching API requires the full Vertex AI SDK. This example
// shows the request structure; refer to Google's documentation for a
// complete caching implementation.

const response = await client.models.generateContent({
  model: "gemini-1.5-flash",
  contents: "Summarize the key points from the document",
  config: {
    // cachedContent would be configured here
    maxOutputTokens: 500,
  },
});

// The wrapper tracks cached tokens in metrics
// Look for prompt_cached_tokens in the logged metrics

Reasoning through the proxy

You can also enable reasoning for Gemini models through the Braintrust AI Proxy by passing the reasoning_enabled and reasoning_budget parameters.
import { OpenAI } from "openai";
import "@braintrust/proxy/types"; // for type safety

const client = new OpenAI({
  baseURL: "https://api.braintrust.dev/v1/proxy",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gemini-2.5-flash-preview-05-20",
  reasoning_enabled: true,
  reasoning_budget: 1024,
  messages: [{ role: "user", content: "How many rs in 'ferrocarril'?" }],
});

console.log(response.choices[0].reasoning); // Access reasoning steps

Error handling, attachments, and masking sensitive data

To learn more about multimodal support, attachments, error handling, and masking sensitive data with Gemini, visit the customize traces guide.