Reasoning models like OpenAI’s o4-mini, Anthropic’s Claude Sonnet 4.5, and Google’s Gemini 2.5 Flash generate intermediate thinking steps before producing final responses. Braintrust provides unified support for these models across providers.
Hybrid deployments require v0.0.74 or later for reasoning support.

Configure reasoning

Three parameters control reasoning behavior:
  • reasoning_effort: Intensity of reasoning (low, medium, or high). Compatible with OpenAI’s parameter.
  • reasoning_enabled: Boolean that explicitly enables or disables reasoning output. Has no effect on OpenAI models, which default to medium.
  • reasoning_budget: Token budget for reasoning. Use either reasoning_effort or reasoning_budget, not both (see the sketch after this list).
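For example, here is a minimal sketch of a budget-based request. The budget value and prompt are illustrative; everything else mirrors the basic usage example below:

import { OpenAI } from "openai";
import "@braintrust/proxy/types";

const openai = new OpenAI({
  baseURL: `${process.env.BRAINTRUST_API_URL || "https://api.braintrust.dev"}/v1/proxy`,
  apiKey: process.env.BRAINTRUST_API_KEY,
});

// Cap thinking at roughly 2,000 tokens rather than naming an effort level.
// Note that reasoning_effort is intentionally omitted; the two are mutually exclusive.
const response = await openai.chat.completions.create({
  model: "claude-sonnet-4-5-20250929",
  reasoning_budget: 2000,
  messages: [{ role: "user", content: "What's 15% of 240?" }],
});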

Use in code

Braintrust provides type augmentation for reasoning parameters:
  • TypeScript: @braintrust/proxy/types extends OpenAI SDK types
  • Python: braintrust-proxy provides casting utilities and type-safe helpers

Basic usage

import { OpenAI } from "openai";
import "@braintrust/proxy/types";

const openai = new OpenAI({
  baseURL: `${process.env.BRAINTRUST_API_URL || "https://api.braintrust.dev"}/v1/proxy`,
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const response = await openai.chat.completions.create({
  model: "claude-sonnet-4-5-20250929",
  reasoning_effort: "medium",
  messages: [
    {
      role: "user",
      content: "What's 15% of 240?",
    },
  ],
});

// Access final response
console.log(response.choices[0].message.content);
// Output: "15% of 240 is 36."

// Access reasoning steps
console.log(response.choices[0].reasoning);
// Output: Array of reasoning objects with step-by-step calculation

Reasoning structure

Reasoning steps include unique IDs and content:
[
  {
    "id": "reasoning_step_1",
    "content": "I need to calculate 15% of 240..."
  },
  {
    "id": "reasoning_step_2",
    "content": "240 × 0.15 = 36..."
  }
]
The id field contains provider-specific signatures that must be preserved in multi-turn conversations. Always use exact IDs returned by the provider.
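In TypeScript terms, each step matches a simple shape. This is a sketch inferred from the example above; the augmented types from @braintrust/proxy/types are the source of truth:

interface ReasoningStep {
  id: string; // provider-specific signature; pass back verbatim in multi-turn requests
  content: string; // the model's intermediate thinking for this step
}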

Stream reasoning

Reasoning streams through delta.reasoning in streaming responses:
import { OpenAI } from "openai";
import "@braintrust/proxy/types";

const openai = new OpenAI({
  baseURL: `${process.env.BRAINTRUST_API_URL || "https://api.braintrust.dev"}/v1/proxy`,
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const stream = await openai.chat.completions.create({
  model: "claude-sonnet-4-5-20250929",
  reasoning_effort: "high",
  stream: true,
  messages: [
    {
      role: "user",
      content: "Explain quantum entanglement in simple terms.",
    },
  ],
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta;

  // Handle regular content
  if (delta?.content) {
    process.stdout.write(delta.content);
  }

  // Handle reasoning deltas
  if (delta?.reasoning) {
    console.log("\nReasoning step:", delta.reasoning);
  }
}
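To reuse streamed reasoning in a later turn, accumulate the deltas as they arrive. Here is a minimal sketch that replaces the loop above, assuming each entry in delta.reasoning carries the same { id, content } shape as the non-streaming response (chunking granularity may vary by provider):

// Collect reasoning text per step id while streaming.
const reasoningById = new Map<string, string>();

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta;

  if (delta?.content) {
    process.stdout.write(delta.content);
  }

  for (const step of delta?.reasoning ?? []) {
    // Append each partial chunk to its step, keyed by the provider-issued id.
    reasoningById.set(step.id, (reasoningById.get(step.id) ?? "") + step.content);
  }
}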

Multi-turn conversations

Include reasoning from previous turns to let models build on earlier thinking:
import { OpenAI } from "openai";
import "@braintrust/proxy/types";

const openai = new OpenAI({
  baseURL: `${process.env.BRAINTRUST_API_URL || "https://api.braintrust.dev"}/v1/proxy`,
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const firstResponse = await openai.chat.completions.create({
  model: "claude-sonnet-4-5-20250929",
  reasoning_effort: "medium",
  messages: [
    {
      role: "user",
      content: "What's the best approach to solve a complex math problem?",
    },
  ],
});

// Include previous reasoning in next turn
const secondResponse = await openai.chat.completions.create({
  model: "claude-sonnet-4-5-20250929",
  reasoning_effort: "medium",
  messages: [
    {
      role: "user",
      content: "What's the best approach to solve a complex math problem?",
    },
    {
      role: "assistant",
      content: firstResponse.choices[0].message.content,
      reasoning: firstResponse.choices[0].reasoning,
    },
    {
      role: "user",
      content: "Now apply that approach to solve: 2x² + 5x - 3 = 0",
    },
  ],
});
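Note that the assistant message in the second request passes reasoning back verbatim, including the provider-issued ids from firstResponse, as required above.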

Test in playgrounds

Use playgrounds to test reasoning models interactively:
  1. Select a reasoning-capable model
  2. Set reasoning_effort in parameters
  3. Run evaluations
  4. View reasoning steps in trace view
Reasoning steps appear as separate spans in the trace, making it easy to understand the model’s thinking process.

Evaluate reasoning quality

Create scorers that evaluate both final outputs and reasoning steps:
import * as braintrust from "braintrust";

// The project name is illustrative; point this at your own project.
const project = braintrust.projects.create({ name: "reasoning-demo" });

project.scorers.create({
  name: "Reasoning quality",
  slug: "reasoning-quality",
  messages: [
    {
      role: "user",
      content:
        'Evaluate the reasoning steps: {{reasoning}}\n\nAre they logical and complete? Return "A" for excellent, "B" for adequate, "C" for poor.',
    },
  ],
  model: "gpt-4o",
  choiceScores: {
    A: 1,
    B: 0.5,
    C: 0,
  },
});
This helps you understand whether models are using sound reasoning paths to reach conclusions.
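Scorers defined in code like this are uploaded with the Braintrust CLI before they appear in your project (for example, npx braintrust push ./scorer.ts, where the file path is illustrative).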
