Reasoning
If you are on a hybrid deployment, reasoning support is available starting with v0.0.74.
Reasoning models like OpenAI’s o4-mini, Anthropic’s Claude 3.7 Sonnet, and Google’s Gemini 2.5 Flash generate intermediate reasoning steps before producing a final response. Braintrust provides unified support for these models, so you can work with reasoning outputs no matter which provider you choose.
You can use reasoning models both in playgrounds and programmatically through the SDK.
Parameters
Three parameters control reasoning behavior:
- `reasoning_effort`: Controls the intensity of reasoning (compatible with OpenAI's parameter). The value can be set to `low`, `medium`, or `high`.
- `reasoning_enabled`: A boolean flag to explicitly enable or disable reasoning output. Note: this parameter has no effect when using OpenAI models, which default to `medium` reasoning effort unless you specify otherwise.
- `reasoning_budget`: Specifies a token budget for the reasoning process. You must provide either `reasoning_effort` or `reasoning_budget`, not both (see the sketch below).
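For example, a request body sets one of the two mutually exclusive options. This is a minimal sketch: the model name, message, and values are illustrative, not recommendations.

```typescript
// Two mutually exclusive ways to configure reasoning (illustrative values).
const withEffort = {
  model: "claude-3-7-sonnet-latest", // example model name
  messages: [{ role: "user", content: "How many primes are there below 100?" }],
  reasoning_effort: "high", // "low" | "medium" | "high"
  reasoning_enabled: true, // no effect for OpenAI models
};

const withBudget = {
  model: "claude-3-7-sonnet-latest",
  messages: [{ role: "user", content: "How many primes are there below 100?" }],
  reasoning_budget: 2048, // max tokens to spend on reasoning
};
```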
To facilitate working with reasoning models in your codebase, Braintrust offers type augmentation packages for the OpenAI SDK:
- `@braintrust/proxy/types` (TypeScript/JavaScript): Extends OpenAI's TypeScript definitions to include Braintrust-specific reasoning parameters and response fields. (npm, GitHub)
- `braintrust-proxy` (Python): Provides casting utilities and type-safe helpers for using reasoning parameters and accessing reasoning responses. (PyPI, GitHub)
Because reasoning features extend the standard OpenAI API interface, these packages are needed to get type safety and autocomplete support in your IDE for reasoning-specific parameters (like `reasoning_effort`) and response fields (like `reasoning`), while maintaining compatibility with your existing OpenAI SDK integration.
You can add reasoning parameters to any chat completion request when using Braintrust's proxy with your OpenAI SDK:
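The snippet below is a minimal sketch of such a request. It assumes the OpenAI Node SDK pointed at Braintrust's proxy endpoint with a Braintrust API key in `BRAINTRUST_API_KEY`; the model name is an example, and the `as any` cast stands in for the typings that `@braintrust/proxy/types` can provide.

```typescript
import OpenAI from "openai";

// Point the OpenAI SDK at Braintrust's proxy, which translates the
// reasoning parameters for whichever provider serves the model.
const client = new OpenAI({
  baseURL: "https://api.braintrust.dev/v1/proxy",
  apiKey: process.env.BRAINTRUST_API_KEY,
});

const completion = await client.chat.completions.create({
  model: "claude-3-7-sonnet-latest", // example: any reasoning-capable model
  messages: [{ role: "user", content: "How many primes are there below 100?" }],
  reasoning_effort: "medium", // or reasoning_budget: 2048, but not both
} as any);

// The final answer and the reasoning steps come back on the message.
const message = completion.choices[0].message as any;
console.log(message.content);
console.log(message.reasoning); // reasoning steps, each with an id
```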
The `id` field contains a unique identifier for each reasoning step. For providers like Anthropic, these IDs are signatures that must be preserved when including reasoning in multi-turn conversations. Always use the exact ID returned by the provider. Learn more in the multi-turn conversations section.
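Continuing the example above, the reasoning entries might look roughly like this; the exact shape is provider-dependent, so treat the field layout as an assumption and check your actual responses:

```typescript
// Illustrative shape only.
for (const step of message.reasoning ?? []) {
  // step.id: unique identifier (for Anthropic, a signature to preserve verbatim)
  // step.content: the reasoning text for this step
  console.log(step.id, step.content);
}
```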
Streaming
For streaming responses, reasoning is delivered through `delta` objects as a new `reasoning` property:
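A minimal streaming sketch, reusing the proxy client from the earlier example; exactly where the `reasoning` property appears on each delta is an assumption based on the description above, so verify it against the chunks you receive:

```typescript
const stream = await client.chat.completions.create({
  model: "claude-3-7-sonnet-latest", // example model
  messages: [{ role: "user", content: "How many primes are there below 100?" }],
  reasoning_effort: "medium",
  stream: true,
} as any);

for await (const chunk of stream as any) {
  const delta = chunk.choices?.[0]?.delta ?? {};
  if (delta.reasoning) {
    // Reasoning arrives incrementally, separate from the answer content.
    console.log("[reasoning]", delta.reasoning);
  }
  if (delta.content) {
    process.stdout.write(delta.content);
  }
}
```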
Multi-turn conversations
You can include reasoning from previous turns in multi-turn conversations, allowing the model to build upon its previous thinking:
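As a sketch of what that can look like, again reusing the proxy client from above: the key point is to echo the assistant's reasoning back with the exact `id` values the provider returned. The placement of the `reasoning` field on the assistant message is an assumption; check your actual responses.

```typescript
// First turn: capture the assistant's reasoning alongside its answer.
const first = await client.chat.completions.create({
  model: "claude-3-7-sonnet-latest", // example model
  messages: [{ role: "user", content: "How many primes are there below 100?" }],
  reasoning_effort: "medium",
} as any);

const assistant = first.choices[0].message as any;

// Second turn: include the previous assistant message, reasoning and all,
// preserving the provider-returned ids verbatim.
const second = await client.chat.completions.create({
  model: "claude-3-7-sonnet-latest",
  messages: [
    { role: "user", content: "How many primes are there below 100?" },
    {
      role: "assistant",
      content: assistant.content,
      reasoning: assistant.reasoning,
    },
    { role: "user", content: "Now list the ten largest of them." },
  ],
  reasoning_effort: "medium",
} as any);

console.log(second.choices[0].message.content);
```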