Anthropic model provider configuration and integration guide
Anthropic provides access to Claude models, including Claude Sonnet 4 and Claude Opus 4.1. Braintrust integrates with Anthropic through direct API access, the wrapAnthropic wrapper for automatic tracing, and proxy support.
Add the Anthropic API key to your organization’s AI providers
Set the Anthropic API key and your Braintrust API key as environment variables
.env
ANTHROPIC_API_KEY=<your-anthropic-api-key>
BRAINTRUST_API_KEY=<your-braintrust-api-key>
# If you are self-hosting Braintrust, set the URL of your hosted dataplane
# BRAINTRUST_API_URL=<your-braintrust-api-url>
API keys are encrypted using 256-bit AES-GCM encryption and are not stored or logged by Braintrust.
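A missing or empty key usually surfaces later as an authentication error, so it can help to check the environment once at startup. The following is a minimal sketch, not part of Braintrust or the Anthropic SDK: `missingEnvVars` and `assertEnv` are hypothetical helper names, and the environment is passed in as a plain object so the snippet has no dependencies.

```typescript
// Hypothetical helpers for illustration; neither Braintrust nor the
// Anthropic SDK provides them.
type Env = Record<string, string | undefined>;

// Return the names of required variables that are unset or blank.
function missingEnvVars(env: Env, required: string[]): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Throw a single error listing every missing variable.
function assertEnv(env: Env, required: string[]): void {
  const missing = missingEnvVars(env, required);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

In a Node.js entry point this would typically be called as `assertEnv(process.env, ["ANTHROPIC_API_KEY", "BRAINTRUST_API_KEY"])` before any client is created.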
Install the braintrust and @anthropic-ai/sdk packages.
Braintrust provides automatic tracing for Anthropic API calls. Braintrust handles streaming, metric collection (including cached tokens), and other details.
TypeScript & Python: Use wrapAnthropic / wrap_anthropic wrapper functions
Go: Use the tracing middleware with the Anthropic client
Ruby: Use Braintrust::Trace::Anthropic.wrap to wrap the Anthropic client
Java: Use the tracing interceptor with the Anthropic client
import Anthropic from "@anthropic-ai/sdk";
import { wrapAnthropic, initLogger } from "braintrust";

// Initialize the Braintrust logger
const logger = initLogger({
  projectName: "My Project", // Your project name
  apiKey: process.env.BRAINTRUST_API_KEY,
});

// Wrap the Anthropic client with the Braintrust logger
const client = wrapAnthropic(
  new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
);

// All API calls are automatically logged
const result = await client.messages.create({
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 1024,
  messages: [{ role: "user", content: "What is machine learning?" }],
});
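The Messages API returns the reply as an array of content blocks, not a single string, so reading the answer from `result` takes one extra step. The sketch below assumes only that text blocks have `type: "text"` and a `text` field. `ContentBlockLike` and `extractText` are hypothetical local names, and the SDK's own types are richer than this.

```typescript
// Hypothetical minimal shape; the SDK's real content block types carry
// more fields and more block kinds (for example tool use).
interface ContentBlockLike {
  type: string;
  text?: string;
}

// Concatenate the text of every text block, skipping other block kinds.
function extractText(content: ContentBlockLike[]): string {
  return content
    .filter((block) => block.type === "text" && typeof block.text === "string")
    .map((block) => block.text as string)
    .join("");
}
```

With the traced client above, this would be used as `extractText(result.content)`.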
Evaluations distill the non-deterministic outputs of Anthropic models into an effective feedback loop that enables you to ship more reliable, higher quality products. The Braintrust Eval function is composed of a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the Experiments guide.
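The three parts named above (dataset, task, scorers) can be sketched offline. This is an illustration under assumptions, not the Braintrust implementation: the task is a stub standing in for a call to the wrapped Anthropic client, `containsExpected` is a hypothetical scorer, and `runLocally` is a stand-in loop. In a real project the same `data`, `task`, and scorer would be passed to Braintrust's `Eval` function.

```typescript
interface EvalCase {
  input: string;
  expected: string;
}

interface Score {
  name: string;
  score: number; // 0 (fail) to 1 (pass)
}

// Dataset: user inputs paired with expected answers.
const data = (): EvalCase[] => [
  { input: "What is the capital of France?", expected: "Paris" },
  { input: "What is 2 + 2?", expected: "4" },
];

// Task: a stub so the sketch runs offline. A real task would call the model.
const task = async (input: string): Promise<string> =>
  input.includes("France") ? "The capital of France is Paris." : "5";

// Scorer (hypothetical): 1 if the output mentions the expected answer.
const containsExpected = (args: { output: string; expected: string }): Score => ({
  name: "ContainsExpected",
  score: args.output.toLowerCase().includes(args.expected.toLowerCase()) ? 1 : 0,
});

// Stand-in for the evaluation loop: run the task, then score each case.
async function runLocally(): Promise<Score[]> {
  const results: Score[] = [];
  for (const { input, expected } of data()) {
    const output = await task(input);
    results.push(containsExpected({ output, expected }));
  }
  return results;
}
```

Keeping the scorer a pure function of `output` and `expected` makes it easy to unit test before any model calls are involved.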
You can use Anthropic models to score the outputs of other AI systems. This example uses the LLMClassifierFromSpec scorer to score the relevance of the outputs of an AI system.
Install the autoevals package to use the LLMClassifierFromSpec scorer.
Create a relevance scorer with LLMClassifierFromSpec. You can then include relevanceScorer as a scorer in your Eval function.
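As a rough sketch of what such a scorer's specification might look like: the field names below (`prompt`, `choice_scores`, `model`, `use_cot`) and the `{{input}}` / `{{output}}` placeholders reflect my understanding of the autoevals classifier spec and should be checked against the installed version. `ClassifierSpec` is a hypothetical local type, and the prompt wording is illustrative.

```typescript
// Local type for illustration only; it mirrors the spec fields assumed above.
interface ClassifierSpec {
  prompt: string;
  choice_scores: Record<string, number>;
  model?: string;
  use_cot?: boolean;
}

const relevanceSpec: ClassifierSpec = {
  // The classifier is asked to answer with exactly one of the choice labels.
  prompt: [
    "Decide whether the response is relevant to the question.",
    "Question: {{input}}",
    "Response: {{output}}",
    "Answer with Relevant or Irrelevant.",
  ].join("\n"),
  // Each label maps to a score between 0 and 1.
  choice_scores: { Relevant: 1, Irrelevant: 0 },
  model: "claude-sonnet-4-5-20250929",
  use_cot: true, // assumed flag: let the model reason before choosing
};

// With autoevals installed, the spec would be passed along these lines:
//   import { LLMClassifierFromSpec } from "autoevals";
//   const relevanceScorer = LLMClassifierFromSpec("Relevance", relevanceSpec);
```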
Anthropic models support system prompts for better instruction following.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

const response = await client.messages.create({
  model: "claude-sonnet-4-5-20250929",
  max_tokens: 1024,
  system: "You are a helpful assistant that responds in JSON format.",
  messages: [{ role: "user", content: "What is the capital of France?" }],
});