AI tracing

Trace every step your AI takes

Capture every prompt, completion, tool call, and span. Then search, score, and turn failures into eval test cases.

Free to start · No credit card · 5-min setup

Braintrust logs view showing traces with scores


From zero to full tracing in minutes

Two lines to start. Every prompt, span, and score is captured automatically.

1
import { wrapOpenAI, initLogger } from "braintrust";
import OpenAI from "openai";

initLogger({ projectName: "My Project" });

const client = wrapOpenAI(new OpenAI());
Wrap your AI provider
Two lines log every OpenAI, Anthropic, or Gemini call automatically with inputs, outputs, tokens, cost, and latency. No manual instrumentation required.
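Under the hood, a provider wrapper is an interception layer: it forwards the call unchanged and records inputs, outputs, and latency as a side effect. A simplified stdlib-only sketch of the idea (`FakeModel` and `LoggedClient` are illustrative stand-ins, not the braintrust implementation):

```python
import time

class LoggedClient:
    """Illustrative sketch of what a wrapper like wrapOpenAI does:
    forward each call and record input, output, and latency."""
    def __init__(self, client, log):
        self._client = client
        self._log = log

    def complete(self, prompt):
        start = time.perf_counter()
        output = self._client.complete(prompt)  # unchanged provider call
        self._log.append({
            "input": prompt,
            "output": output,
            "latency_ms": (time.perf_counter() - start) * 1000,
        })
        return output

class FakeModel:
    """Stand-in for a real provider client."""
    def complete(self, prompt):
        return prompt.upper()

log = []
client = LoggedClient(FakeModel(), log)
result = client.complete("hello")  # logged automatically
```

The caller's code is untouched; the wrapper only observes, which is why no manual instrumentation is needed.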
2
from braintrust import traced

@traced
async def retrieve_context(query):
    return vector_db.search(query)

@traced
async def my_agent(query):
    docs = await retrieve_context(query)
    return await llm(query, docs)
Add spans for your own code
One decorator per function. Retrieval steps, tool calls, and agent reasoning chains all nest automatically as child spans. No manual parent-child wiring.
3
from braintrust import current_span
from autoevals import Factuality

score = Factuality()(output=result, input=query)

current_span().log(
    scores={"factuality": score.score}
)

# Tag a failing trace →
# it joins your eval dataset
Search, score, and act
Filter traces by score, model, latency, or any metadata. Run LLM-as-judge scorers on live traffic. Tag a failure and make it an eval test case instantly.
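Conceptually, filtering treats each trace as a record of metadata plus scores. A minimal sketch with hypothetical trace records (field names here are illustrative, not the braintrust schema):

```python
# Hypothetical trace records; field names are illustrative only.
traces = [
    {"id": "t1", "model": "gpt-4o", "latency_ms": 420, "scores": {"factuality": 0.95}},
    {"id": "t2", "model": "gpt-4o", "latency_ms": 1800, "scores": {"factuality": 0.40}},
    {"id": "t3", "model": "claude-sonnet", "latency_ms": 650, "scores": {"factuality": 0.88}},
]

def filter_traces(traces, max_score=None, model=None, max_latency_ms=None):
    """Return traces matching every filter that is set."""
    out = []
    for t in traces:
        if max_score is not None and t["scores"]["factuality"] > max_score:
            continue
        if model is not None and t["model"] != model:
            continue
        if max_latency_ms is not None and t["latency_ms"] > max_latency_ms:
            continue
        out.append(t)
    return out

# Surface low-scoring traffic for review.
failing = filter_traces(traces, max_score=0.5)
```

The UI exposes exactly this kind of query without code; the sketch just shows why any scored field is filterable.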

What changes when Braintrust is part of your workflow

10x faster issue resolution

<10 min eval turnaround

25% accuracy improvement

45x more feedback

Notion, Dropbox, Zapier, and Coursera use Braintrust to ship better AI. Get started free →

Works how your team works

Engineers instrument in code. Everyone else searches, reviews, and scores traces in the UI.

For engineers

import { wrapOpenAI, initLogger, wrapTraced } from "braintrust";
import OpenAI from "openai";

initLogger({ projectName: "My LLM" });

// Every call now logged automatically
const client = wrapOpenAI(new OpenAI());

// Nest your own code as spans
const result = await wrapTraced(myAgentFn, { name: "my_agent" })(input);

Works with OpenAI, Anthropic, Gemini, AWS Bedrock, LangChain, LangGraph, and more. No framework lock-in and no rewrites.

For leads & PMs

Braintrust logs view with QA scores and trace detail panel

Click any trace to see the full span tree with inputs, outputs, latency per step, token usage, and scores. Filter by any field. No SQL required.

Built for AI tracing from the start

Every provider in one SDK

Works with OpenAI, Anthropic, Gemini, AWS Bedrock, Azure, and all major frameworks including LangChain, LangGraph, LlamaIndex, CrewAI, and more. No rewrites and no lock-in.

See full agent traces

Tool calls, retrieval steps, and reasoning chains all nest as child spans automatically. See the full decision path for any agent request across services and steps.

Traces become eval datasets

Tag any failing trace and it lands in a dataset. Use the same data structure in production and in evals, so the logs you debug with become the tests you ship with.
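Because logs and evals share one shape, promoting a tagged failure to a test case is just a projection of the trace record. A sketch with an illustrative record layout (field names are assumptions, not the exact braintrust schema):

```python
def trace_to_eval_case(trace):
    """Project a logged trace onto the input/expected shape an eval consumes.
    Field names here are illustrative, not the exact braintrust schema."""
    return {
        "input": trace["input"],
        "expected": trace.get("expected") or trace["output"],
        "metadata": {"source_trace": trace["id"], "tags": trace.get("tags", [])},
    }

# A production failure, tagged and corrected during review.
failing_trace = {
    "id": "t2",
    "input": "What is the refund window?",
    "output": "90 days",      # wrong answer flagged by a reviewer
    "expected": "30 days",    # correction added during review
    "tags": ["hallucination"],
}

case = trace_to_eval_case(failing_trace)
```

Since no reshaping is needed, the same record that you debugged yesterday can run as a regression test today.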

What our customers say

“Loop helps us understand trace details that would be impossible to scan manually.”

Allen Kleiner, AI Engineering Lead at Retool

Stop shipping on vibes

Set up your first trace in minutes

Free to start · No credit card required

Get started free