Trace every step your AI takes
Capture every prompt, completion, tool call, and span. Then search, score, and turn failures into eval test cases.
Free to start · No credit card · 5-min setup

From zero to full tracing in minutes
Two lines to start. Every prompt, span, and score is captured automatically.
import { wrapOpenAI, initLogger } from "braintrust";
import OpenAI from "openai";

initLogger({
  projectName: "My Project",
});

const client = wrapOpenAI(new OpenAI());

from braintrust import traced
@traced
async def retrieve_context(query):
    return vector_db.search(query)

@traced
async def my_agent(query):
    docs = await retrieve_context(query)
    return await llm(query, docs)

from braintrust import current_span
from autoevals import Factuality

score = Factuality()(output=result, input=query)
current_span().log(scores={"factuality": score.score})

# Tag a failing trace →
# it joins your eval dataset

What changes when Braintrust is part of your workflow
10x faster issue resolution
<10 min eval turnaround
25% accuracy improvement
45x more feedback
Notion, Dropbox, Zapier, and Coursera use Braintrust to ship better AI. Get started free →
Works how your team works
Engineers instrument in code. Everyone else searches, reviews, and scores traces in the UI.
For engineers
import { wrapOpenAI, wrapTraced, initLogger } from "braintrust";
import OpenAI from "openai";

initLogger({ projectName: "My LLM" });

// Every call is now logged automatically
const client = wrapOpenAI(new OpenAI());

// Nest your own code as spans
const result = await wrapTraced(myAgentFn, { name: "my_agent" })(input);

Works with OpenAI, Anthropic, Gemini, AWS Bedrock, LangChain, LangGraph, and more. No framework lock-in and no rewrites.
For leads & PMs

Click any trace to see the full span tree with inputs, outputs, latency per step, token usage, and scores. Filter by any field. No SQL required.
Built for AI tracing from the start
Every provider in one SDK
Works with OpenAI, Anthropic, Gemini, AWS Bedrock, Azure, and all major frameworks including LangChain, LangGraph, LlamaIndex, CrewAI, and more. No rewrites and no lock-in.
See full agent traces
Tool calls, retrieval steps, and reasoning chains all nest as child spans automatically. See the full decision path for any agent request across services and steps.
Traces become eval datasets
Tag any failing trace and it lands in a dataset. Use the same data structure in production and in evals, so the logs you debug with become the tests you ship with.
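As a rough sketch of what "the same data structure in production and in evals" can look like, the example below filters tagged trace records into eval cases. The TraceRecord shape, the "failing" tag, and to_eval_cases are assumptions made for illustration; they are not the Braintrust dataset API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TraceRecord:
    """Hypothetical shape of one logged trace."""
    input: str
    output: str
    tags: List[str] = field(default_factory=list)
    expected: Optional[str] = None   # filled in by a reviewer, if known


def to_eval_cases(traces, tag="failing"):
    """Keep only traces carrying `tag` and reshape them into eval cases."""
    return [
        {
            "input": t.input,
            "expected": t.expected,
            "metadata": {"observed_output": t.output},
        }
        for t in traces
        if tag in t.tags
    ]


production_logs = [
    TraceRecord(input="What is 2 + 2?", output="4"),
    TraceRecord(
        input="Capital of Australia?",
        output="Sydney",
        tags=["failing"],
        expected="Canberra",
    ),
]

eval_cases = to_eval_cases(production_logs)
print(eval_cases)
```

Because the eval case keeps the original input and records the observed output as metadata, the failure that was debugged in production can be replayed as a regression test.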
What our customers say
“Loop helps us understand trace details that would be impossible to scan manually.”
Allen Kleiner, AI Engineering Lead at Retool

Stop shipping on vibes
Set up your first trace in minutes
Free to start · No credit card required
Get started free