Evals are a team sport: How we built Loop
How we debugged Loop's prompt optimization workflow by combining manual review, Loop analysis, and cross-functional collaboration.
25 November 2025 · 8 min
Turn production data into better AI with Loop
Loop is the AI assistant that helps teams query, analyze, and improve AI applications faster.
24 November 2025 · 5 min
Braintrust Java SDK: AI observability and evals for the JVM
AI observability and evaluation tools for Java applications, built on OpenTelemetry.
23 October 2025 · 4 min
Braintrust on the Vercel Marketplace
Run evals, monitor model quality, and more from your Vercel dashboard.
16 October 2025 · 4 min
AI that knows your data
Introducing Braintrust's MCP server.
9 September 2025 · 3 min
GPT-5 vs. Claude Opus 4.1
Which one you should ship with, and how to know for sure.
8 August 2025 · 5 min
Braintrust is not an eval framework
Why we built infrastructure for AI products, not just another evaluation tool.
14 July 2025 · 7 min
Building with Grok 4
xAI recently announced Grok 4. We put it to the ultimate test.
11 July 2025 · 9 min
Eval playgrounds for faster, focused iteration
Run full evals directly in a powerful editor UI.
27 May 2025 · 3 min
Brainstore is now on by default
Brainstore is now the default in both our UI and API. Learn what's changing and coming next.
31 March 2025 · 4 min
Brainstore: the database designed for the AI engineering era
LLM observability, now 80x faster.
3 March 2025 · 9 min
Bedrock, Vertex AI, and universal structured outputs
Full support for Bedrock, Vertex AI, and structured outputs in the AI proxy and playground.
14 February 2025 · 2 min
Our approach to hybrid deployment
The easiest way to self-host Braintrust.
8 January 2025 · 4 min
The top 10 most loved features of 2024
Our year in review.
31 December 2024 · 3 min
New monitor page for easy analytics
More visibility into performance across logs and experiments.
18 December 2024 · 2 min
Evaluating Gemini models for vision
Faster, more efficient, and highly accurate for real-world applications.
14 November 2024 · 5 min
Python tool functions: powered by uv
How we used the uv library to build Python tools.
13 November 2024 · 2 min
Building serverless apps with the OpenAI Realtime API
No server setup or configuration necessary.
4 November 2024 · 6 min
Logging with attachments
Observability for advanced AI applications.
24 October 2024 · 3 min
Functions: flexible AI engineering primitives
Introducing functions, a general-purpose primitive for building, evaluating, and observing AI products.
8 October 2024 · 6 min
Custom scoring functions in the Braintrust Playground
Create custom scorers and access them via the Braintrust UI and API.
16 September 2024 · 3 min
2023, a year in review
Check out your Braintrust 2023 year in review to see how you did this year!
21 December 2023 · 1 min
AI proxy: fostering a more open ecosystem
Introducing Braintrust's latest feature: an AI proxy that lets you use open source models like LLaMa 2 and Mistral, as well as all of OpenAI's and Anthropic's models, behind a single interface with caching, security, and API key management built in.
20 November 2023 · 7 min
