Latest Braintrust news
GPT-5 vs. Claude Opus 4.1
Which one you should ship with, and how to know for sure.
8 August 2025
Read
The canonical agent architecture: A while loop with tools
Why the best AI agents are just loops that call functions.
7 August 2025
Read
Five hard-learned lessons about AI evals
What our customers have taught us about running evals at scale.
17 July 2025
Read
Braintrust is not an eval framework
Why we built infrastructure for AI products, not just another evaluation tool.
14 July 2025
Read
Building with Grok 4
xAI recently announced Grok 4. We put it to the ultimate test.
11 July 2025
Read
Experiments UI: Now 10x faster
Brainstore speeds up experiments, datasets, and logs.
3 June 2025
Read
Eval playgrounds for faster, focused iteration
Run full evals directly in a powerful editor UI.
27 May 2025
Read
How Coursera builds next-generation learning tools
Key learnings from the Coursera AI engineering team.
12 May 2025
Read
Webinar recap: Eval best practices
A recap of our technical Q&A hosted by CEO Ankur Goyal.
22 April 2025
Read
Resilient observability by design
How we built Braintrust to ensure no impact on downtime.
3 April 2025
Read
Brainstore is now on by default
Brainstore is now the default in both our UI and API. Learn what's changing and coming next.
31 March 2025
Read
Brainstore: the database designed for the AI engineering era
LLM observability, now 80x faster.
3 March 2025
Read
Bedrock, Vertex AI, and universal structured outputs
Full support for Bedrock, Vertex AI, and structured outputs in the AI proxy and playground.
14 February 2025
Read
How Fintool generates millions of financial insights
Learn to build trusted and scalable LLM apps from the team at Fintool.
31 January 2025
Read
How Loom auto-generates video titles
Learn scoring best practices from the software engineering team at Loom.
27 January 2025
Read
Evaluating agents
Learn best practices for scoring agentic systems.
22 January 2025
Read
Our approach to hybrid deployment
The easiest way to self-host Braintrust.
8 January 2025
Read
The top 10 most loved features of 2024
Our year in review.
31 December 2024
Read
New monitor page for easy analytics
More visibility into performance across logs and experiments.
18 December 2024
Read
What to do when a new AI model comes out
How to decide if you should use a new model in production.
4 December 2024
Read
Building a RAG app with MongoDB Atlas
How to iterate on AI applications without redeploying code.
18 November 2024
Read
Evaluating Gemini models for vision
Faster, more efficient, and highly accurate for real-world applications.
14 November 2024
Read
Python tool functions: powered by uv
How we used the uv library to build Python tools.
13 November 2024
Read
Building serverless apps with the OpenAI Realtime API
No server setup or configuration necessary.
4 November 2024
Read
Logging with attachments
Observability for advanced AI applications.
24 October 2024
Read
I ran an eval. Now what?
A guide to next steps after your first eval and best practices for your workflows.
17 October 2024
Read
How Notion develops world-class AI features
Learn how Notion refined their development workflow with Braintrust.
9 October 2024
Read
Announcing our $36M Series A
We’re thrilled to announce that we've raised $36 million to advance the future of AI software engineering, bringing our total funding to $45 million.
8 October 2024
Read
Functions: flexible AI engineering primitives
Introducing functions, a general-purpose primitive for building, evaluating, and observing AI products.
8 October 2024
Read
Custom scoring functions in the Braintrust Playground
Create custom scorers and access them via the Braintrust UI and API.
16 September 2024
Read
Braintrust achieves SOC 2 Type II compliance
We are excited to announce that Braintrust has achieved SOC 2 Type II compliance.
15 July 2024
Read
How to improve your evaluations
Learn how to improve your evals by identifying new evaluators, iterating on existing scorers, and adding new test cases.
20 June 2024
Read
How Zapier builds production-ready AI products
Zapier was one of the earliest adopters of GenAI. In this post, we share insights from Mike Knoop, Co-founder & Head of AI at Zapier.
30 May 2024
Read
AI development loops
Key activities that enable fast feedback and clear signal when developing AI features.
6 May 2024
Read
Getting started with automated evaluations
Three actionable approaches for engineering teams to get started with automated evaluations.
24 April 2024
Read
Eval feedback loops
Learn how to build robust eval feedback loops for AI products by connecting real-world log data to your evals. Discover best practices for structuring evals, flowing production logs into eval datasets, and using Braintrust to streamline the process.
17 April 2024
Read
Braintrust selected to be in the Enterprise Tech 30
The Enterprise Tech 30 by Wing Venture Capital names the highest potential private companies in enterprise technology.
9 April 2024
Read
How Hostinger evaluates AI applications with Braintrust
Liucija, Senior Data Scientist on the AI team at Hostinger, provides an overview of how she uses Braintrust to accelerate Hostinger's AI development process and automate over 40% of customer support chat conversations.
27 February 2024
Read
2023, a year in review
Check out your Braintrust 2023 year in review to see how you did this year!
21 December 2023
Read
Braintrust's seed round: $5m to build infrastructure for AI products
Announcing Braintrust's seed round led by Greylock. The round builds on our early traction with customers like Zapier, Coda, Airtable, and Instacart and allows us to accelerate our vision of building world-class infrastructure for AI products. We are hiring for a number of roles, so please check out our careers page if you are interested in joining us.
13 December 2023
Read
Open sourcing the AI proxy
The Braintrust AI Proxy is now open source! We also added support for Azure OpenAI and provider load balancing.
27 November 2023
Read
AI proxy: fostering a more open ecosystem
Introducing Braintrust's latest feature: an AI proxy that lets you use open source models like Llama 2 and Mistral, as well as all of OpenAI's and Anthropic's models, behind a single interface with caching, security, and API key management built in.
20 November 2023
Read
State of AI development 2023
Retool recently surveyed over 1,500 workers about how their companies are adopting AI for its State of AI 2023 report. Here's what those companies are struggling with and how Braintrust can help.
15 November 2023
Read
The AI product development journey
Building reliable AI apps is hard. It’s easy to build a cool demo but hard to build an AI app that works in production for real users. In traditional software development, there’s a set of best practices, like setting up CI/CD and writing tests, that make your software robust and easy to build on. But with LLM apps, it’s not obvious how to create these tests or processes.
13 November 2023
Read
Weekly update 11/13/23
Function calling and tool support, new blog posts, and project UI improvements.
13 November 2023
Read
Weekly update 11/06/23
Perplexity models support, new OpenAI models, reworked diff selector in experiment view.
6 November 2023
Read
Weekly update 10/30/23
Resizable sidebar, new help tooltips, performance optimizations, Replit.
30 October 2023
Read
Weekly update 10/23/23
Auto input variables in the playground, duration metrics, performance optimizations, partner releases.
23 October 2023
Read
Weekly update 10/16/23
Tracing, experiment dashboard customization, text-block prompts, bigger tables, new eval docs.
16 October 2023
Read
Weekly update 10/09/23
Performance improvements, fine tuning tutorial, Alpaca Evals, autocomplete in the playground.
9 October 2023
Read
It's time to build reliable AI
Introducing Braintrust: the enterprise-grade stack for building AI products. From evaluations, to prompt playground, to data management, we take uncertainty and tedium out of incorporating AI into your business.
12 September 2023