Set up your environment and create an eval with the Braintrust SDK. Wrappers are available for TypeScript, Python, and other languages. You'll install the SDK, write an evaluation script, and run it to create an experiment in Braintrust. Once the command runs, you'll see a link to your experiment.

1. Install Braintrust libraries
Install the Braintrust SDK and autoevals library for your language:
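For example, installation typically looks like the following for TypeScript (npm) or Python (pip):

```bash
# TypeScript / Node
npm install braintrust autoevals

# Python
pip install braintrust autoevals
```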
2. Configure an API key
You need a Braintrust API key to authenticate your evaluation. Create an API key in the Braintrust UI and then add the key to your environment:
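In a shell, that usually means exporting the key as an environment variable (replace the placeholder with your own key):

```bash
export BRAINTRUST_API_KEY="<your api key>"
```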
3. Run an evaluation
A Braintrust evaluation is a simple function composed of a dataset of user inputs, a task, and a set of scorers. In addition to adding each data point inline when you call the Eval() function, you can also pass an existing or new dataset directly (sketched after the example below).
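A minimal tutorial.eval.ts might look like the following sketch. The project name ("Say Hi Bot"), data points, and hard-coded task are illustrative; Eval() comes from the braintrust package and the Levenshtein scorer from autoevals:

```typescript
import { Eval } from "braintrust";
import { Levenshtein } from "autoevals";

Eval("Say Hi Bot", {
  // Dataset: user inputs paired with expected outputs
  data: () => [
    { input: "Foo", expected: "Hi Foo" },
    { input: "Bar", expected: "Hello Bar" },
  ],
  // Task: the function (often an LLM call) you want to evaluate
  task: async (input) => "Hi " + input,
  // Scorers: grade each output against the expected value
  scores: [Levenshtein],
});
```

Run your evaluation with the CLI that ships with the SDK, for example:

```bash
npx braintrust eval tutorial.eval.ts
```

This creates an experiment in Braintrust and prints a link to it. To pass a stored dataset instead of inline rows, a sketch (assuming a dataset named "greetings" already exists in the project) could look like:

```typescript
import { Eval, initDataset } from "braintrust";
import { Levenshtein } from "autoevals";

Eval("Say Hi Bot", {
  // Use an existing Braintrust dataset instead of inline data points
  data: initDataset("Say Hi Bot", { dataset: "greetings" }),
  task: async (input) => "Hi " + input,
  scores: [Levenshtein],
});
```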
4. View your results
Congrats, you just ran an eval! You should see a dashboard when you load your experiment. This view is called the experiment view, and as you use Braintrust, we hope it becomes your trusty companion each time you change your code and want to run an eval. The experiment view allows you to look at high-level metrics for performance, dig into individual examples, and compare your LLM app's performance over time.
5. Run another experiment
After running your first evaluation, you’ll see that we achieved a 77.8% score. Can you adjust the evaluation to improve this score? Make your changes and re-run the evaluation to track your progress.
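As one possible (admittedly contrived) adjustment, assuming the illustrative tutorial.eval.ts above, you could vary the greeting per input and re-run the same command to compare the new experiment against the first:

```typescript
import { Eval } from "braintrust";
import { Levenshtein } from "autoevals";

Eval("Say Hi Bot", {
  data: () => [
    { input: "Foo", expected: "Hi Foo" },
    { input: "Bar", expected: "Hello Bar" },
  ],
  // Hypothetical tweak: choose the greeting per input instead of always "Hi"
  task: async (input) => (input === "Bar" ? `Hello ${input}` : `Hi ${input}`),
  scores: [Levenshtein],
});
```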
Next steps
- Dig into our experiments guide to learn more about how to run evals.
- Look at our cookbook to learn how to evaluate RAG, summarization, text-to-sql, and other popular use cases.
- Learn how to log traces to Braintrust.
- Read about Braintrust’s platform and architecture.




