Run evaluations with the Eval() function, or use the braintrust eval CLI command to run multiple evaluations from files. Integrate with CI/CD to catch regressions automatically.
Run with Eval()
The Eval() function runs an evaluation and creates an experiment:
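A minimal TypeScript sketch of the data/task/scores pattern (the project name, dataset, and toy task are placeholders; Levenshtein comes from the autoevals package):

```typescript
import { Eval } from "braintrust";
import { Levenshtein } from "autoevals";

Eval("Say Hi Bot", {
  // Each test case pairs an input with the expected output
  data: () => [
    { input: "Foo", expected: "Hi Foo" },
    { input: "Bar", expected: "Hi Bar" },
  ],
  // The task under test: a trivial stand-in for an LLM call
  task: async (input) => `Hi ${input}`,
  // Scorers that grade each output against the expected value
  scores: [Levenshtein],
});
```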
Eval() automatically:
- Creates an experiment in Braintrust
- Displays a summary in your terminal
- Populates the UI with results
- Returns summary metrics
Run with CLI
Use the braintrust eval command to run evaluations from TypeScript or Python files:
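For example (file names are illustrative; by convention the CLI picks up TypeScript files ending in .eval.ts and Python files named eval_*.py):

```bash
# TypeScript
npx braintrust eval my-eval.eval.ts

# Python
braintrust eval eval_my_project.py
```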
The CLI loads environment variables from the following files, in order of precedence:
- .env.development.local
- .env.local
- .env.development
- .env
Watch mode
Re-run evaluations automatically when files change:
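A sketch, assuming the CLI's --watch flag (check the CLI help for the exact flag name):

```bash
npx braintrust eval --watch my-eval.eval.ts
```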
Local testing mode
Run evaluations without sending logs to Braintrust for quick iteration:
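For example, using the --no-send-logs flag (file name illustrative):

```bash
npx braintrust eval --no-send-logs my-eval.eval.ts
```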
Run in CI/CD
Integrate evaluations into your CI/CD pipeline to catch regressions automatically.
GitHub Actions
Use the braintrustdata/eval-action to run evaluations on every pull request:
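A minimal workflow sketch; the api_key and runtime inputs shown here are assumptions, so check the action's README for its exact interface:

```yaml
name: Run evals
on: pull_request

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Runs the repository's evaluation files and reports results on the PR
      - uses: braintrustdata/eval-action@v1
        with:
          api_key: ${{ secrets.BRAINTRUST_API_KEY }}
          runtime: node
```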

Other CI systems
For other CI systems, run evaluations as a standard command with the BRAINTRUST_API_KEY environment variable set:
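For example (the evals/ directory is illustrative):

```bash
export BRAINTRUST_API_KEY=<your-api-key>
npx braintrust eval evals/
```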
Run remotely
Expose evaluations running on remote servers or local machines using dev mode:
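A sketch, assuming a --dev flag on the eval command (consult the CLI help for the exact flags):

```bash
npx braintrust eval --dev my-eval.eval.ts
```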
Configure experiments
Customize experiment behavior with options:
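For example, in TypeScript the Eval() options can include fields along these lines (experimentName, metadata, and maxConcurrency are assumed option names; check the SDK reference):

```typescript
import { Eval } from "braintrust";
import { Levenshtein } from "autoevals";

Eval("Say Hi Bot", {
  data: () => [{ input: "Foo", expected: "Hi Foo" }],
  task: async (input) => `Hi ${input}`,
  scores: [Levenshtein],
  experimentName: "nightly-run",  // name the experiment explicitly
  metadata: { model: "gpt-4o" },  // attach metadata to the experiment
  maxConcurrency: 5,              // cap parallel task invocations
});
```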
Run trials
Run each input multiple times to measure variance and get more robust scores. Braintrust intelligently aggregates results by bucketing test cases with the same input value:
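A sketch using a trialCount option (assumed name in the TypeScript SDK):

```typescript
import { Eval } from "braintrust";
import { Levenshtein } from "autoevals";

Eval("Say Hi Bot", {
  data: () => [{ input: "Foo", expected: "Hi Foo" }],
  task: async (input) => `Hi ${input}`,
  scores: [Levenshtein],
  trialCount: 3, // run each input three times; results are bucketed per input
});
```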
Run local evals without sending logs
Run evaluations locally without creating experiments or sending data to Braintrust by passing the --no-send-logs flag to the CLI command, as shown under Local testing mode above.
Next steps
- Interpret results from your experiments
- Compare experiments to measure improvements
- Write scorers to measure quality
- Use playgrounds for no-code experimentation