Once you’ve configured review scores, reviewers record judgments row by row in review mode. This page covers starting a review, setting scores, and moving through rows efficiently.

Start reviewing

  1. Go to Review and select the type of data to review:
    • Log spans: production traces and debugging sessions.
    • Experiment spans: evaluation results and test runs.
    • Dataset rows: test cases and examples.
  2. Select a row and set scores. You can also add comments and tags while reviewing.
  3. Click Mark complete to record your review.
By default, marking a review complete keeps you on the current row. To move to the next row automatically, enable the Auto-advance toggle. The button then reads Mark complete and continue. You can also navigate with the Next row and Previous row buttons.
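The Auto-advance behavior can be pictured as a small rule for which row you land on after marking a review complete. This is only an illustrative sketch, not Braintrust code: the function name and parameters are hypothetical, and staying on the last row when there is no next row is an assumption.

```python
def row_after_mark_complete(current: int, total: int, auto_advance: bool) -> int:
    """Return the index of the row shown after marking the current row complete.

    Hypothetical model of the review UI, for illustration only.
    """
    if not auto_advance:
        # Default behavior: stay on the current row.
        return current
    # Auto-advance on: move to the next row.
    # Assumption: on the last row there is nowhere to go, so stay put.
    return min(current + 1, total - 1)


stay = row_after_mark_complete(current=2, total=5, auto_advance=False)
advance = row_after_mark_complete(current=2, total=5, auto_advance=True)
at_end = row_after_mark_complete(current=4, total=5, auto_advance=True)
```

With Auto-advance off, a reviewer who wants to move on still uses the Next row button; with it on, marking complete and moving on happen in one click.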
You can also enter review mode directly from the Logs table, or from an individual experiment's table in Experiments, by clicking the Review button or pressing the r shortcut. The review is scoped to the traces that match the current filter. See Review specific rows in Logs and Experiments.
Not all score types appear on dataset rows. Only categorical scores configured to “write to expected” and free-form scores are available for dataset reviews, since datasets store test data (input/expected pairs) rather than subjective quality assessments.
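The availability rule above can be expressed as a simple filter. The sketch below is illustrative and does not use the Braintrust SDK: the ScoreConfig class, its field names, the type labels, and the sample score names are all hypothetical, including the "continuous" type used here as an example of a score that would not appear.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreConfig:
    # Hypothetical stand-in for a configured review score.
    name: str
    score_type: str  # illustrative labels: "categorical", "free-form", "continuous"
    write_to_expected: bool = False


def available_for_dataset_review(score: ScoreConfig) -> bool:
    """Apply the rule from the text: free-form scores always qualify;
    categorical scores qualify only when set to write to expected."""
    if score.score_type == "free-form":
        return True
    return score.score_type == "categorical" and score.write_to_expected


# Hypothetical project configuration.
scores = [
    ScoreConfig("correct_label", "categorical", write_to_expected=True),
    ScoreConfig("tone", "categorical"),
    ScoreConfig("reviewer_notes", "free-form"),
    ScoreConfig("quality", "continuous"),
]

dataset_scores = [s.name for s in scores if available_for_dataset_review(s)]
print(dataset_scores)
```

In this example only correct_label and reviewer_notes would appear when reviewing dataset rows; tone and quality would still appear for log and experiment spans.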
If more than one person scores the same span, see Review with multiple reviewers for how Braintrust combines their scores.

Change the trace layout

While reviewing log and experiment traces, you see detailed information about the flagged span by default. To switch between hierarchy, timeline, thread, and other layouts, see Examine traces. When the raw trace is hard to read, build a custom view that renders each span as a purpose-built annotation interface. This is especially useful for large-scale review and for subject-matter experts who shouldn’t have to parse JSON to score accurately.

Next steps