Share evaluation results publicly using status pages. Status pages display aggregate scores and metrics in a branded dashboard without exposing individual traces or sensitive data. Configure which scores to show, group by metadata fields, and share a public URL. Status pages work well for:
  • Customer transparency: Show evaluation results to customers or partners
  • Team alignment: Give cross-functional teams visibility into quality metrics
  • Public benchmarks: Publish performance comparisons for open-source projects
  • CI/CD dashboards: Display test results in a clean, accessible format
Each project can have one active status page.

Create a status page

  1. Go to Experiments in your project.
  2. Click Publish eval status page in the experiments page header.
  3. In the publish dialog, configure how your evaluation results will be displayed:
    • Page title (required): The heading shown at the top of your status page.
    • Description: Optional markdown-formatted text to provide context.
    • Logo URL: Optional custom logo image (displays at the top of the page).
    • Grouping field (required): Metadata field to group experiments by (typically metadata.model).
    • Filter: Optional filtering to include only specific experiments.
    • Score columns: Select which scores to display from your evaluations.
    • Metric columns: Select which metrics to show (duration, tokens, error rate, etc.).
    • Sort by: Optionally sort by a specific score or metric.
    • Theme: Choose light or dark mode.
    The dialog shows a live preview of your status page as you configure it.
  4. Click Publish to make your status page available at a public URL. On first publish, Braintrust automatically creates a service account with read-only access to experiments in this project. This account is used to fetch data for the public page without exposing your API credentials.

Configure display options

Grouping field

The grouping field determines how experiments are organized on your status page. Experiments are grouped by the value of this metadata field, with each group displayed as a column. Common grouping fields:
  • metadata.model: Compare results across different models
  • metadata.version: Track performance across application versions
  • metadata.prompt: Compare different prompt variations
  • metadata.dataset: Show results for different test scenarios
Example: If you group by metadata.model and have experiments with gpt-5-mini, gpt-5-nano, and claude-sonnet-4, your status page will show three columns, one for each model.
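
Grouping values come from the metadata you attach to experiments when you run them. The sketch below uses the TypeScript SDK and assumes Eval accepts an experiment-level metadata option; the project name, dataset, and task are hypothetical.

```typescript
import { Eval } from "braintrust";

// Named scorer so the score column gets a readable name.
function exactMatch({ output, expected }: { output: string; expected?: string }) {
  return output === expected ? 1 : 0;
}

Eval("Chat Assistant", {
  // Hypothetical inline dataset and task, kept tiny for illustration.
  data: () => [{ input: "What is 2 + 2?", expected: "4" }],
  task: async (input: string) => {
    // Call your model here; hard-coded so the sketch runs as-is.
    return "4";
  },
  scores: [exactMatch],
  // Experiment-level metadata: the status page's grouping field
  // (metadata.model) reads this value.
  metadata: { model: "gpt-5-mini" },
});
```

Running the same eval with a different metadata.model value creates experiments that land in separate columns on the status page.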

Filters

Click + Filter and use the Basic tab for point-and-click filtering, or switch to SQL to write precise queries. For example:
  • Focusing on recent evaluations: created > '2026-01-01'
  • Including specific datasets: dataset_name = 'production-sample'
  • Filtering by metadata: metadata.environment = 'staging'

Score columns

Select which evaluation scores to display. Scores appear as rows with color-coded progress bars:
  • Green (0.7-1.0): High scores
  • Yellow (0.4-0.7): Medium scores
  • Red (0.0-0.4): Low scores
Each cell shows the average score across all experiments in that group. The row header displays the overall average across all groups.
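
Scores are values between 0 and 1 produced by your scorers, so the color bands above correspond directly to averaged scorer output. As a reference point, here is a minimal sketch of a custom scorer in TypeScript; the name and logic are hypothetical.

```typescript
// Hypothetical scorer: fraction of expected keywords that appear in the output.
// Returns a value between 0 and 1, so group averages fall into the bands above.
function keywordCoverage({ output, expected }: { output: string; expected?: string }) {
  const keywords = (expected ?? "").split(/\s+/).filter(Boolean);
  if (keywords.length === 0) return 1; // Nothing to check; count as a pass
  const hits = keywords.filter((k) => output.toLowerCase().includes(k.toLowerCase()));
  return hits.length / keywords.length;
}
```

Passed in an eval's scores list, its per-example values are averaged into the cell shown for each group.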

Metric columns

Display built-in or custom metrics from your experiments:
  • Built-in metrics:
    • Duration: Average end-to-end execution time (seconds)
    • LLM duration: Average LLM call duration (seconds)
    • Prompt tokens: Average prompt tokens per example
    • Completion tokens: Average completion tokens per example
    • Total tokens: Combined prompt and completion tokens
    • Examples: Number of examples evaluated
    • Error rate: Percentage of failed examples
  • Custom metrics defined in your experiments are also available for selection.
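
Custom metrics are numeric values logged alongside your experiment data. The sketch below assumes the TypeScript SDK's metrics field on experiment.log accepts custom numeric keys in addition to the built-ins; the project, experiment, and metric names are hypothetical.

```typescript
import { init } from "braintrust";

// Hypothetical project and experiment names.
const experiment = init("Chat Assistant", { experiment: "retrieval-v2" });

experiment.log({
  input: "What is 2 + 2?",
  output: "4",
  scores: { exact_match: 1 },
  // Assumed: custom numeric keys in metrics become selectable metric columns.
  metrics: { retrieved_documents: 5 },
});

await experiment.flush();
```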

Share a status page

Once published, a status page is publicly accessible. Share the URL with anyone:
https://www.braintrust.dev/status/{org}/{project}
Status pages are:
  • Public: Accessible to anyone with the URL (no authentication required)
  • Read-only: Viewers cannot modify experiments or access sensitive data
  • Aggregate-only: Shows averaged scores and metrics, not individual test cases
  • Secure: Uses a service account with minimal permissions (experiment read-only)
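
To link to the page from elsewhere (a README, a CI job summary), you can build the URL from the pattern above. A trivial sketch; the org and project values are hypothetical and assumed to match the segments in your published Braintrust URL.

```typescript
// Hypothetical org and project slugs; copy the exact values from your published URL.
const org = "acme";
const project = "chat-assistant";
const statusPageUrl = `https://www.braintrust.dev/status/${encodeURIComponent(org)}/${encodeURIComponent(project)}`;

console.log(`Eval status page: ${statusPageUrl}`);
```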

Update a status page

To modify a status page configuration:
  1. Go to Experiments in your project.
  2. Click Update eval status page in the experiments page header.
  3. Make changes to any configuration options.
  4. Preview your updates in real-time.
  5. Click Update to publish changes.
Updates are reflected immediately on the public URL.

Unpublish a status page

To remove your status page:
  1. Go to Experiments in your project.
  2. Click Update eval status page in the experiments page header.
  3. Click Unpublish.
  4. Confirm the removal.
The public URL will no longer be accessible. The service account and its permissions remain for future republishing.

Next steps