
Dashboard
/'da.shbawrd/ (noun)
A configurable view that tracks key metrics over time using charts and aggregations, covering latency, cost, error rates, and quality scores. Dashboards make it easy to spot trends and regressions without querying raw traces.
Why it matters
Infrastructure dashboards track request rates, error codes, and latency. AI dashboards need to track all of that plus quality metrics specific to AI systems, such as average scorer results, hallucination rates, token usage, cost per interaction, and topic distributions. These quality signals tell you whether your system is actually working well, not just whether it is running. A groundedness score trending down over the past week is a different kind of signal from a rising p95 latency. Both matter, but the quality signal is often more actionable because it tells you that something changed in how your system behaves, not just in how it performs.
Effective AI dashboards let you filter by model, prompt version, user segment, or time range so you can isolate regressions quickly. They also serve as the starting point for deeper investigation: when a metric drops, you need to be able to click through to the underlying traces and understand what happened.
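As a rough illustration of the aggregation behind one dashboard panel, the sketch below groups trace records by day, applies optional model and prompt-version filters, and computes an error rate, p95 latency, total cost, and average groundedness score. The `TraceRecord` shape, its field names, and the `daily_summary` helper are hypothetical and exist only for this example; they are not any particular platform's API, and the nearest-rank percentile is just one common definition.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class TraceRecord:
    # Hypothetical flattened trace; field names are illustrative assumptions.
    day: date
    model: str
    prompt_version: str
    latency_ms: float
    cost_usd: float
    is_error: bool
    groundedness: float  # scorer result, assumed to lie in [0, 1]


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def daily_summary(records, model=None, prompt_version=None):
    """Return one row of metrics per day, after applying optional filters."""
    by_day = defaultdict(list)
    for r in records:
        if model is not None and r.model != model:
            continue
        if prompt_version is not None and r.prompt_version != prompt_version:
            continue
        by_day[r.day].append(r)

    summary = {}
    for day in sorted(by_day):
        rows = by_day[day]
        summary[day] = {
            "requests": len(rows),
            "error_rate": sum(r.is_error for r in rows) / len(rows),
            "p95_latency_ms": p95([r.latency_ms for r in rows]),
            "total_cost_usd": sum(r.cost_usd for r in rows),
            "avg_groundedness": mean(r.groundedness for r in rows),
        }
    return summary
```

Calling `daily_summary(records, model="model-a")` would yield the per-day series for a single model. Plotting `avg_groundedness` next to `p95_latency_ms` from the same rows is what lets a quality regression be told apart from a performance one.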
“The dashboard made it obvious that accuracy dropped right after the rollout.”
Customer example
Retool's on-call rotation runs off Braintrust dashboards that track tool-call success rates, context window overflow, model/provider errors, and latency/token usage.
Related Observability terms
- AI observability
- Alert / threshold
- Data flywheel
- Deep search
- Drift
- Error rate
- Feedback loop
- Logs
- Model drift
- Online evaluation (production scoring)
- P50 / P95 / P99 (Percentiles)
- Sampling rate
- Service Level Indicator (SLI)
- Service Level Objective (SLO)
- Time-to-first-token (TTFT)
- Token usage / cost tracking
- Topics