Confidence interval

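/'kah.nfuh.duhns 'ih.nter.vuhl/ (noun)

A statistical range around an estimate (e.g., a mean score) that is expected to contain the true value at a stated confidence level, commonly 95%. Its width communicates uncertainty from sample size and variance: fewer samples or noisier scores produce a wider interval. Confidence intervals help prevent overreacting to noise.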

Example: "The confidence intervals overlapped, so the 'improvement' wasn't statistically meaningful."
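
To make the definition concrete, here is a minimal sketch of computing a confidence interval for a mean score and checking whether two intervals overlap. It assumes a normal approximation (a t-distribution would give a slightly wider interval for small samples), and the function names and score lists are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(scores, confidence=0.95):
    """Return (low, high) for the mean of `scores`.

    Assumption: normal approximation to the sampling distribution of the mean.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to estimate variance")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error shrinks as the sample size grows.
    margin = z * stdev(scores) / math.sqrt(n)
    center = mean(scores)
    return center - margin, center + margin


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical per-example scores from two eval runs.
baseline = [0.62, 0.71, 0.58, 0.66, 0.69, 0.64, 0.60, 0.73]
candidate = [0.65, 0.74, 0.61, 0.70, 0.68, 0.72, 0.63, 0.75]

baseline_ci = mean_confidence_interval(baseline)
candidate_ci = mean_confidence_interval(candidate)
print(baseline_ci, candidate_ci, intervals_overlap(baseline_ci, candidate_ci))
```

Overlap is a conservative rule of thumb: when the same examples are scored in both runs, a paired comparison of per-example differences is usually more sensitive.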
