

Calibration

/ka.luh'bray.shuhn/ (noun)

Adjusting a judge or scorer, or remapping its scores, so that its output aligns with human judgment and stays consistent across runs. Calibration helps scores remain trustworthy as rubrics and models change.

Calibration reduced disagreement between human reviewers and the LLM judge.
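
One way to picture the "mapping scores" form of calibration is a small sketch that fits a linear map from judge scores to human scores and compares disagreement before and after. Everything here is an assumption for illustration: the function names, the 0-to-1 score scale, the sample data, and the choice of a least-squares linear map (other mappings, such as isotonic regression, are equally plausible).

```python
from statistics import mean


def fit_linear_calibration(judge_scores, human_scores):
    """Least-squares fit of human ~ slope * judge + intercept (hypothetical helper)."""
    mx, my = mean(judge_scores), mean(human_scores)
    var = sum((x - mx) ** 2 for x in judge_scores)
    if var == 0:
        # Judge gave a constant score: the best constant prediction is the human mean.
        return 0.0, my
    cov = sum((x - mx) * (y - my) for x, y in zip(judge_scores, human_scores))
    slope = cov / var
    return slope, my - slope * mx


def mean_abs_disagreement(a, b):
    """Average absolute gap between two aligned lists of scores."""
    return mean(abs(x - y) for x, y in zip(a, b))


# Made-up example data: the judge is systematically more lenient than the humans.
judge = [0.9, 0.8, 0.95, 0.7, 0.85]
human = [0.6, 0.4, 0.8, 0.2, 0.5]

slope, intercept = fit_linear_calibration(judge, human)

# Apply the map and clamp to the assumed 0-to-1 score range.
calibrated = [min(1.0, max(0.0, slope * s + intercept)) for s in judge]

before = mean_abs_disagreement(judge, human)
after = mean_abs_disagreement(calibrated, human)
print(f"disagreement before: {before:.3f}, after: {after:.3f}")
```

In this sketch the map is fitted and evaluated on the same five items, which flatters the result; in practice the map would be fitted on one labeled set and checked on a held-out one, and refitted when the rubric or judge model changes.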

