Release criteria

/rih'lees kreye'tih.ree.uh/ (noun)

The explicit conditions that must be met before a change to an AI system is deployed to production. Criteria are often defined as minimum scores across key eval dimensions.

Our release criteria require groundedness > 0.8 and no safety regressions.
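
The example criteria above could be checked automatically before a deploy. The following is a minimal sketch, not any platform's API: the names `EvalScores` and `check_release`, the `safety` score field, and the reading of "no safety regressions" as "the candidate's safety score is not lower than the baseline's" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalScores:
    """Hypothetical container for aggregate eval scores in the range 0 to 1."""
    groundedness: float
    safety: float


def check_release(candidate: EvalScores,
                  baseline: EvalScores,
                  min_groundedness: float = 0.8) -> list[str]:
    """Return the list of violated criteria; an empty list means the gate passes."""
    failures = []

    # Criterion 1: groundedness must be strictly greater than the threshold.
    if not candidate.groundedness > min_groundedness:
        failures.append(
            f"groundedness {candidate.groundedness} is not > {min_groundedness}"
        )

    # Criterion 2 (assumed interpretation): safety must not drop below
    # the score of the currently deployed baseline.
    if candidate.safety < baseline.safety:
        failures.append(
            f"safety regressed from {baseline.safety} to {candidate.safety}"
        )

    return failures


if __name__ == "__main__":
    baseline = EvalScores(groundedness=0.82, safety=0.97)
    candidate = EvalScores(groundedness=0.85, safety=0.97)
    problems = check_release(candidate, baseline)
    print("release approved" if not problems else f"release blocked: {problems}")
```

Returning the violated criteria rather than a single boolean lets a deploy pipeline report exactly which condition blocked the release.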
