Annotation schema

/a.nuh'tay.shuhn 'skee.muh/ (noun) The structured set of labels and fields that reviewers use when annotating traces or dataset records. A good schema keeps annotations consistent across reviewers and makes downstream analysis easier.

“We revised the annotation schema to separate 'ungrounded' from 'incorrect'.”
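
As an illustration only, the sketch below shows one way such a schema could be expressed in code. The names `Verdict`, `Annotation`, `parse_annotation`, and `tally`, along with the fields `record_id`, `reviewer`, and `note`, are hypothetical and not part of any particular product's API. The sketch assumes a schema that keeps 'ungrounded' and 'incorrect' as separate labels, as in the example sentence above, and requires a note whenever a record is not marked correct.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Verdict(Enum):
    """Closed set of labels a reviewer may assign (hypothetical)."""
    CORRECT = "correct"
    INCORRECT = "incorrect"    # the answer is wrong
    UNGROUNDED = "ungrounded"  # the answer is not supported by the given context


@dataclass(frozen=True)
class Annotation:
    """One reviewer's structured judgment on one trace or dataset record."""
    record_id: str
    reviewer: str
    verdict: Verdict
    note: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: any non-correct verdict must carry an explanation.
        if self.verdict is not Verdict.CORRECT and not self.note:
            raise ValueError("a note is required when the verdict is not 'correct'")


def parse_annotation(raw: dict) -> Annotation:
    """Validate a raw dict against the schema; unknown labels raise ValueError."""
    return Annotation(
        record_id=raw["record_id"],
        reviewer=raw["reviewer"],
        verdict=Verdict(raw["verdict"]),
        note=raw.get("note"),
    )


def tally(annotations: Iterable[Annotation]) -> Counter:
    """Count annotations per label, a simple form of downstream analysis."""
    return Counter(a.verdict.value for a in annotations)
```

Because every annotation passes through the same closed set of labels, free-text variants such as "wrong" or "hallucinated" are rejected at entry rather than cleaned up later, which is what makes per-label counts comparable across reviewers.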

Customer example

Portola's SME-driven review process uses a structured annotation schema so feedback stays consistent across reviewers.

Related Evaluation terms

• Absolute scoring
• Agent
• AI eval
• Alignment
• Baseline
• Baseline experiment
• Benchmark
• Calibration
• CI/CD integration
• Coherence
• Confidence interval
• Eval harness
• Eval leakage
• Experiment
• Factuality
• Failure mode
• Faithfulness
• Feedback signal
• Groundedness
• Hallucination
• Inter-annotator agreement (IAA)
• LLM-as-a-judge
• Loop
• Model comparison
• Multimodal
• Non-determinism
• Offline evaluation
• Pairwise evaluation
• Pass@k
• Playground
• Quality gate
• RAG (retrieval-augmented generation)
• RAG evaluation
• Reference-based scoring
• Reference-free scoring
• Regression testing
• Release criteria
• Remote evaluation
• Rubric
• Safety
• Score distribution
• Scorer
• Semantic failure
• Signal-to-noise ratio
• Task (eval task)
• Toxicity score

From the docs

• Evaluate systematically
• Create experiments
• Create scorers
• BTQL POST endpoint payload and response schema
• Defining response schema in prompts.create()

Braintrust is the AI observability and eval platform for production AI. By connecting evals and observability in one workflow, teams at Notion, Stripe, Zapier, Vercel, and Ramp ship quality AI products at scale.