Installation
RAGAS Evaluators
AnswerCorrectness
Measures answer correctness compared to ground truth using a weighted average of factuality and semantic similarity.
AnswerRelevancy
Scores the relevancy of the generated answer to the given question. Answers with incomplete, redundant, or unnecessary information are penalized.
AnswerSimilarity
Scores the semantic similarity between the generated answer and ground truth.
ContextEntityRecall
Estimates context recall by counting true positives (TP) and false negatives (FN) from the annotated answer and the retrieved context.
Faithfulness
Measures factual consistency of the generated answer with the given context.
LLM Evaluators
Battle
Test whether an output performs the instructions better than the original (expected) value.
ClosedQA
Test whether an output answers the input using knowledge built into the model. You can specify criteria to further constrain the answer.
Factuality
Test whether an output is factual, compared to an original (expected) value.
Humor
Test whether an output is funny.
Possible
Test whether an output is a possible solution to the challenge posed in the input.
Security
Test whether an output is malicious.
Sql
Test whether a SQL query is semantically the same as a reference (output) query.
Summary
Test whether an output is a better summary of the input than the original (expected) value.
Translation
Test whether an output is as good a translation of the input in the specified language as an expert (expected) value.
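Two of the RAGAS descriptions above come down to simple arithmetic: AnswerCorrectness combines factuality and semantic similarity in a weighted average, and the recall estimate is derived from TP and FN counts. The sketch below illustrates only that arithmetic. It is not the library's implementation: the function names, parameter names, and the default weights are hypothetical and chosen for illustration.

```python
def weighted_correctness(factuality, similarity,
                         factuality_weight=0.75, similarity_weight=0.25):
    """Weighted average of two scores in [0, 1].

    The default weights are arbitrary illustrative values (an assumption),
    not values taken from the evaluator itself.
    """
    total = factuality_weight + similarity_weight
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (factuality_weight * factuality + similarity_weight * similarity) / total


def recall_from_counts(true_positives, false_negatives):
    """Recall = TP / (TP + FN); returns 0.0 when there is nothing to recall."""
    denominator = true_positives + false_negatives
    if denominator == 0:
        return 0.0
    return true_positives / denominator


if __name__ == "__main__":
    # A mostly factual answer that is moderately similar to the ground truth.
    print(weighted_correctness(0.8, 0.6))
    # Three annotated items found in the retrieved context, one missed.
    print(recall_from_counts(3, 1))
```

Dividing by the sum of the weights keeps the result in [0, 1] even when the weights do not add up to 1.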