Summary

Goal: Configure human review scorers to write multiple question-answer pairs as an array in the expected field.

Features: Human review scorers, expected field configuration, dataset UI refresh.

Configuration Steps

Step 1: Configure human review scorer to write to expected field

In the scorer configuration, set each human review scorer to append its results to the expected field.

Navigate to your project settings
Go to Human review scores configuration
For each scorer, set the output destination to expected field

Step 2: Create multiple human review scorers

Define separate scorers for each question you want to collect during human review.

# Example scorer names:
# - CQCopyAndTone
# - CQFormatAndAesthetics
# - CQImageGenerationText

Step 3: Conduct human review

Complete the human review for each scorer. Each completed review appends its result to the expected field, building up an array.
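
The accumulation described in this step can be pictured with a minimal Python sketch. It is not the product's implementation: `record_review` is a hypothetical helper, the row layout is invented for illustration, and the sketch assumes each scorer's result is stored under its own key in `expected`, matching the shape shown under Expected Result.

```python
from typing import Any, Dict


def record_review(row: Dict[str, Any], question: str, score: float) -> Dict[str, Any]:
    """Return a copy of `row` with one more question-answer pair in `expected`.

    Hypothetical helper: it only models the behavior described in this step,
    where each completed review adds to the expected field instead of
    overwriting earlier answers.
    """
    # Copy the existing answers (or start empty) so the input row is untouched.
    expected = dict(row.get("expected") or {})
    expected[question] = score
    return {**row, "expected": expected}


# Assumed starting point: a row that has not been reviewed yet.
row = {"input": "example input", "expected": None}

for question, score in [
    ("CopyAndTone", 0.75),
    ("FormatAndAesthetics", 0.75),
    ("ImageGenerationText", 1),
]:
    row = record_review(row, question, score)

print(row["expected"])
```

Returning a new row rather than mutating the input keeps each review step easy to reason about; whether the real system behaves this way is not stated in this guide.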

Step 4: Refresh page to see updated scores

Manually refresh the dataset or experiment page to view the updated expected field with all human review scores. Note: Automatic UI refresh after human review completion is tracked as a feature request.

Expected Result

The expected field will contain multiple question-answer pairs, one entry per human review scorer:

{
    "expected": {
        "CopyAndTone": 0.75,
        "FormatAndAesthetics": 0.75,
        "ImageGenerationText": 1,
        ...
    }
}
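
Once a record has this shape, a reviewer or script can check which questions still lack an answer. The following is a minimal sketch under stated assumptions: `missing_questions` and `REQUIRED_QUESTIONS` are hypothetical names, the record is assumed to be available as a JSON string, and the question names are simply copied from the example above.

```python
import json
from typing import Dict, Iterable, List

# Question names taken from the example above; a real project would
# use its own scorer names.
REQUIRED_QUESTIONS = ["CopyAndTone", "FormatAndAesthetics", "ImageGenerationText"]


def missing_questions(record_json: str, required: Iterable[str]) -> List[str]:
    """List the required questions that have no answer in `expected`."""
    record = json.loads(record_json)
    # Treat a missing or null `expected` field as "nothing answered yet".
    expected: Dict[str, float] = record.get("expected") or {}
    return [name for name in required if name not in expected]


# A partially reviewed record: one of the three questions is unanswered.
partial = '{"expected": {"CopyAndTone": 0.75, "ImageGenerationText": 1}}'
print(missing_questions(partial, REQUIRED_QUESTIONS))  # ['FormatAndAesthetics']
```

Because the UI does not refresh on its own, a check like this is only meaningful on data fetched after the review was completed.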

Known Limitations

Dataset UI does not automatically refresh after human review scoring completion
Manual page refresh required to see updated scores
Automatic refresh is being tracked as an internal feature request
