Remote eval dev mode fails with Pydantic parameters

Applies to:

Plan -
Deployment -

Summary

Issue: Running bt eval with --dev fails in the Braintrust UI. The remote eval server appears to start and /list returns 200, but the UI cannot connect or list evals when the Eval() call includes a parameters argument that uses Pydantic models.

Cause: A bug in the bt CLI serializes Pydantic model parameter schemas incorrectly in remote eval dev mode, which prevents the UI from loading the eval.

Resolution: Remove the parameters argument from your Eval() call as a temporary workaround until you are running a bt CLI release that includes the fix.

Resolution steps

If your `Eval()` uses Pydantic models as `parameters`

Step 1: Remove `parameters` from `Eval()`

Comment out or delete the parameters argument from each affected Eval() call.

# Before
Eval(
    PROJECT,
    data=init_dataset(project=PROJECT, name=DATASET),
    task=my_task,
    scores=[my_scorer],
    parameters={"thresholds": ThresholdParams},  # remove this
)

# After
Eval(
    PROJECT,
    data=init_dataset(project=PROJECT, name=DATASET),
    task=my_task,
    scores=[my_scorer],
)
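If you would rather not delete the argument outright, one option is to gate it behind an environment variable so that restoring it later is a one-line change. The sketch below is only an illustration and is not part of the Braintrust SDK or the bt CLI: the `BT_EVAL_USE_PARAMETERS` variable and the `optional_parameters` helper are hypothetical names, and the commented usage assumes the same `Eval()` arguments as the example above.

```python
import os

# Hypothetical environment variable; neither bt nor the Braintrust SDK reads it.
PARAMS_FLAG = "BT_EVAL_USE_PARAMETERS"


def optional_parameters(parameters, env=None):
    """Return {"parameters": ...} only when the flag is set to "1".

    The result is meant to be unpacked into the Eval() call, so an empty
    dict means the parameters argument is omitted entirely.
    """
    env = os.environ if env is None else env
    if env.get(PARAMS_FLAG) == "1":
        return {"parameters": parameters}
    return {}


# Usage sketch (names taken from the example above):
# Eval(
#     PROJECT,
#     data=init_dataset(project=PROJECT, name=DATASET),
#     task=my_task,
#     scores=[my_scorer],
#     **optional_parameters({"thresholds": ThresholdParams}),
# )
```

With the variable unset, the call is equivalent to the "After" form; setting it to 1 restores the "Before" form without editing the file.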

Step 2: Restart the dev server

bt eval your_eval.py --dev --dev-org-name "YourOrg"

Confirm the UI can now connect and list evals.

Step 3: Restore `parameters` after the CLI fix ships

The serialization bug is fixed in bt v0.11.0. Update the CLI to that version or later, then re-add the parameters argument to each affected Eval() call.

bt --version  # confirm v0.11.0 or later
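The version check can also be scripted, for example in a setup script that decides whether to keep the workaround. This is a minimal sketch under stated assumptions: it assumes `bt --version` prints a string containing an `X.Y.Z` version (the exact output format is not confirmed here), and `parse_version`, `has_fix`, and `installed_bt_has_fix` are hypothetical helper names.

```python
import re
import subprocess

MIN_VERSION = (0, 11, 0)  # release named in Step 3 as containing the fix


def parse_version(text):
    """Extract the first X.Y.Z triple from text as a tuple of ints, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def has_fix(version_output, minimum=MIN_VERSION):
    """Return True when the reported version is at least `minimum`."""
    version = parse_version(version_output)
    return version is not None and version >= minimum


def installed_bt_has_fix():
    """Run `bt --version` and compare it to MIN_VERSION.

    Assumes `bt` is on PATH and prints its version to stdout.
    """
    result = subprocess.run(
        ["bt", "--version"], capture_output=True, text=True, check=True
    )
    return has_fix(result.stdout)
```

Calling `installed_bt_has_fix()` returns True when the installed CLI reports v0.11.0 or later. Versions are compared as integer tuples, so 0.11.0 correctly sorts above 0.9.0, which a plain string comparison would get wrong.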
