Expose a remote Eval
To expose an Eval running at a remote URL or on your local machine, pass the --dev flag.
Run `npx braintrust eval path/to/eval.ts --dev` to start the dev server and expose your eval at http://localhost:8300. The dev server accepts two additional flags:
- `--dev-host DEV_HOST`: The host to bind the dev server to. Defaults to `localhost`. Set to `0.0.0.0` to bind to all interfaces.
- `--dev-port DEV_PORT`: The port to bind the dev server to. Defaults to `8300`.
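For example, a minimal eval file to expose might look like the following sketch. The project name, task, and scorer here are illustrative placeholders, not requirements of the CLI:

```typescript
// path/to/eval.ts — a minimal Eval to expose with the dev server.
import { Eval } from "braintrust";

Eval("my-project", {
  // Note: when run from a playground, this dataset is ignored (see
  // Limitations below) and the playground's dataset is used instead.
  data: () => [{ input: "hello", expected: "hello!" }],
  // The task under evaluation: here, a trivial string transform.
  task: async (input: string) => `${input}!`,
  scores: [
    // An illustrative exact-match scorer.
    ({ output, expected }) => ({
      name: "exact_match",
      score: output === expected ? 1 : 0,
    }),
  ],
});
```

The flags compose as you would expect: for instance, `npx braintrust eval path/to/eval.ts --dev --dev-host 0.0.0.0 --dev-port 8400` binds the dev server to all interfaces on port 8400.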
Run a remote eval from a playground
To run a remote eval from a playground, add a new task and select Remote eval from the list. Choose from the evals exposed on localhost or from configured remote sources.
Configure remote eval sources
To configure remote eval source URLs for a project, navigate to Configuration > Remote evals, then select + Remote eval source to add a new source to your project.
Language considerations
When implementing remote evals, be aware of these language-specific patterns:

| Feature | TypeScript | Python |
|---|---|---|
| Parameter validation | Zod schemas (e.g., `z.string()`, `z.boolean()`) | Optional: Pydantic models with a single value field |
| Parameter access | Direct access (e.g., `parameters.prefix`) | Dictionary access (e.g., `parameters["prefix"]` or `parameters.get("prefix")`) |
| Validation approach | Automatic via Zod | Optional via Pydantic validators or manual in the task |
| Prompt format | `messages` array | Nested `prompt` and `options` objects |
| Async handling | `async`/`await` with promises | `async`/`await` with coroutines |
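To make the TypeScript column concrete, here is a sketch of a parameterized eval. The `parameters` field and the task's second argument are assumptions drawn from the patterns above; check the SDK reference for the exact signature:

```typescript
import { Eval } from "braintrust";
import { z } from "zod";

Eval("my-project", {
  data: () => [{ input: "What is 2 + 2?" }],
  // Parameter validation: Zod schemas, checked automatically.
  parameters: {
    prefix: z.string().default("Answer: "),
    uppercase: z.boolean().default(false),
  },
  // Async handling: async/await with promises. Receiving validated
  // parameters via the task's second argument is an assumption here.
  task: async (input: string, { parameters }) => {
    // Parameter access: direct property access.
    const text = `${parameters.prefix}${input}`;
    return parameters.uppercase ? text.toUpperCase() : text;
  },
  scores: [],
});
```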
Limitations
- The dataset defined in your remote eval will be ignored.
- Scorers defined in remote evals will be concatenated with playground scorers.
- Remote evals are supported in both TypeScript and Python.