# Customizing experiment names in the SDK

export const plans_0 = "Any"

export const deployments_0 = "Any"

export const data_plane_version_0 = undefined

export const use_case_0 = "Use case - Users who want to set a custom or dynamic name for an experiment run via the SDK rather than accepting the default auto-generated name"

<Note>
  **Applies to:**

  * Plan - {plans_0}
  * Deployment - {deployments_0}
  * {data_plane_version_0}
  * {use_case_0}
</Note>

## Summary

**Issue:** The `Name` column in experiment results always shows `"eval"` and cannot be overridden through `Eval()` or `EvalCase` parameters.

**Cause:** The SDK hardcodes `name="eval"` when creating the root span for each row in `Eval()`.

**Resolution:** Use `tags` to differentiate rows, or apply a monkey-patch workaround for full `Name` column control.

***

## Resolution steps

### Option 1: Use tags (recommended)

Add a `tags` field to each `EvalCase` to identify rows without changing the `Name` column.

```python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
EvalCase(
    input="my input",
    expected="my output",
    tags=["my-custom-label"]
)
```
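
When rows come from a data function, tags can be derived from each row instead of being written by hand. The following is a minimal sketch, assuming rows are plain dicts and that a `tags` key on a dict row is treated like the `tags` field on `EvalCase`. The `label_rows` and `topic_label` helpers are hypothetical names used only for illustration.

```python
def label_rows(rows, label_fn):
    """Return copies of the rows with a one-element `tags` list from label_fn."""
    return [{**row, "tags": [label_fn(row)]} for row in rows]


def topic_label(row):
    # Hypothetical labelling rule: inputs containing a digit count as "math".
    return "math" if any(ch.isdigit() for ch in row["input"]) else "general"


rows = [
    {"input": "What is 2+2?", "expected": "4"},
    {"input": "What is the capital of France?", "expected": "Paris"},
]

# The original rows are left unmodified; tagged holds the labelled copies.
tagged = label_rows(rows, topic_label)
```

A function that returns `tagged` could then be passed as the `data` argument to `Eval()`.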

### Option 2: Monkey-patch `start_span` (unsupported)

This workaround replaces the hardcoded `name="eval"` with a per-row value. Use it with caution: it is not officially supported, it depends on the private `_run_evaluator_internal_impl` function, and it may break with SDK updates.

```python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
"""
Demonstrates a monkey-patch to set per-row span names in the Braintrust
eval framework, so the "Name" column in the experiment UI shows a
different value for each row instead of the hardcoded "eval".

The key insight: when the framework calls experiment.start_span(),
it passes the row's `input` and `expected` in the kwargs, so we can
derive a name from the data itself.
"""

import braintrust.framework as _fw

_orig_impl = _fw._run_evaluator_internal_impl

_name_fn = None


def set_eval_name_fn(fn):
    """Register a function that receives (input, expected) and returns a span name."""
    global _name_fn
    _name_fn = fn


async def _patched_impl(experiment, evaluator, *args, **kwargs):
    if experiment is not None:
        _orig_start = experiment.start_span

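        def _patched_start(*a, **kw):
            # Only rename the per-row root span; other spans keep their names.
            if kw.get("name") == "eval" and _name_fn is not None:
                # str() guards against a name function that returns a non-string.
                kw["name"] = "Custom name: " + str(_name_fn(
                    kw.get("input"),
                    kw.get("expected"),
                ))
            return _orig_start(*a, **kw)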

        experiment.start_span = _patched_start
    return await _orig_impl(experiment, evaluator, *args, **kwargs)


_fw._run_evaluator_internal_impl = _patched_impl

# ── Eval definition ──────────────────────────────────────────────────

from braintrust import Eval


def data():
    return [
        {"input": "What is 2+2?", "expected": "4"},
        {"input": "What is the capital of France?", "expected": "Paris"},
        {"input": "What color is the sky?", "expected": "Blue"},
    ]


def task(input, hooks):
    answers = {
        "What is 2+2?": "4",
        "What is the capital of France?": "Paris",
        "What color is the sky?": "Blue",
    }
    return answers.get(input, "I don't know")


def exact_match(input, output, expected):
    return output.strip().lower() == expected.strip().lower()


set_eval_name_fn(lambda input, expected: input[:40])

Eval(
    "pedro-project1",  # project name; replace with your own
    data=data,
    task=task,
    scores=[exact_match],
    experiment_name="per-row-name-demo",
)
```
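
The wrapping technique itself can be tried without the SDK or an API key. The sketch below is an illustration only: `FakeExperiment` is a hypothetical stand-in that records span names, not a Braintrust class, and `wrap_start_span` is a hypothetical helper that applies the same renaming rule as the patch above.

```python
class FakeExperiment:
    """Hypothetical stand-in for an experiment object (not part of the SDK)."""

    def __init__(self):
        self.started = []

    def start_span(self, *args, **kwargs):
        # Record the name each span was started with.
        self.started.append(kwargs.get("name"))
        return kwargs


def wrap_start_span(experiment, name_fn):
    """Replace experiment.start_span with a wrapper that renames "eval" spans."""
    original = experiment.start_span

    def patched(*args, **kwargs):
        if kwargs.get("name") == "eval":
            kwargs["name"] = str(name_fn(kwargs.get("input"), kwargs.get("expected")))
        return original(*args, **kwargs)

    experiment.start_span = patched


exp = FakeExperiment()
wrap_start_span(exp, lambda input, expected: input[:40])

exp.start_span(name="eval", input="What is 2+2?", expected="4")  # renamed
exp.start_span(name="task", input="ignored")  # left as-is
```

Only spans started with `name="eval"` are renamed, so nested spans created inside the task keep their own names.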

***

## Notes

* Native experiment row name customization is not supported as of this writing.
* The `metadata` field on `EvalCase` is another option for per-row identification if `tags` do not meet your needs.
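
As a rough illustration of the `metadata` approach, the sketch below attaches a per-row label under a `row_name` key. It assumes rows are plain dicts; both the `row_name` key and the `with_row_names` helper are hypothetical choices, not SDK conventions.

```python
def with_row_names(rows, name_fn):
    """Return copies of the rows with a `row_name` entry added to `metadata`."""
    named = []
    for row in rows:
        # Copy any existing metadata so the input rows are not mutated.
        metadata = dict(row.get("metadata") or {})
        metadata["row_name"] = name_fn(row["input"], row.get("expected"))
        named.append({**row, "metadata": metadata})
    return named


rows = [
    {"input": "What is 2+2?", "expected": "4"},
    {"input": "What color is the sky?", "expected": "Blue", "metadata": {"source": "demo"}},
]

named_rows = with_row_names(rows, lambda input, expected: input[:40])
```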