API reference - Braintrust

The .NET SDK is in beta. APIs can change between minor versions.

This page covers the key APIs in the Braintrust .NET SDK. For setup, see the Quickstart. For the complete API, browse the source on GitHub. The .NET SDK does not support prompts.

Tracing

Tracing records what your application does as spans you can inspect in Braintrust. The .NET SDK is built on OpenTelemetry: you instrument AI calls by wrapping your provider clients (see Wrappers), and you trace your own application code with the ActivitySource Braintrust provides. The APIs below set up that pipeline and let you flush and link to your traces.

`Braintrust.Get`

Gets the global Braintrust instance, creating it on the first call and returning that same instance afterward. This is your entry point to the SDK: it reads configuration and sets up the OpenTelemetry pipeline that exports spans to Braintrust. Call Get() with no arguments to configure from environment variables, which is the most common setup. To configure it in code, build a BraintrustConfig and pass it to Get(config) (see Configuration).

using Braintrust.Sdk;

var braintrust = Braintrust.Sdk.Braintrust.Get();

Returns: Braintrust

Overloads:

Get(): reads BraintrustConfig.FromEnvironment() and auto-manages OpenTelemetry.
Get(BraintrustConfig config, bool autoManageOpenTelemetry = true): uses the supplied config. Set autoManageOpenTelemetry to false to attach Braintrust to an OpenTelemetry pipeline you manage yourself.
Of(BraintrustConfig config, bool autoManageOpenTelemetry = true): creates a new, non-global instance.
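The non-default overloads might be used roughly as follows. This is a sketch, not a definitive setup: it assumes the SDK package is referenced, that Of is a static factory on the same class as Get (as the list above suggests), and it uses placeholder values for the API key and project name.

```csharp
using Braintrust.Sdk;
using Braintrust.Sdk.Config;

// Placeholder values; substitute your own key and project.
var config = BraintrustConfig.Of(
    ("BRAINTRUST_API_KEY", "your-api-key"),
    ("BRAINTRUST_DEFAULT_PROJECT_NAME", "My project"));

// Global instance that leaves the OpenTelemetry pipeline for you to manage.
var global = Braintrust.Sdk.Braintrust.Get(config, autoManageOpenTelemetry: false);

// Assumption: Of is called the same way as Get and returns a separate,
// non-global instance built from the supplied config.
var isolated = Braintrust.Sdk.Braintrust.Of(config);
```

A non-global instance may be convenient in tests, where each test can carry its own configuration without touching the shared one.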

`Braintrust.GetActivitySource`

Returns the System.Diagnostics.ActivitySource Braintrust uses to create spans. Pass it to the provider wrap methods, and use it to wrap your own application code so traced AI calls nest under your operations.

var activitySource = braintrust.GetActivitySource();

Returns: System.Diagnostics.ActivitySource

Braintrust traces wrapped provider calls but not the code around them. Wrapping that code in your own span shows its structure in the trace and nests any traced AI calls underneath it.

var activitySource = braintrust.GetActivitySource();

using (var activity = activitySource.StartActivity("process-request"))
{
    activity?.SetTag("user.id", userId);
    var output = await AnswerQuestion("Write a haiku about evals");
    activity?.SetTag("app.output", output);
}
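To see how nesting works independently of Braintrust, the sketch below uses only System.Diagnostics. The source name "demo-source" and the span names are hypothetical, and a local ActivityListener stands in for the exporter the SDK would normally register. An activity started while another is current becomes its child.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source; in a real app this would come from GetActivitySource().
using var source = new ActivitySource("demo-source");
var stopped = new List<string>();

// Without a listener that samples the source, StartActivity returns null.
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "demo-source",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllData,
    ActivityStopped = activity =>
    {
        var parent = activity.Parent?.OperationName ?? "(root)";
        stopped.Add($"{activity.OperationName} <- {parent}");
    }
};
ActivitySource.AddActivityListener(listener);

using (var outer = source.StartActivity("process-request"))
{
    // Started while "process-request" is current, so it nests under it.
    using (var inner = source.StartActivity("llm-call"))
    {
        inner?.SetTag("app.step", "generate");
    }
}

foreach (var line in stopped)
{
    Console.WriteLine(line);
}
// Prints:
// llm-call <- process-request
// process-request <- (root)
```

The child stops first because its using block closes first, which is why it appears first in the output.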

`Braintrust.GetProjectUriAsync`

Builds the Braintrust UI URL for the configured organization and project. Use it to link from your own app or logs straight to the project in Braintrust.

var uri = await braintrust.GetProjectUriAsync();

Returns: Task<Uri>

`BraintrustTracing.ForceFlush`

Forces any pending spans to be sent to Braintrust, returning once they’re flushed or the timeout elapses. You usually don’t need this. When the SDK manages OpenTelemetry (the default), it registers a process-exit hook that flushes on termination, so even short-lived processes export correctly with no extra work. Call ForceFlush only when you opt out of auto-management (autoManageOpenTelemetry: false) and wire Braintrust into your own tracer provider, where that hook isn’t registered.

using Braintrust.Sdk.Trace;

BraintrustTracing.ForceFlush();

Returns: bool (true when the flush completes within the timeout, or when no tracer provider exists)

Parameters:

timeoutMilliseconds (int): maximum time to wait for the flush. Defaults to 10000.
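If you do call ForceFlush yourself, you may want a shorter timeout and a check on the result. The sketch below assumes the parameter can be passed by the name shown above and treats a false return as a flush that did not finish in time.

```csharp
using System;
using Braintrust.Sdk.Trace;

// Wait at most 5 seconds instead of the default 10000 ms.
bool flushed = BraintrustTracing.ForceFlush(timeoutMilliseconds: 5000);

if (!flushed)
{
    // Presumably the timeout elapsed before all pending spans were exported.
    Console.Error.WriteLine("Braintrust flush timed out; some spans may be missing.");
}
```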

Evaluations

An evaluation runs your task over a set of cases, scores each output, and reports the results, which is how you measure quality and catch regressions as you change prompts or models. These APIs build and run evaluations from your C# code.

`Braintrust.EvalBuilder`

Defines an evaluation in code for input type TInput and output type TOutput: you give it cases, a task function, and one or more scorers, then call BuildAsync() to get an Eval. Call RunAsync() on it to run the task over every case, score the outputs, and log an experiment.

using Braintrust.Sdk.Eval;

var eval = await braintrust
    .EvalBuilder<string, string>()
    .Name("food-classifier")
    .Tags("dotnet-sdk")
    .Metadata(new Dictionary<string, object> { { "model", "gpt-4o-mini" } })
    .Cases(
        DatasetCase.Of("strawberry", "fruit"),
        DatasetCase.Of("asparagus", "vegetable")
    )
    .TaskFunction(GetFoodType)
    .Scorers(new FunctionScorer<string, string>("exact_match",
        (expected, actual) => expected == actual ? 1.0 : 0.0))
    .BuildAsync();

var result = await eval.RunAsync();
Console.WriteLine(result.CreateReportString());

Returns: Eval<TInput, TOutput>.Builder

Builder methods (each returns the Builder for chaining, except BuildAsync()):

Name(string) → Builder: experiment name.
Cases(params DatasetCase[]) → Builder: inline evaluation cases. Provide this or Dataset().
Dataset(IDataset) → Builder: run against an in-memory dataset instead of inline Cases.
TaskFunction(Func<TInput, TOutput>) → Builder (required): the task that produces the output to score. An async overload taking Func<TInput, Task<TOutput>> is also available.
Scorers(params IScorer[]) → Builder: scorers to apply to each case. Required unless Classifiers is set.
Classifiers(params IClassifier[]) → Builder: classifiers to apply to each case. Required unless Scorers is set.
Tags(params string[]) → Builder: experiment-level tags.
Metadata(IDictionary<string, object>) → Builder: experiment-level metadata.
MaxConcurrency(int?) → Builder: cap on how many cases run in parallel.
BuildAsync() → Task<Eval<TInput, TOutput>>: builds the eval.

Run the built eval with eval.RunAsync(), which returns an EvalResult. Call result.CreateReportString() for a human-readable CLI summary.
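The async TaskFunction overload and MaxConcurrency can be combined as in the sketch below. GetFoodTypeAsync is a hypothetical stand-in for a real model call, and the example assumes BRAINTRUST_API_KEY is set in the environment.

```csharp
using System;
using System.Threading.Tasks;
using Braintrust.Sdk.Eval;

var braintrust = Braintrust.Sdk.Braintrust.Get();

// Hypothetical async task: stands in for a model or service call.
static async Task<string> GetFoodTypeAsync(string food)
{
    await Task.Delay(10); // simulate I/O latency
    return food == "strawberry" ? "fruit" : "vegetable";
}

var eval = await braintrust
    .EvalBuilder<string, string>()
    .Name("food-classifier-async")
    .Cases(
        DatasetCase.Of("strawberry", "fruit"),
        DatasetCase.Of("asparagus", "vegetable"))
    .TaskFunction(GetFoodTypeAsync) // binds to the Func<TInput, Task<TOutput>> overload
    .Scorers(new FunctionScorer<string, string>("exact_match",
        (expected, actual) => expected == actual ? 1.0 : 0.0))
    .MaxConcurrency(2)              // run at most two cases at a time
    .BuildAsync();

var result = await eval.RunAsync();
Console.WriteLine(result.CreateReportString());
```

A low MaxConcurrency value can help when the task calls a rate-limited provider.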

`IScorer<TInput, TOutput>`

Measures how good your task’s output is. Implement Score, which receives a TaskResult and returns one or more Score values. Implement the interface when you need full control over scoring.

sealed class ExactMatchScorer : IScorer<string, string>
{
    public string Name => "exact_match";

    public Task<IReadOnlyList<Score>> Score(TaskResult<string, string> taskResult)
    {
        var isMatch = taskResult.DatasetCase.Expected == taskResult.Result;
        IReadOnlyList<Score> scores = [new Score(Name, isMatch ? 1.0 : 0.0)];
        return Task.FromResult(scores);
    }
}

`FunctionScorer<TInput, TOutput>`

Creates a scorer from a function, without defining a class. Reach for it when your scoring logic is simple, such as comparing the output to the expected value.

var scorer = new FunctionScorer<string, string>(
    "accuracy",
    (expected, actual) => actual.Contains(expected) ? 1.0 : 0.0);

Constructors:

FunctionScorer(string name, Func<TOutput, TOutput, double> scorerFn): synchronous scorer, scoring from (expected, actual).
FunctionScorer(string name, Func<TOutput, TOutput, Task<double>> scorerFn): asynchronous scorer.
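The asynchronous constructor might be used as in this sketch. The scorer name is made up, Task.Yield stands in for real asynchronous work such as calling a grading service, and FunctionScorer is assumed to live in Braintrust.Sdk.Eval alongside the builder.

```csharp
using System;
using System.Threading.Tasks;
using Braintrust.Sdk.Eval;

// An async lambda returns Task<double>, so it selects the asynchronous constructor.
var scorer = new FunctionScorer<string, string>(
    "case_insensitive_match",
    async (expected, actual) =>
    {
        await Task.Yield(); // placeholder for real asynchronous work
        return string.Equals(expected, actual, StringComparison.OrdinalIgnoreCase)
            ? 1.0
            : 0.0;
    });
```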

`IClassifier<TInput, TOutput>` and `FunctionClassifier<TInput, TOutput>`

Categorize a task’s output instead of scoring it numerically. A classifier returns structured Classification items (an id, plus optional label and metadata) rather than a number, which is useful for labeling outputs by topic, intent, or failure type. Implement IClassifier’s Classify method, or create one from a function with FunctionClassifier.

Datasets

A dataset is the set of cases an evaluation runs against. The .NET SDK supports in-memory datasets: define cases inline with DatasetCase.Of, and optionally group them into a dataset with Dataset.Of.

`DatasetCase.Of`

A dataset case is a single example: an input paired with the output you expect. Use Of(input, expected) for the common case, or the longer overloads to attach case-level tags and metadata.

DatasetCase.Of("strawberry", "fruit");

DatasetCase.Of(
    "banana",
    "fruit",
    new List<string> { "tropical" },
    new Dictionary<string, object> { { "category", "tropical-fruit" } });

Returns: DatasetCase<TInput, TOutput>

Overloads:

DatasetCase.Of(TInput input, TOutput expected): input and expected output.
DatasetCase.Of(TInput input, TOutput expected, IReadOnlyList<string> tags): adds case-level tags.
DatasetCase.Of(TInput input, TOutput expected, IReadOnlyList<string> tags, IReadOnlyDictionary<string, object> metadata): adds case-level tags and metadata.

`Dataset.Of`

Groups several cases into an in-memory dataset you can hand to an evaluation with the builder’s Dataset(...) method, as an alternative to passing cases inline. Build one with Dataset.Of(params DatasetCase[]).

using Braintrust.Sdk.Eval;

var dataset = Dataset.Of(
    DatasetCase.Of("strawberry", "fruit"),
    DatasetCase.Of("asparagus", "vegetable"));

Returns: IDataset<TInput, TOutput>
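Handing the dataset to an evaluation could look like the sketch below, which swaps the builder's Cases(...) call for Dataset(...). GetFoodType is a hypothetical task function, and the example assumes BRAINTRUST_API_KEY is set in the environment.

```csharp
using System;
using Braintrust.Sdk.Eval;

var braintrust = Braintrust.Sdk.Braintrust.Get();

// Hypothetical task under evaluation.
static string GetFoodType(string food) =>
    food == "strawberry" ? "fruit" : "vegetable";

var dataset = Dataset.Of(
    DatasetCase.Of("strawberry", "fruit"),
    DatasetCase.Of("asparagus", "vegetable"));

var eval = await braintrust
    .EvalBuilder<string, string>()
    .Name("food-classifier")
    .Dataset(dataset) // used instead of inline Cases(...)
    .TaskFunction(GetFoodType)
    .Scorers(new FunctionScorer<string, string>("exact_match",
        (expected, actual) => expected == actual ? 1.0 : 0.0))
    .BuildAsync();

var result = await eval.RunAsync();
Console.WriteLine(result.CreateReportString());
```

Keeping the dataset in its own variable lets several evaluations share the same cases.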

Wrappers

The .NET SDK instruments AI provider clients by wrapping them. Each wrapper takes the ActivitySource from GetActivitySource() and returns an instrumented client whose calls are traced to Braintrust. For setup and the full list of supported integrations, see .NET SDK integrations.

`BraintrustOpenAI.WrapOpenAI`

Creates an instrumented OpenAIClient. Every call on the returned client is traced.

using Braintrust.Sdk.OpenAI;

var client = BraintrustOpenAI.WrapOpenAI(activitySource, apiKey);

Returns: OpenAIClient

Parameters:

activitySource (ActivitySource, required): source from GetActivitySource().
openAIApiKey (string, required): OpenAI API key.
options (OpenAIClientOptions): optional OpenAI client options. Defaults to null.

`WithBraintrust` (Anthropic)

Wraps an Anthropic client so every Messages.Create call (including streaming) emits a span. It’s an extension method, so you call it on the client itself.

using Braintrust.Sdk.Anthropic;

var client = new AnthropicClient().WithBraintrust();

Returns: IAnthropicClient

Overloads:

WithBraintrust(bool captureMessageContent = true): uses the global Braintrust instance.
WithBraintrust(ActivitySource activitySource, bool captureMessageContent = true): uses the supplied ActivitySource.

Set captureMessageContent to false to omit message content from spans.

`BraintrustAzureOpenAI.WrapAzureOpenAI`

Creates an instrumented AzureOpenAIClient, reusing the OpenAI instrumentation pipeline.

using Braintrust.Sdk.AzureOpenAI;

var client = BraintrustAzureOpenAI.WrapAzureOpenAI(activitySource, endpoint, apiKey);

Returns: AzureOpenAIClient

Overloads:

WrapAzureOpenAI(ActivitySource activitySource, Uri endpoint, string apiKey, ...): API key authentication.
WrapAzureOpenAI(ActivitySource activitySource, Uri endpoint, TokenCredential credential, ...): Microsoft Entra ID authentication.

You can also call .WithBraintrust(activitySource) on a client you already created.

Agent Framework tracing

Traces Microsoft Agent Framework agents. Add the middleware to a ChatClientBuilder, then wrap your agent.

using Braintrust.Sdk.AgentFramework;

var chatClient = builder.UseBraintrustTracing(activitySource).Build();
var agent = new ChatClientAgent(chatClient, ...).WithBraintrustAgentTracing(activitySource);

Methods:

UseBraintrustTracing(activitySource) → ChatClientBuilder: traces both LLM calls and function calls.
UseBraintrustLLMTracing(activitySource) → ChatClientBuilder: traces LLM calls only.
UseBraintrustFunctionTracing(activitySource) → ChatClientBuilder: traces function (tool) calls only.
WithBraintrustAgentTracing(activitySource) → AIAgent: wraps an agent so each run is captured as a span.

The tracing methods accept a captureMessageContent flag, and function tracing accepts a captureToolArguments flag, to turn off content capture.

Configuration

Configure the SDK with environment variables, or programmatically with BraintrustConfig.Of, then pass the config to Braintrust.Get(config).

using Braintrust.Sdk;
using Braintrust.Sdk.Config;

var config = BraintrustConfig.Of(
    ("BRAINTRUST_API_KEY", "your-api-key"),
    ("BRAINTRUST_DEFAULT_PROJECT_NAME", "My project"));

var braintrust = Braintrust.Sdk.Braintrust.Get(config);

Environment variables

BRAINTRUST_API_KEY (required): Braintrust API key. Also discoverable from a .env.braintrust file.
BRAINTRUST_DEFAULT_PROJECT_NAME: project that traced spans route to. Defaults to default-dotnet-project.
BRAINTRUST_DEFAULT_PROJECT_ID: project UUID. Takes precedence over the project name.
BRAINTRUST_API_URL: Braintrust API URL. Defaults to https://api.braintrust.dev.
BRAINTRUST_APP_URL: Braintrust app URL, used for permalinks. Defaults to https://www.braintrust.dev.
BRAINTRUST_DEBUG: enable debug logging. Defaults to false.
BRAINTRUST_ENABLE_TRACE_CONSOLE_LOG: print spans to the console. Defaults to false.
BRAINTRUST_REQUEST_TIMEOUT: request timeout in seconds. Defaults to 30.
BRAINTRUST_TRACES_PATH: OpenTelemetry traces endpoint path. Defaults to /otel/v1/traces.
BRAINTRUST_LOGS_PATH: OpenTelemetry logs endpoint path. Defaults to /otel/v1/logs.
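The same settings can plausibly be supplied in code. The sketch below assumes BraintrustConfig.Of accepts every key listed above as a string pair, which this page only demonstrates for the API key and project name. All values are placeholders.

```csharp
using Braintrust.Sdk;
using Braintrust.Sdk.Config;

// Assumption: Of accepts the same keys as the environment variables above.
var config = BraintrustConfig.Of(
    ("BRAINTRUST_API_KEY", "your-api-key"),
    ("BRAINTRUST_DEFAULT_PROJECT_ID", "your-project-uuid"), // takes precedence over the name
    ("BRAINTRUST_REQUEST_TIMEOUT", "60"),                    // seconds
    ("BRAINTRUST_DEBUG", "true"));

var braintrust = Braintrust.Sdk.Braintrust.Get(config);
```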
