Python SDK - Braintrust

For complete Python documentation, examples, and API reference, please see the Braintrust SDK on GitHub.

Installation

pip install braintrust

Functions

Eval

A function you can use to define an evaluator. This is a convenience wrapper around the Evaluator class.

name

str

The name of the evaluator. This corresponds to a project name in Braintrust.

data

EvalData[Input, Output]

Returns an iterator over the evaluation dataset. Each element of the iterator should be a EvalCase.

task

EvalTask[Input, Output]

Runs the evaluation task on a single input. The hooks object can be used to add metadata to the evaluation.

scores

Sequence[EvalScorer[Input, Output]]

A list of scorers to evaluate the results of the task. Each scorer can be a Scorer object or a function

experiment_name

Optional[str]

(Optional) Experiment name. If not specified, a name will be generated automatically.

trial_count

int

The number of times to run the evaluator per input. This is useful for evaluating applications that

metadata

Optional[Metadata]

(Optional) A dictionary with additional data about the test example, model outputs, or just about

is_public

bool

(Optional) Whether the experiment should be public. Defaults to false.

update

bool

reporter

Optional[ReporterDef[Input, Output, EvalReport]]

(Optional) A reporter that takes an evaluator and its result and returns a report.

timeout

Optional[float]

(Optional) The duration, in seconds, after which to time out the evaluation.

max_concurrency

Optional[int]

project_id

Optional[str]

(Optional) If specified, uses the given project ID instead of the evaluator’s name to identify the project.

base_experiment_name

Optional[str]

An optional experiment name to use as a base. If specified, the new experiment will be

base_experiment_id

Optional[str]

An optional experiment id to use as a base. If specified, the new experiment will be

git_metadata_settings

Optional[GitMetadataSettings]

Optional settings for collecting git metadata. By default, will collect all git metadata fields allowed in org-level settings.

git_metadata_settings.collect

Literal['all', 'none', 'some']

required

git_metadata_settings.fields

NotRequired[Sequence[Literal['commit', 'branch', 'tag', 'dirty', 'author_name', 'author_email', 'commit_message', 'commit_time', 'git_diff']]]

repo_info

Optional[RepoInfo]

Optionally explicitly specify the git metadata for this experiment. This takes precedence over git_metadata_settings if specified.

repo_info.commit

NotRequired[Optional[str]]

repo_info.branch

NotRequired[Optional[str]]

repo_info.tag

NotRequired[Optional[str]]

repo_info.dirty

NotRequired[Optional[bool]]

repo_info.author_name

NotRequired[Optional[str]]

repo_info.author_email

NotRequired[Optional[str]]

repo_info.commit_message

NotRequired[Optional[str]]

repo_info.commit_time

NotRequired[Optional[str]]

repo_info.git_diff

NotRequired[Optional[str]]

error_score_handler

Optional[ErrorScoreHandler]

Optionally supply a custom function to specifically handle score values when tasks or scoring functions have errored.

description

Optional[str]

An optional description for the experiment.

summarize_scores

bool

Whether to summarize the scores of the experiment after it has run.

no_send_logs

bool

Do not send logs to Braintrust. When True, the evaluation runs locally

parameters

Optional[EvalParameters]

A set of parameters that will be passed to the evaluator.

on_start

Optional[Callable[[ExperimentSummary], None]]

An optional callback that will be called when the evaluation starts. It receives the

stream

Optional[Callable[[SSEProgressEvent], None]]

A function that will be called with progress events, which can be used to

parent

Optional[str]

If specified, instead of creating a new experiment object, the Eval() will populate

state

Optional[BraintrustState]

Optional BraintrustState to use for the evaluation. If not specified, the global login state will be used.

Reporter

A function you can use to define a reporter. This is a convenience wrapper around the ReporterDef class.

name

str

The name of the reporter.

report_eval

Callable[[Evaluator[Input, Output], EvalResultWithSummary[Input, Output], bool, bool], Union[EvalReport, Awaitable[EvalReport]]]

return str(result.summary)

report_run

Callable[[List[EvalReport], bool, bool], Union[bool, Awaitable[bool]]]

return True

current_experiment

Returns the currently-active experiment (set by braintrust.init(...)). Returns None if no current experiment has been set.

current_logger

Returns the currently-active logger (set by braintrust.init_logger(...)). Returns None if no current logger has been set.

current_span

Return the currently-active span for logging (set by running a span under a context manager). If there is no active span, returns a no-op span object, which supports the same interface as spans but does no logging.

flush

Flush any pending rows to the server.

get_prompt_versions

Get the versions for a specific prompt.

project_id

str

The ID of the project to query

prompt_id

str

The ID of the prompt to get versions for

get_span_parent_object

Mainly for internal use. Return the parent object for starting a span in a global context. Applies precedence: current span > propagated parent string > experiment > logger.

parent

Optional[str]

state

Optional[BraintrustState]

init

project

Optional[str]

The name of the project to create the experiment in. Must specify at least one of project or project_id.

experiment

Optional[str]

The name of the experiment to create. If not specified, a name will be generated automatically.

description

Optional[str]

(Optional) An optional description of the experiment.

dataset

Optional['Dataset']

(Optional) A dataset to associate with the experiment. The dataset must be initialized with braintrust.init_dataset before passing

open

bool

If the experiment already exists, open it in read-only mode. Throws an error if the experiment does not already exist.

base_experiment

Optional[str]

An optional experiment name to use as a base. If specified, the new experiment will be summarized and compared to this experiment. Otherwise, it will pick an experiment by finding the closest ancestor on the default (e.g. main) branch.

is_public

bool

An optional parameter to control whether the experiment is publicly visible to anybody with the link or privately visible to only members of the organization. Defaults to private.

app_url

Optional[str]

The URL of the Braintrust App. Defaults to https://www.braintrust.dev.

api_key

Optional[str]

The API key to use. If the parameter is not specified, will try to use the BRAINTRUST_API_KEY environment variable. If no API

org_name

Optional[str]

(Optional) The name of a specific organization to connect to. This is useful if you belong to multiple.

metadata

Optional[Metadata]

(Optional) a dictionary with additional data about the test example, model outputs, or just about anything else that’s relevant, that you can use to help find and analyze examples later. For example, you could log the prompt, example’s id, or anything else that would be useful to slice/dice later. The values in metadata can be any JSON-serializable type, but its keys must be strings.

git_metadata_settings

Optional[GitMetadataSettings]

(Optional) Settings for collecting git metadata. By default, will collect all git metadata fields allowed in org-level settings.

git_metadata_settings.collect

Literal['all', 'none', 'some']

required

git_metadata_settings.fields

NotRequired[Sequence[Literal['commit', 'branch', 'tag', 'dirty', 'author_name', 'author_email', 'commit_message', 'commit_time', 'git_diff']]]

set_current

bool

If true (the default), set the global current-experiment to the newly-created one.

update

Optional[bool]

If the experiment already exists, continue logging to it. If it does not exist, creates the experiment with the specified arguments.

project_id

Optional[str]

The id of the project to create the experiment in. This takes precedence over project if specified.

base_experiment_id

Optional[str]

An optional experiment id to use as a base. If specified, the new experiment will be summarized and compared to this. This takes precedence over base_experiment if specified.

repo_info

Optional[RepoInfo]

(Optional) Explicitly specify the git metadata for this experiment. This takes precedence over git_metadata_settings if specified.

repo_info.commit

NotRequired[Optional[str]]

repo_info.branch

NotRequired[Optional[str]]

repo_info.tag

NotRequired[Optional[str]]

repo_info.dirty

NotRequired[Optional[bool]]

repo_info.author_name

NotRequired[Optional[str]]

repo_info.author_email

NotRequired[Optional[str]]

repo_info.commit_message

NotRequired[Optional[str]]

repo_info.commit_time

NotRequired[Optional[str]]

repo_info.git_diff

NotRequired[Optional[str]]

state

Optional[BraintrustState]

(Optional) A BraintrustState object to use. If not specified, will use the global state. This is for advanced use only.

init_dataset

Create a new dataset in a specified project. If the project does not exist, it will be created.

project

Optional[str]

name

Optional[str]

The name of the dataset to create. If not specified, a name will be generated automatically.

description

Optional[str]

An optional description of the dataset.

version

Optional[Union[str, int]]

An optional version of the dataset (to read). If not specified, the latest version will be used.

app_url

Optional[str]

The URL of the Braintrust App. Defaults to https://www.braintrust.dev.

api_key

Optional[str]

The API key to use. If the parameter is not specified, will try to use the BRAINTRUST_API_KEY environment variable. If no API

org_name

Optional[str]

(Optional) The name of a specific organization to connect to. This is useful if you belong to multiple.

project_id

Optional[str]

The id of the project to create the dataset in. This takes precedence over project if specified.

metadata

Optional[Metadata]

(Optional) a dictionary with additional data about the dataset. The values in metadata can be any JSON-serializable type, but its keys must be strings.

use_output

bool

(Deprecated) If True, records will be fetched from this dataset in the legacy format, with the “expected” field renamed to “output”. This option will be removed in a future version of Braintrust.

_internal_btql

Optional[Dict[str, Any]]

(Internal) If specified, the dataset will be created with the given BTQL filters.

state

Optional[BraintrustState]

(Internal) The Braintrust state to use. If not specified, will use the global state. For advanced use only.

init_experiment

Alias for init

args

Any

kwargs

Any

init_function

Creates a function that can be used as either a task or scorer in the Eval framework. When used as a task, it will invoke the specified Braintrust function with the input. When used as a scorer, it will invoke the function with the scorer arguments.

project_name

str

The name of the project containing the function.

slug

str

The slug of the function to invoke.

version

Optional[str]

Optional version of the function to use. Defaults to latest.

init_logger

Create a new logger in a specified project. If the project does not exist, it will be created.

project

Optional[str]

The name of the project to log into. If unspecified, will default to the Global project.

project_id

Optional[str]

The id of the project to log into. This takes precedence over project if specified.

async_flush

bool

If true (the default), log events will be batched and sent asynchronously in a background thread. If false, log events will be sent synchronously. Set to false in serverless environments.

app_url

Optional[str]

The URL of the Braintrust API. Defaults to https://www.braintrust.dev.

api_key

Optional[str]

The API key to use. If the parameter is not specified, will try to use the BRAINTRUST_API_KEY environment variable. If no API

org_name

Optional[str]

(Optional) The name of a specific organization to connect to. This is useful if you belong to multiple.

set_current

bool

If true (the default), set the global current-experiment to the newly-created one.

state

Optional[BraintrustState]

invoke

Invoke a Braintrust function, returning a BraintrustStream or the value as a plain Python object.

function_id

Optional[str]

The ID of the function to invoke.

version

Optional[str]

The version of the function to invoke.

prompt_session_id

Optional[str]

The ID of the prompt session to invoke the function from.

prompt_session_function_id

Optional[str]

The ID of the function in the prompt session to invoke.

project_name

Optional[str]

The name of the project containing the function to invoke.

slug

Optional[str]

The slug of the function to invoke.

global_function

Optional[str]

The name of the global function to invoke.

input

Any

The input to the function. This will be logged as the input field in the span.

messages

Optional[List[Any]]

Additional OpenAI-style messages to add to the prompt (only works for llm functions).

metadata

Optional[Dict[str, Any]]

Additional metadata to add to the span. This will be logged as the metadata field in the span.

load_prompt

Loads a prompt from the specified project.

project

Optional[str]

The name of the project to load the prompt from. Must specify at least one of project or project_id.

slug

Optional[str]

The slug of the prompt to load.

version

Optional[Union[str, int]]

An optional version of the prompt (to read). If not specified, the latest version will be used.

project_id

Optional[str]

The id of the project to load the prompt from. This takes precedence over project if specified.

Optional[str]

The id of a specific prompt to load. If specified, this takes precedence over all other parameters (project, slug, version).

defaults

Optional[Mapping[str, Any]]

(Optional) A dictionary of default values to use when rendering the prompt. Prompt values will override these defaults.

no_trace

bool

If true, do not include logging metadata for this prompt when build() is called.

environment

Optional[str]

The environment to load the prompt from. Cannot be used together with version.

app_url

Optional[str]

The URL of the Braintrust App. Defaults to https://www.braintrust.dev.

api_key

Optional[str]

The API key to use. If the parameter is not specified, will try to use the BRAINTRUST_API_KEY environment variable. If no API

org_name

Optional[str]

(Optional) The name of a specific organization to connect to. This is useful if you belong to multiple.

log

Log a single event to the current experiment. The event will be batched and uploaded behind the scenes.

event

Any

Log into Braintrust. This will prompt you for your API token, which you can find at https://www.braintrust.dev/app/token. This method is called automatically by init().

app_url

Optional[str]

The URL of the Braintrust App. Defaults to https://www.braintrust.dev.

api_key

Optional[str]

The API key to use. If the parameter is not specified, will try to use the BRAINTRUST_API_KEY environment variable. If no API

org_name

Optional[str]

(Optional) The name of a specific organization to connect to. This is useful if you belong to multiple.

Login again, even if you have already logged in (by default, this function will exit quickly if you have already logged in)

parent_context

Context manager to temporarily set the parent context for spans.

parent

Optional[str]

The parent string to set during the context

state

Optional[BraintrustState]

Optional BraintrustState to use. If not provided, uses the global state.

parse_stream

Parse a BraintrustStream into its final value.

stream

BraintrustStream

The BraintrustStream to parse.

patch_litellm

Patch LiteLLM to add Braintrust tracing.

permalink

Format a permalink to the Braintrust application for viewing the span represented by the provided slug.

slug

str

The identifier generated from Span.export.

org_name

Optional[str]

The org name to use. If not provided, the org name will be inferred from the global login state.

app_url

Optional[str]

The app URL to use. If not provided, the app URL will be inferred from the global login state.

prettify_params

Clean up parameters by filtering out NOT_GIVEN values and serializing response_format.

params

dict[str, Any]

serialize_response_format

Serialize response format for logging.

response_format

Any

set_http_adapter

Specify a custom HTTP adapter to use for all network requests. This is useful for setting custom retry policies, timeouts, etc. Braintrust uses the requests library, so the adapter should be an instance of requests.adapters.HTTPAdapter. Alternatively, consider sub-classing our RetryRequestExceptionsAdapter to get automatic retries on network-related exceptions.

adapter

HTTPAdapter

The adapter to use.

set_masking_function

Set a global masking function that will be applied to all logged data before sending to Braintrust. The masking function will be applied after records are merged but before they are sent to the backend.

masking_function

Optional[Callable[[Any], Any]]

A function that takes a JSON-serializable object and returns a masked version.

set_thread_pool_max_workers

Set the maximum number of threads to use for running evaluators. By default, this is the number of CPUs on the machine.

max_workers

Any

span_components_to_object_id

Utility function to resolve the object ID of a SpanComponentsV4 object. This function may trigger a login to braintrust if the object ID is encoded lazily.

components

SpanComponentsV4

start_span

Lower-level alternative to @traced for starting a span at the toplevel. It creates a span under the first active object (using the same precedence order as @traced), or if parent is specified, under the specified parent row, or returns a no-op span object.

name

Optional[str]

type

Optional[SpanTypeAttribute]

span_attributes

Optional[Union[SpanAttributes, Mapping[str, Any]]]

start_time

Optional[float]

set_current

Optional[bool]

parent

Optional[str]

propagated_event

Optional[Dict[str, Any]]

state

Optional[BraintrustState]

event

Any

summarize

Summarize the current experiment, including the scores (compared to the closest reference experiment) and metadata.

summarize_scores

bool

Whether to summarize the scores. If False, only the metadata will be returned.

comparison_experiment_id

Optional[str]

The experiment to compare against. If None, the most recent experiment on the comparison_commit will be used.

traced

Decorator to trace the wrapped function when used without parentheses.

update_span

Update a span using the output of span.export(). It is important that you only resume updating to a span once the original span has been fully written and flushed, since otherwise updates to the span may conflict with the original span.

exported

str

The output of span.export().

event

Any

wrap_anthropic

Wrap an Anthropic object (or AsyncAnthropic) to add tracing. If Braintrust is not configured, this is a no-op. If this is not an Anthropic object, this function is a no-op.

client

Any

wrap_litellm

Wrap the litellm module to add tracing. If Braintrust is not configured, nothing will be traced.

litellm_module

Any

The litellm module

wrap_openai

Wrap the openai module (pre v1) or OpenAI instance (post v1) to add tracing. If Braintrust is not configured, nothing will be traced. If this is not an OpenAI object, this function is a no-op.

openai

Any

The openai module or OpenAI object

Classes

AsyncResponseWrapper

Wrapper that properly preserves async context manager behavior for OpenAI responses. Methods __init__()

AsyncScorerLike

Protocol for asynchronous scorers that implement the eval_async interface. The framework will prefer this interface if available.

Attachment

Represents an attachment to be uploaded and the associated metadata. Methods __init__(), reference(), data(), upload(), debug_info()

BaseExperiment

Use this to specify that the dataset should actually be the data from a previous (base) experiment. If you do not specify a name, Braintrust will automatically figure out the best base experiment to use based on your git history (or fall back to timestamps). Properties

name

Optional[str]

BraintrustConsoleChunk

A console chunk from a Braintrust stream. Properties

message

str

stream

Literal['stderr', 'stdout']

type

Literal['console']

BraintrustErrorChunk

An error chunk from a Braintrust stream. Properties

data

str

type

Literal['error']

BraintrustInvokeError

An error that occurs during a Braintrust stream.

BraintrustJsonChunk

A chunk of JSON data from a Braintrust stream. Properties

data

str

type

Literal['json_delta']

BraintrustProgressChunk

A progress chunk from a Braintrust stream. Properties

data

str

object_type

str

format

str

output_type

str

name

str

event

Literal['json_delta', 'text_delta', 'reasoning_delta']

type

Literal['progress']

BraintrustStream

A Braintrust stream. This is a wrapper around a generator of BraintrustStreamChunk, with utility methods to make them easy to log and convert into various formats. Methods __init__(), copy(), final_value()

BraintrustTextChunk

A chunk of text data from a Braintrust stream. Properties

data

str

type

Literal['text_delta']

CodeFunction

A generic callable, with metadata. Properties

project

'Project'

handler

Callable[..., Any]

name

str

slug

str

type_

str

description

Optional[str]

parameters

Any

returns

Any

if_exists

Optional[IfExists]

metadata

Optional[Dict[str, Any]]

CodePrompt

A prompt defined in code, with metadata. Properties

project

'Project'

name

str

slug

str

prompt

PromptData

tool_functions

List[Union[CodeFunction, SavedFunctionId]]

description

Optional[str]

function_type

Optional[str]

if_exists

Optional[IfExists]

metadata

Optional[Dict[str, Any]]

Methods to_function_definition()

CompletionWrapper

Wrapper for LiteLLM completion functions with tracing support. Methods __init__(), completion()

DataSummary

Summary of a dataset’s data. Properties

new_records

int

total_records

int

Dataset

A dataset is a collection of records, such as model inputs and outputs, which represent data you can use to evaluate and fine-tune models. You can log production data to datasets, curate them with interesting examples, edit/delete records, and run evaluations against them. Methods __init__(), id(), name(), data(), project(), logging_state(), insert(), update(), delete(), summarize(), close(), flush()

DatasetSummary

Summary of a dataset’s scores and metadata. Properties

project_name

str

dataset_name

str

project_url

str

dataset_url

str

data_summary

Optional[DataSummary]

EmbeddingWrapper

Wrapper for LiteLLM embedding functions. Methods __init__(), embedding()

EvalCase

An evaluation case. This is a single input to the evaluation task, along with an optional expected output, metadata, and tags. Properties

input

Input

expected

Optional[Output]

metadata

Optional[Metadata]

EvalHooks

An object that can be used to add metadata to an evaluation. This is passed to the task function. Methods metadata(), expected(), span(), trial_index(), tags(), report_progress(), meta(), parameters()

EvalResult

The result of an evaluation. This includes the input, expected output, actual output, and metadata. Properties

input

Input

output

Output

scores

Dict[str, Optional[float]]

expected

Optional[Output]

metadata

Optional[Metadata]

EvalScorerArgs

Arguments passed to an evaluator scorer. This includes the input, expected output, actual output, and metadata. Properties

input

Input

output

Output

expected

Optional[Output]

metadata

Optional[Metadata]

Evaluator

An evaluator is an abstraction that defines an evaluation dataset, a task to run on the dataset, and a set of scorers to evaluate the results of the task. Each method attribute can be synchronous or asynchronous (for optimal performance, it is recommended to provide asynchronous implementations). Properties

project_name

str

eval_name

str

data

EvalData[Input, Output]

task

EvalTask[Input, Output]

scores

List[EvalScorer[Input, Output]]

experiment_name

Optional[str]

metadata

Optional[Metadata]

trial_count

int

is_public

bool

update

bool

timeout

Optional[float]

max_concurrency

Optional[int]

project_id

Optional[str]

base_experiment_name

Optional[str]

base_experiment_id

Optional[str]

git_metadata_settings

Optional[GitMetadataSettings]

repo_info

Optional[RepoInfo]

error_score_handler

Optional[ErrorScoreHandler]

description

Optional[str]

summarize_scores

bool

parameters

Optional[EvalParameters]

Experiment

An experiment is a collection of logged events, such as model inputs and outputs, which represent a snapshot of your application at a particular point in time. An experiment is meant to capture more than just the model you use, and includes the data you use to test, pre- and post- processing code, comparison metrics (scores), and any other metadata you want to include. Methods __init__(), id(), name(), data(), project(), logging_state(), log(), log_feedback(), start_span(), update_span(), fetch_base_experiment(), summarize(), export(), close(), flush()

ExperimentSummary

Summary of an experiment’s scores and metadata. Properties

project_name

str

project_id

Optional[str]

experiment_id

Optional[str]

experiment_name

str

project_url

Optional[str]

experiment_url

Optional[str]

comparison_experiment_name

Optional[str]

scores

Dict[str, ScoreSummary]

metrics

Dict[str, MetricSummary]

ExternalAttachment

Represents an attachment that resides in an external object store and the associated metadata. Methods __init__(), reference(), data(), upload(), debug_info()

JSONAttachment

A convenience class for creating attachments from JSON-serializable objects. Methods __init__()

LiteLLMWrapper

Main wrapper for the LiteLLM module. Methods __init__(), completion(), responses(), embedding(), moderation()

MetricSummary

Summary of a metric’s performance. Properties

name

str

metric

Union[float, int]

unit

str

improvements

Optional[int]

regressions

Optional[int]

diff

Optional[float]

ModerationWrapper

Wrapper for LiteLLM moderation functions. Methods __init__(), moderation()

NamedWrapper

Wrapper that preserves access to the original wrapped object’s attributes. Methods __init__()

Project

A handle to a Braintrust project. Methods __init__(), add_code_function(), add_prompt(), publish()

ProjectBuilder

Creates handles to Braintrust projects. Methods create()

Prompt

A prompt object consists of prompt text, a model, and model parameters (such as temperature), which can be used to generate completions or chat messages. The prompt object supports calling .build() which uses mustache templating to build the prompt with the given formatting options and returns a plain dictionary that includes the built prompt and arguments. The dictionary can be passed as kwargs to the OpenAI client or modified as you see fit. Methods __init__(), from_prompt_data(), id(), name(), slug(), prompt(), version(), options(), build()

PromptBuilder

Builder to create a prompt in Braintrust. Methods __init__(), create(), create(), create()

ReadonlyAttachment

A readonly alternative to Attachment, which can be used for fetching already-uploaded Attachments. Methods __init__(), data(), metadata(), status()

ReadonlyExperiment

A read-only view of an experiment, initialized by passing open=True to init(). Methods __init__(), id(), logging_state(), as_dataset()

RepoInfo

Information about the current HEAD of the repo. Properties

commit

Optional[str]

branch

Optional[str]

tag

Optional[str]

dirty

Optional[bool]

author_name

Optional[str]

author_email

Optional[str]

commit_message

Optional[str]

commit_time

Optional[str]

git_diff

Optional[str]

ReporterDef

A reporter takes an evaluator and its result and returns a report. Properties

name

str

report_eval

Callable[[Evaluator[Input, Output], EvalResultWithSummary[Input, Output], bool, bool], Union[EvalReport, Awaitable[EvalReport]]]

report_run

Callable[[List[EvalReport], bool, bool], Union[bool, Awaitable[bool]]]

ResponsesWrapper

Wrapper for LiteLLM responses functions with tracing support. Methods __init__(), responses()

RetryRequestExceptionsAdapter

An HTTP adapter that automatically retries requests on connection exceptions. Methods __init__(), send()

SSEProgressEvent

A progress event that can be reported during task execution, specifically for SSE (Server-Sent Events) streams. This is a subclass of TaskProgressEvent with additional fields for SSE-specific metadata. Properties

str

object_type

str

origin

ObjectReference

name

str

ScoreSummary

Summary of a score’s performance. Properties

name

str

score

float

improvements

Optional[int]

regressions

Optional[int]

diff

Optional[float]

ScorerBuilder

Builder to create a scorer in Braintrust. Methods __init__(), create(), create(), create(), create()

Span

A Span encapsulates logged data and metrics for a unit of work. This interface is shared by all span implementations. Methods id(), log(), log_feedback(), start_span(), export(), link(), permalink(), end(), flush(), close(), set_attributes(), set_current(), unset_current()

SpanIds

The three IDs that define a span’s position in the trace tree. Properties

span_id

str

root_span_id

str

span_parents

Optional[List[str]]

SpanImpl

Primary implementation of the Span interface. See the Span interface for full details on each method. Properties

can_set_current

bool

Methods __init__(), id(), set_attributes(), log(), log_internal(), log_feedback(), start_span(), end(), export(), link(), permalink(), close(), flush(), set_current(), unset_current()

SyncScorerLike

Protocol for synchronous scorers that implement the callable interface. This is the most common interface and is used when no async version is available. Methods __call__()

TaskProgressEvent

Progress event that can be reported during task execution. Properties

format

FunctionFormat

output_type

FunctionOutputType

event

Literal['reasoning_delta', 'text_delta', 'json_delta', 'error', 'console', 'start', 'done', 'progress']

data

str

ToolBuilder

Builder to create a tool in Braintrust. Methods __init__(), create()

TracedMessageStream

TracedMessageStream wraps both sync and async message streams. Obviously only one makes sense at a time Methods __init__()

Libraries

​Installation

​Functions

​Eval

​Reporter

​current_experiment

​current_logger

​current_span

​flush

​get_prompt_versions

​get_span_parent_object

​init

​init_dataset

​init_experiment

​init_function

​init_logger

​invoke

​load_prompt

​log

​login

​parent_context

​parse_stream

​patch_litellm

​permalink

​prettify_params

​serialize_response_format

​set_http_adapter

​set_masking_function

​set_thread_pool_max_workers

​span_components_to_object_id

​start_span

​summarize

​traced

​update_span

​wrap_anthropic

​wrap_litellm

​wrap_openai

​Classes

​AsyncResponseWrapper

​AsyncScorerLike

​Attachment

​BaseExperiment

​BraintrustConsoleChunk

​BraintrustErrorChunk

​BraintrustInvokeError

​BraintrustJsonChunk

​BraintrustProgressChunk

​BraintrustStream

​BraintrustTextChunk

​CodeFunction

​CodePrompt

​CompletionWrapper

​DataSummary

​Dataset

​DatasetSummary

​EmbeddingWrapper

​EvalCase

​EvalHooks

​EvalResult

​EvalScorerArgs

​Evaluator

​Experiment

​ExperimentSummary

​ExternalAttachment

​JSONAttachment

​LiteLLMWrapper

​MetricSummary

​ModerationWrapper

​NamedWrapper

​Project

​ProjectBuilder

​Prompt

​PromptBuilder

​ReadonlyAttachment

​ReadonlyExperiment

​RepoInfo

​ReporterDef

​ResponsesWrapper

​RetryRequestExceptionsAdapter

​SSEProgressEvent

Installation

Functions

Eval

Reporter

current_experiment

current_logger

current_span

flush

get_prompt_versions

get_span_parent_object

init

init_dataset

init_experiment

init_function

init_logger

invoke

load_prompt

log

login

parent_context

parse_stream

patch_litellm

permalink

prettify_params

serialize_response_format

set_http_adapter

set_masking_function

set_thread_pool_max_workers

span_components_to_object_id

start_span

summarize

traced

update_span

wrap_anthropic

wrap_litellm

wrap_openai

Classes

AsyncResponseWrapper

AsyncScorerLike

Attachment

BaseExperiment

BraintrustConsoleChunk

BraintrustErrorChunk

BraintrustInvokeError

BraintrustJsonChunk

BraintrustProgressChunk

BraintrustStream

BraintrustTextChunk

CodeFunction

CodePrompt

CompletionWrapper

DataSummary

Dataset

DatasetSummary

EmbeddingWrapper

EvalCase

EvalHooks

EvalResult

EvalScorerArgs

Evaluator

Experiment

ExperimentSummary

ExternalAttachment

JSONAttachment

LiteLLMWrapper

MetricSummary

ModerationWrapper

NamedWrapper

Project

ProjectBuilder

Prompt

PromptBuilder

ReadonlyAttachment

ReadonlyExperiment

RepoInfo

ReporterDef

ResponsesWrapper

RetryRequestExceptionsAdapter

SSEProgressEvent

ScoreSummary