Annotate your code
You can add traces for multiple, specific functions in your code to your logs by annotating them with functional wrappers (TypeScript) or decorators and context managers (Python):Add names to traces
You can add custom names to traces using thename parameter. If you don’t provide a name, the SDK will use the function name or “anonymous” if the function is unnamed.
Wrap LLM clients
Wrap OpenAI
For information about how to wrap OpenAI clients, check out the OpenAI provider docs.Wrap Anthropic
For information about how to wrap Anthropic clients, check out the Anthropic provider docs.Wrap Gemini
For information about how to wrap Google Gemini clients, check out the Gemini provider docs.Wrap a custom LLM client
If you’re using your own client, you can wrap it yourself using the same conventions as the OpenAI wrapper. Check out the Python and TypeScript implementations for reference. To track the span as an LLM, include the following in your wrapper:- Specify the
typeasllm. You can specify anynameyou’d like. This enables LLM duration metrics. - Add
prompt_tokens,completion_tokens, andtokensto themetricsfield. This enables LLM token usage metrics. - To track cached tokens, log
prompt_cached_tokens(cache reads) andprompt_cache_creation_tokens(cache writes) tometrics. By convention,prompt_tokensshould include bothprompt_cached_tokensandprompt_cache_creation_tokens. So if you have a request with 10 cache read tokens, 5 cache write tokens, and 3 uncached tokens, you should logprompt_tokens: 18. - Format the
inputas a list of messages (using the OpenAI format), and put other parameters (likemodel) inmetadata. This enables the Try prompt button in the UI.
Trace multimodal content
Upload attachments
In addition to text and structured data, Braintrust also supports uploading file attachments (blobs). This is especially useful when working with multimodal models, which can require logging large image, audio, or video files. You can also use attachments to log other unstructured data related to your LLM usage, such as a user-provided PDF file that your application later transforms into an LLM input. To upload an attachment, create a newAttachment object to represent the file
on disk or binary data in memory to be uploaded. You can place Attachment
objects anywhere in the event to be logged, including in arrays/lists or deeply
nested in objects. See the TypeScript or Python SDK
reference for usage details.

Use external files as attachments
The
ExternalAttachment feature is supported only in self-hosted deployments. It is not supported in Braintrust-hosted environments.ExternalAttachment object. You can use this anywhere you would use an
Attachment. See the Attachments guide for more
information.
Upload large traces
Braintrust has a 6MB limit on individual logging upload requests. However, you may need to log larger data structures, such as lengthy conversation transcripts, extensive document sets, or complex nested objects. TheJSONAttachment allows you to upload JSON data inline, and it will automatically
get converted to an Attachment behind the scenes.
When you use JSONAttachment, your JSON data is:
- Uploaded separately as an attachment, bypassing the 6MB trace limit
- Not indexed, which saves storage space and speeds up ingestion, but not available for search or filtering
- Still fully viewable in the UI with all the features of the JSON viewer (collapsible nodes, syntax highlighting, etc.)
Link to external images
To log an external image, provide an image URL, an external object store URL, or a base64 encoded image as a string. The tree viewer will automatically render the image.

Trace errors
When you run:- Python code inside of the
@traceddecorator or within astart_span()context - TypeScript code inside of
traced(or awrappedTracedfunction)

error field which you can also log to directly.
Trace deeply nested code
Often, you want to trace functions that are deep in the call stack, without having to propagate thespan object throughout. Braintrust uses async-friendly
context variables to make this workflow easy:
- The
tracedfunction/decorator will create a span underneath the currently-active span. - The
currentSpan()/current_span()method returns the currently active span, in case you need to do additional logging.
Mask sensitive data
You can configure a global masking function to redact sensitive information before it’s sent to Braintrust. The masking function is applied to theinput, output, expected, metadata, and context fields of each logged event.
Mask API keys and passwords
This example shows how to mask common sensitive fields like API keys, passwords, and tokens in your data:Mask personally identifiable information (PII)
This example demonstrates masking PII such as email addresses, phone numbers, and social security numbers:Enable custom masking for specific data structures
This example shows how to handle custom data structures and implement selective masking based on context:Considerations
- The masking function is applied globally and affects all logging across your application
- Masking is applied after events are merged but before they are sent to Braintrust
- The masking function can modify data in place, which is often more performant
- Only the fields
input,output,expected,metadata, andcontextare passed to the masking function - Always test your masking function thoroughly to ensure sensitive data is properly redacted
- In the event of an error, the data will be masked with a generic message like
ERROR: Failed to mask fieldto avoid leaking sensitive information
Trace distributed code
Sometimes it’s useful to be able to start a trace in one process and continue it in a different one. For this purpose, Braintrust provides anexport function
which returns an opaque string identifier. This identifier can be passed to
start_span to resume the trace elsewhere. Consider the following example of
tracing across separate client and server processes.
Trace client code
Trace server code
Update spans
Similar to distributed tracing, it can be useful to update spans after you initially log them. For example, if you collect the output of a span asynchronously. TheExperiment and Logger classes each have an updateSpan() method, which you can call with
the span’s id to perform an update.
span.export() to export the span in a fully contained string, which is useful if you
have multiple loggers or perform the update from a different service.
Deep link to spans
TheSpan.permalink method formats a permalink to the Braintrust application
for viewing the span. The link will open the UI to the row represented by the
Span object.
If you do not have access to the original Span object, the slug produced by
Span.export contains enough information to produce the same permalink. The
braintrust.permalink function can be used to construct a deep link to the row
in the UI from a given span slug.
Manually manage spans
In more complicated environments, it may not always be possible to wrap the entire duration of a span within a single block of code. In such cases, you can always pass spans around manually. Consider this hypothetical server handler, which logs to a span incrementally over several distinct callbacks:Import and export spans
Spans are processed in Braintrust as a simple format, consisting ofinput, output, expected, metadata, scores,
and metrics fields (all optional), as well as a few system-defined fields which you usually do not need to mess with, but
are described below for completeness. This simple format makes
it easy to import spans captured in other systems (e.g. languages other than TypeScript/Python), or to export spans from
Braintrust to consume in other systems.
Underlying format
The underlying span format contains a number of fields which are not exposed directly through the SDK, but are useful to understand when importing and exporting spans.idis a unique identifier for the span, within the container (e.g. an experiment, or logs for a project). You can technically set this field yourself (to overwrite a span), but it is recommended to let Braintrust generate it automatically.input,output,expected,scores,metadata, andmetricsare optional fields which describe the span and are exposed in the Braintrust UI. When you use the TypeScript or Python SDK, these fields are validated for you (e.g. scores must be a mapping from strings to numbers between 0 and 1).span_attributescontains attributes about the span. Currently the recognized attributes arename, which is used to display the span name in the UI, andtype, which displays a helpful icon.typeshould be one of"llm","score","function","eval","task", or"tool".- Depending on the container, e.g. an experiment, or project logs, or a dataset, fields like
project_id,experiment_id,dataset_id, andlog_idare set automatically, by the SDK, so the span can be later retrieved by the UI and API. You should not set these fields yourself. span_id,root_span_id, andspan_parentsare used to construct the span tree and are automatically set by Braintrust. You should not set these fields yourself, but rather let the SDK create and manage them (even if importing from another system).
input, output, expected, scores, metadata, and metrics.
You can use the SDK to populate the remaining fields, which the next section covers with an example.
Here is an example of a span in the underlying format:
Example import/export
The following example walks through how to generate spans in one program and then import them to Braintrust in a script. You can use this pattern to support tracing or running experiments in environments that use programming languages other than TypeScript/Python (e.g. Kotlin, Java, Go, Ruby, Rust, C++), or codebases that cannot integrate the Braintrust SDK directly.Generate spans
The following example runs a simple LLM app and collects logging information at each stage of the process, without using the Braintrust SDK. This could be implemented in any programming language, and you certainly do not need to collect or process information this way. All that matters is that your program generates a useful format that you can later parse and use to import the spans using the SDK.Import spans
The following program uses the Braintrust SDK in Python to import the spans generated by the previous script. Again, you can modify this program to fit the needs of your environment, e.g. to import spans from a different source or format.Run traced functions in a ThreadPoolExecutor
The Python SDK uses context variables to hold the span state for traces.
This means that if you run a traced function inside of a concurrent.futures.ThreadPoolExecutor,
the span state will be lost.
Instead, you can use the TracedThreadPoolExecutor class provided by the Braintrust SDK.
This class is a thin extension of concurrent.futures.ThreadPoolExecutor
that captures and passes context variables to its workers.
Tune parameters
The SDK includes several tuning knobs that may prove useful for debugging.BRAINTRUST_SYNC_FLUSH: By default, the SDKs will log to the backend API in the background, asynchronously. Logging is automatically batched and retried upon encountering network errors. If you wish to have fine-grained control over when logs are flushed to the backend, you may setBRAINTRUST_SYNC_FLUSH=1. When true, flushing will only occur when you runExperiment.flush(or any of the other object flush methods). If the flush fails, the SDK will raise an exception which you can handle.BRAINTRUST_MAX_REQUEST_SIZE: The SDK logger batches requests to save on network roundtrips. The batch size is tuned for the AWS lambda gateway, but you may adjust this if your backend has a different max payload requirement.BRAINTRUST_DEFAULT_BATCH_SIZE: The maximum number of individual log messages that are sent to the network in one payload.BRAINTRUST_NUM_RETRIES: The number of times the logger will attempt to retry network requests before failing.BRAINTRUST_QUEUE_SIZE(Python only): The maximum number of elements in the logging queue. It must be greater than zero. This value limits the memory usage of the logger. Logging additional elements beyond this size will drop the oldest elements in the queue, as of v0.1.5. In v0.1.4 and earlier, you can choose to drop or block the calling thread with theBRAINTRUST_QUEUE_DROP_WHEN_FULLenv variable.BRAINTRUST_QUEUE_DROP_EXCEEDING_MAXSIZE(Javascript only): Essentially a combination ofBRAINTRUST_QUEUE_SIZEandBRAINTRUST_QUEUE_DROP_WHEN_FULL, which changes the behavior of the queue from storing an unlimited number of elements to capping out at the specified value. Additional elements are discarded.BRAINTRUST_FAILED_PUBLISH_PAYLOADS_DIR: Sometimes errors occur when writing records to the backend. To aid in debugging errors, you may set this environment variable to a directory of choice, and Braintrust will save any payloads it failed to publish to this directory.BRAINTRUST_ALL_PUBLISH_PAYLOADS_DIR: Analogous toBRAINTRUST_FAILED_PUBLISH_PAYLOADS_DIR, except that Braintrust will save all payloads to this directory.