
Braintrust-hosted customers can export to AWS S3. Self-hosted customers can export to the cloud that hosts their data plane: AWS S3 or Google Cloud Storage (cross-cloud export is not supported).
Data plane v2.1 introduces Google Cloud Storage as an export destination (coming soon for self-hosted), a new file layout, and other changes that may affect existing integrations. See Migrate exports to data plane v2.1.
Exports are only available on the Enterprise plan.

How exports work

An export is a project-scoped automation that runs on a configured interval. The interval is a target cadence between runs (start-to-start), not a fixed wall-clock schedule. Choose an interval based on your data volume and optimal file sizes for downstream query performance. On each run, the export:
  • Queries recent log data.
  • Writes the result to your cloud storage bucket as either JSON Lines or Parquet, organized in a Hive-partitioned layout that warehouses and query engines can read.
  • Advances its cursor, so that the same data isn’t re-exported on the next run.
  • Schedules the next run: immediately if more data remains, otherwise roughly one interval from when the current run finished.
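Conceptually, a single run behaves like the pseudo-SQL below. This is an illustration only, not a query you can run or configure: the exporter manages sort, limit, and cursor internally, and the cursor value shown is a hypothetical placeholder.
SELECT *
FROM project_logs('your-project-id', shape => 'spans')
WHERE _xact_id > 1000000000000000  -- hypothetical cursor: only rows newer than the last run
ORDER BY _xact_id ASC              -- matches the row order within exported files
LIMIT 50000                        -- illustrative only; real files split by an internal size target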
For one-off exports, or to export data other than logs, use SQL or the REST API instead of exports.

Create an export

Configure exports per project, in the UI or via the API.
Rewind and the start-from date can only be configured through the UI.
To create an export, follow these steps:
  1. Go to Settings > Data management.
  2. Click + Create automation.
  3. Configure export settings:
    • Automation name: Identify the export.
    • Type: Export.
    • Data to export: Logs (traces), Logs (spans), or Custom SQL query. See export data types.
    • Export path: Target bucket and prefix (e.g., s3://my-bucket/braintrust/logs or gs://my-bucket/braintrust/logs). Once the automation is created, this path cannot be changed.
    • Cloud auth: On AWS data planes, provide a Role ARN. On GCP data planes, provide a Service account email. See identity and access management.
    • Format: JSON Lines or Parquet.
    • Interval: How often to export (5 min to 1 day).
    • (Optional) Start from (requires data plane v2.1+, not yet available for self-hosted customers): Export data starting from this time. The picker uses your browser’s local timezone, but date= folders partition by UTC. Leave empty to export from the beginning of your data. Logs already deleted by a retention policy will not be included.
  4. Expand Setup instructions in the dialog to see the IAM steps for your data plane’s cloud.
  5. Click Test automation to verify access. The test writes a small file to your export path and immediately deletes it.
  6. Click Create automation.
The first export interval starts immediately.
On data plane v2.0 and earlier, some labels differ (S3 export, S3 path, Role creation instructions) and the Start from field isn’t available. The setup flow is otherwise the same.

Export data types

When creating an export automation, you can choose to export trace summaries, spans, or data that matches a custom SQL query.
  • Logs (traces) - Exports one row per trace, with scores and metrics aggregated across every span under the root. Child spans are not individually represented in the output. See the SQL summary shape. Useful for analytics, cost dashboards, or warehouse joins.
  • Logs (spans) - Exports one row per span, with root and child spans both present. See the SQL spans shape. Useful for data science, fine-tuning, or data archival.
  • Custom SQL query - Exports data according to a custom SQL query. Useful for consumers that need a filtered subset of your data. Custom queries must use project_logs(...) as the data source. Datasets, experiments, and subqueries aren’t supported. GROUP BY and aggregates are also rejected. The exporter manages sort, limit, and cursor internally, so don’t set those in your query. Custom queries can also return all spans from any trace that contains at least one matching span. This can be useful for retrieving full context around failures, instead of just the failing spans in isolation. To do this, use SQL’s traces shape.
Example export queries:
  • To export every LLM span from your project logs:
    SELECT id, span_id, root_span_id, input, output, metadata, created
    FROM project_logs('your-project-id', shape => 'spans')
    WHERE span_attributes.type = 'llm'
    
  • To export every span from any trace that contains at least one low-scoring span:
    SELECT id, span_id, root_span_id, input, output, scores, metadata, created
    FROM project_logs('your-project-id', shape => 'traces')
    WHERE scores.correctness < 0.5
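
  • A further variant, filtering on metadata (the metadata.model field here is a hypothetical example; the dot access mirrors the span_attributes and scores filters above):
    SELECT id, span_id, root_span_id, input, output, metadata, created
    FROM project_logs('your-project-id', shape => 'spans')
    WHERE metadata.model = 'gpt-4o'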
    

Identity and access management (IAM)

Braintrust authenticates to your bucket using credentials you create: an IAM role on AWS, or a service account on GCP. In the export configuration dialog, expand Setup instructions and follow the guided steps for your data plane’s cloud.
Google Cloud Storage export requires a self-hosted GCP deployment running data plane v2.1+ (not yet released for self-hosted customers). Setup requires a Terraform change to your data plane module to authorize service account impersonation.

Cloud folder structure

Files are organized using Hive partitioning, a standard layout supported by many warehouses and query engines (see docs for DuckDB and BigQuery). The partition key is date, set to the UTC calendar date the row was logged (not the date the export ran). A single run can write into multiple folders when catching up on old data.
{prefix}/
  date=2026-04-19/
    c1f1e5a8-....jsonl.gz
    9de8b4a0-....jsonl.gz
  date=2026-04-20/
    42d0fe3c-....jsonl.gz
Exported files have the following characteristics:
  • Row order - Rows within a file are sorted ascending by _xact_id, an internal ID assigned at write time. Rows committed together share an _xact_id, and newer writes get higher values. For Logs (spans) exports, this does not imply trace grouping or parent-before-child ordering. To get trace-grouped output, sort in the consuming application (see the sketch at the end of this section).
  • File splits - Files split when the date changes or when the file reaches an internal size target. All rows for a given _xact_id always land in the same file, so the exporter writes past the size target when needed to keep them together. File names are random UUIDs to avoid collisions.
  • Formats - JSON Lines files are gzipped (.jsonl.gz). Parquet files are ZSTD-compressed (.parquet).
On data plane v2.0 and earlier, files use the layout {prefix}/YYYY-MM-DD/{file} (no date= prefix), and the folder date is the date the export ran rather than the row’s transaction date.
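To see how this layout reads downstream, here is a minimal DuckDB sketch over a v2.1+ Parquet export. The bucket path reuses the hypothetical example prefix from earlier; hive_partitioning exposes the date= folder as a queryable column.
SELECT *
FROM read_parquet(
  's3://my-bucket/braintrust/logs/date=*/*.parquet',
  hive_partitioning = true
)
WHERE "date" = '2026-04-20'        -- prunes to a single UTC partition
ORDER BY root_span_id, _xact_id;   -- example of trace-grouping in the consumer
Filtering on the date column lets the engine skip whole partitions rather than scanning every file.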

Export status and history

Monitor exports through the Export status dialog. The export doesn’t emit notifications, so check here to confirm runs succeeded and surface errors. Include the automation ID and any error text when contacting support.
  1. Go to Settings > Data management.
  2. To open the Export status dialog, click the status icon next to your export.
  3. Review run history, row counts, bytes written, and duration. These values are cumulative since the export was created or last rewound. Byte counts represent compressed, on-disk sizes in the storage bucket.

Rewind to a specific date

Rewind moves an existing export’s cursor backward to a chosen moment, so subsequent runs re-emit everything from that moment forward. Use it to re-export a window after a downstream bug, or to refill your bucket after deleting its contents. To rewind an existing export, follow these steps:
  1. Open the Export status dialog and click Rewind….
  2. In the rewind panel, pick a date and time. Note that date=YYYY-MM-DD/ folders in cloud storage are UTC-based, so they may not match the local time you entered.
  3. Click Rewind to selected time to rewind to that moment, or Re-export all data to rewind to the beginning.
A few things to expect when you rewind:
  • The next run starts immediately, and then the export cadence resumes (with the next run scheduled one interval later).
  • Exports are incremental. Runs happen back-to-back while there’s a backlog, with each run continuing from where the previous one stopped. The normal interval cadence resumes once caught up.
  • Existing files are kept. Rewind doesn’t delete, modify, or replace files already in your bucket. Subsequent runs write fresh UUID-named files alongside the old ones with overlapping rows, so deduplicate downstream on id or span_id, not on filename (see the dedup sketch after this list).
  • Deleted logs can’t be re-exported. Rewinding past your retention cutoff has no effect on logs your retention policy already removed.
  • Totals reset. The All runs values in the Export status dialog (rows, data, duration) reset to zero after a rewind and rebuild as new runs land.
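As referenced above, a minimal dedup sketch in the same DuckDB style (hypothetical bucket path; per the v2.1 migration notes, keep the copy of each id with the highest _xact_id):
SELECT *
FROM read_parquet(
  's3://my-bucket/braintrust/logs/date=*/*.parquet',
  hive_partitioning = true
)
QUALIFY row_number() OVER (PARTITION BY id ORDER BY _xact_id DESC) = 1;  -- one row per id, newest write wins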

Troubleshooting

Query timeouts: If a trace export is timing out:
  • Ensure you’re on data plane v1.1.27 or later.
  • In the Export status dialog, click Rewind… and choose an earlier point in time to re-process from.
  • If problems persist, create a new trace export automation.
AWS IAM errors: If test automation fails with permission errors:
  • Verify the IAM role ARN is correct.
  • Check the trust policy includes the correct external ID.
  • Ensure the S3 policy grants the required permissions on the target bucket and prefix.
  • Confirm the bucket and prefix exist.
GCP IAM errors: If test automation fails with a 403 on generateAccessToken:
  • Verify that the service account email pasted into the UI matches the one you created.
  • Confirm the service account is listed in brainstore_impersonation_targets in your Braintrust data plane Terraform module, and that Terraform was applied after adding it. See Service account impersonation.
  • Confirm the Brainstore service account has the Service Account Token Creator role on your export service account. Braintrust’s Terraform module sets this automatically; if you configured IAM manually, verify it under the export service account’s Principals with access tab.
  • Confirm your service account has roles/storage.objectAdmin on the target bucket.
  • Allow a few minutes for IAM changes to propagate.

Migrate exports to data plane v2.1

For Braintrust-hosted customers, the changes below are already in effect. Data plane v2.1 has not yet been released for self-hosted deployments.
For self-hosted customers, the changes below take effect once you upgrade to data plane v2.1+ (not yet released) and enable cloud storage export in your Terraform module or Helm chart. Some of these changes may affect existing downstream pipelines or scripts that programmatically create exports. Changes in v2.1:
  • Google Cloud Storage as an export destination (self-hosted GCP only).
  • Specify a start-from date when creating an export.
  • Rewind lets you re-export historical data. After a rewind, the same row can appear in your bucket more than once; deduplicate downstream on id, keeping the row with the highest _xact_id.
  • Dataset and experiment export is no longer supported. To export an experiment or dataset, use SQL or the REST API instead.
  • File paths use a Hive-partitioned layout: {prefix}/date=YYYY-MM-DD/ instead of {prefix}/YYYY-MM-DD/. The date is now the row’s transaction date, not the date the export ran. Old-layout files stay in place; scripts matching the old shape will miss new files (see the sketch after this list).
  • Files split by size rather than by a 100,000-row cap. Historical catch-up spreads across many date= partitions instead of piling into the current run’s folder, so consumers watching only today’s folder will miss backfill output.
  • API change for programmatic creation. The second registration call is now POST /brainstore/automation/reset-cursors instead of POST /automation/cron. Scripts that create exports via the API need to be updated.
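As an example of the path change, a consumer’s read pattern would shift roughly as follows (DuckDB syntax, hypothetical bucket path):
-- Old layout (data plane v2.0 and earlier): plain date folders
SELECT * FROM read_parquet('s3://my-bucket/braintrust/logs/*/*.parquet');

-- New layout (v2.1+): Hive-partitioned, with date exposed as a column
SELECT * FROM read_parquet('s3://my-bucket/braintrust/logs/date=*/*.parquet', hive_partitioning = true);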

Limitations

  • For self-hosted deployments, Rewind, start-from date, and Google Cloud Storage destinations require data plane v2.1+ (not yet available for self-hosted customers) and must be explicitly enabled through your Terraform module or Helm chart. See Migrate exports to data plane v2.1.
  • Only log data can be exported.
  • Braintrust-hosted customers can only export their data to AWS S3.
  • Self-hosted customers can only export data to the cloud associated with their data plane (AWS S3, and Google Cloud Storage once data plane v2.1 is available for self-hosted customers). Azure Blob Storage is not supported.
  • Braintrust’s export writer does not use VPC endpoints or Private Service Connect. Traffic uses the cloud’s public object-store endpoints.
On data plane v2.0 and earlier, datasets and experiments could also be exported.

Next steps