

Write correct queries

These practices help you avoid queries that run successfully but return the wrong data or no data.

Choose the right query shape

A query shape controls what each row of the result represents. There are three: spans (default), traces, and summary. Picking the wrong one is a common cause of confusing or slow results. See Data shapes for the full reference.
| Shape | Each row represents | Use when |
| --- | --- | --- |
| spans (default) | One span | Filtering or aggregating individual LLM calls, tool calls, or scored spans |
| traces | One span (from traces with at least one span that passes the filters) | Returning full trace context when any span in the trace matches your filter |
| summary | Pre-aggregated trace metrics | Rarely needed (see note below) |
The same filter behaves differently across shapes. With spans, you get only the spans that match. With traces, you get every span from any trace that contains a matching span.
-- spans shape (default): returns spans with errors
SELECT id, error, span_attributes.type
FROM project_logs('proj-id')
WHERE error IS NOT NULL
  AND created >= now() - interval 7 day

-- traces shape: returns all spans from traces that contain an error
SELECT id, error, span_attributes.type
FROM project_logs('proj-id', shape => 'traces')
WHERE error IS NOT NULL
  AND created >= now() - interval 7 day
The summary shape is rarely needed. Cost, token metrics, and scores are available on both the spans and traces shapes via metrics.estimated_cost, estimated_cost(), and scores.*. Use summary only when you specifically need the pre-aggregated trace rollup without writing GROUP BY.
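If you do reach for it, a minimal sketch (assuming the summary shape exposes the same cost and score rollups named above; exact column availability may differ):

```sql
-- Hypothetical sketch: one pre-aggregated row per trace
SELECT root_span_id, estimated_cost(), scores.accuracy
FROM project_logs('proj-id', shape => 'summary')
WHERE created >= now() - interval 7 day
LIMIT 100
```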

Match conditions across different spans

If a filter on the traces shape returns nothing even though you know the data is there, the most likely cause is that your conditions need to match across different spans, not all on a single one. In the traces shape, a WHERE clause matches a trace only if one span satisfies every condition. A single ANY_SPAN(a AND b) behaves the same way. To match conditions across different spans in the same trace, use separate ANY_SPAN() calls.
-- ❌ Returns traces where one span has BOTH an error AND type = 'llm'
-- (returns nothing if errors happen in non-LLM spans, which is typical)
SELECT root_span_id
FROM project_logs('proj-id', shape => 'traces')
WHERE error IS NOT NULL
  AND span_attributes.type = 'llm'

-- ✅ Returns traces where any span has an error AND any (possibly different) span is an LLM call
SELECT root_span_id
FROM project_logs('proj-id', shape => 'traces')
WHERE ANY_SPAN(error IS NOT NULL)
  AND ANY_SPAN(span_attributes.type = 'llm')
See ANY_SPAN() for the full syntax reference.

Aggregate span data across a trace

A common pattern: you want a metric that combines values from different spans in the same trace. For example, suppose your application is multi-tenant. You log each end-customer's identifier on the trace's root span as metadata.tenant_id, while token metrics live on child LLM spans. No single span has both fields, so a single-level aggregation can't total tokens per tenant. Use two levels of GROUP BY: the inner subquery groups by root_span_id to pair the fields at the trace level, and the outer query then aggregates those per-trace rows.
-- Total tokens used by each tenant over the last 7 days
SELECT
  tenant_id,
  sum(trace_tokens) AS total_tokens
FROM (
  SELECT
    root_span_id,
    any_value(metadata.tenant_id)  AS tenant_id,
    sum(metrics.total_tokens)      AS trace_tokens
  FROM project_logs('proj-id')
  WHERE created >= now() - interval 7 day
  GROUP BY root_span_id
) AS per_trace
WHERE tenant_id IS NOT NULL
GROUP BY tenant_id
ORDER BY total_tokens DESC
LIMIT 100
The inner query produces one row per trace:
  • any_value(metadata.tenant_id) picks an arbitrary non-null value from the group. Because metadata.tenant_id is null on every span except the root, any_value() returns the root span’s tenant_id.
  • sum(metrics.total_tokens) totals tokens across every span in the trace. Spans that don’t report token counts contribute null and are ignored by sum(), so the result is the trace’s full token usage.
The inner query has no WHERE span_attributes.type = 'llm' filter on purpose. Filtering to LLM spans would drop the root span and lose tenant_id. Keeping all spans lets any_value() and sum() ignore the null entries on the wrong spans and collect both fields in a single subquery. The outer query then groups by tenant_id and sums the per-trace totals. The WHERE tenant_id IS NOT NULL filter drops traces where the root span didn’t set a tenant_id.

Make queries fast

These practices help your queries scan less data and finish sooner.

Fetch known traces or spans by ID

When fetching a known set of spans or traces, filter on id or root_span_id. These are the most efficient predicates because they map directly to the index.
-- ✅ Fetch specific spans by id
SELECT id, input, output
FROM project_logs('proj-id')
WHERE id IN ('span_a', 'span_b', 'span_c')

-- ✅ Fetch a specific trace by root_span_id
SELECT root_span_id, metrics.duration
FROM project_logs('proj-id', shape => 'traces')
WHERE root_span_id = 'trace_xyz'
Avoid filtering on span_id. It is less efficient than id and does not map directly to the primary index.

Speed up filters on high-cardinality fields

Filtering on fields with many distinct values (high cardinality) can be slow without an index: there are too many distinct values for general scans to skip data efficiently. Common examples are user-defined metadata keys like metadata.user_id, metadata.session_id, or metadata.request_id. Add a subfield index for each field you filter on frequently. See Speed up log filtering for setup instructions.

When this doesn't help: low-cardinality fields like span_attributes.type or status scan efficiently without an index. Subfield indexes only pay off when the field has many distinct values.
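As a sketch (metadata.session_id is a hypothetical user-defined key), this is the kind of query a subfield index speeds up:

```sql
-- Fast once a subfield index exists on metadata.session_id
SELECT id, created
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
  AND metadata.session_id = 'sess_123'
LIMIT 100
```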

Add a time range filter

When you query project_logs(), add a range filter on created, _xact_id, or _pagination_key, or scope to a specific id or root_span_id. Without one, the query scans your entire project history before any other condition is applied. A predicate like WHERE metadata.session_id = 'x' doesn’t help because it is evaluated after the scan, not before. These patterns still cause a full scan:
-- ❌ No timestamp filter
SELECT id, input, output
FROM project_logs('proj-id')
WHERE metadata.user_id = 'u_42'

-- ❌ Equality on created is not a range filter
SELECT id FROM project_logs('proj-id')
WHERE created = '2024-01-01'

-- ❌ IS NOT NULL is not a range filter
SELECT id FROM project_logs('proj-id')
WHERE created IS NOT NULL

-- ❌ `timestamp` field does not count — only the literal `created` column
SELECT id FROM project_logs('proj-id')
WHERE timestamp >= now() - INTERVAL 1 DAY
These patterns avoid it:
-- ✅ Range on created (most common)
SELECT id, scores.accuracy
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 7 DAY

-- ✅ Range on _xact_id or _pagination_key
SELECT id FROM project_logs('proj-id')
WHERE _xact_id >= 12345

-- ✅ Equality on id or root_span_id
SELECT id, input, output
FROM project_logs('proj-id')
WHERE root_span_id = 'trace_abc'
BETWEEN may or may not count as a range filter, depending on how the parser represents it. Use explicit >= or <= to be safe. Only project_logs() requires a range filter. experiment(), dataset(), view(), and other sources do not.
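For example, a BETWEEN predicate can be rewritten with explicit bounds:

```sql
-- Instead of: WHERE created BETWEEN '2024-01-01' AND '2024-01-08'
SELECT id FROM project_logs('proj-id')
WHERE created >= '2024-01-01'
  AND created <= '2024-01-08'
```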

Add a LIMIT

Add LIMIT to any query that could return many rows. Without one, the query returns every matching row, which causes slow responses and large payloads on big datasets. The exception is single-row aggregates. A SELECT with only aggregate functions and no GROUP BY always returns exactly one row, so LIMIT is unnecessary:
-- ✅ Single-row aggregate — LIMIT not needed
SELECT count(*), avg(scores.accuracy)
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
For all other queries, add a LIMIT:
SELECT id, input, output
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
ORDER BY created DESC
LIMIT 100

Project only the fields you need

Brainstore stores scalar fields like scores, metrics, and span_attributes in fast columnar storage, and large blobs like input, output, and metadata in document storage. Selecting any field outside the columnar fast fields forces the engine to fetch the full raw JSON document for every matching row, which is much slower than reading directly from the columns. Common ways to trigger document fetches:
  • SELECT *.
  • Selecting input, output, or error.
  • Selecting metadata as a whole object.
  • Selecting metadata.* keys that aren’t indexed as fast fields (arbitrary user-defined keys).
If you only need scalar fields, project them directly:
-- ✅ Scalar fast fields only
SELECT id, created, scores.accuracy, metrics.tokens
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
LIMIT 100
If you need input or output, narrow the result set first with a cheap columnar query, then fetch full documents for only those rows:
-- First pass: find matching span IDs using only fast fields (cheap)
SELECT id
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
  AND scores.accuracy < 0.5
LIMIT 50

-- Second pass: fetch full documents for just those IDs
SELECT id, input, output
FROM project_logs('proj-id')
WHERE id IN ('span_a', 'span_b', ...)

Keep filters indexable

Brainstore can normally apply your WHERE conditions inside the index scan, so only matching rows leave the index. If a condition can’t be applied at the index, the engine retrieves a broader set of rows and filters them afterward. More data moves through memory before being discarded. Common causes:
  • Wrapping an indexed column in a function: WHERE lower(name) = 'foo'.
  • Doing arithmetic on a timestamp column: WHERE created + INTERVAL 1 HOUR > now().
  • Filtering on a field that isn’t indexed (for example, arbitrary metadata.* keys).
-- ❌ Function on indexed column blocks the index
WHERE lower(name) = 'my-span'

-- ✅ Compare the raw column
WHERE name = 'my-span'
-- ❌ Arithmetic on the indexed side
WHERE created + INTERVAL 1 HOUR > now()

-- ✅ Put arithmetic on the constant side
WHERE created > now() - INTERVAL 1 HOUR
When a metadata.* filter is unavoidable, put indexable predicates first to narrow the scan before the metadata filter is applied:
SELECT id, metadata.tenant_id
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY  -- indexed, applied at scan
  AND status = 'error'                    -- indexed, applied at scan
  AND metadata.tenant_id = 'acme'         -- not indexable, applied after

Sort on indexed columns

When you ORDER BY a raw indexed column, rows come out of the index already in the requested order, and LIMIT can stop the scan early. When you sort on a computed expression, the engine has to fetch rows, project them, and sort them in memory before applying LIMIT. Common causes of slow sorts:
  • Sorting on a computed expression: ORDER BY lower(name), ORDER BY abs(score - 0.5).
  • Projecting fields that force a re-project step between the index and the sort.
-- ❌ Sort on a computed expression
ORDER BY lower(name)

-- ✅ Sort on a raw indexed column
ORDER BY created DESC
A slow sort costs the most without a LIMIT. Always combine ORDER BY with LIMIT. For project_logs() pagination, _xact_id and _pagination_key are the most index-friendly sort keys:
SELECT id, created, scores.accuracy
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
ORDER BY _xact_id DESC
LIMIT 100

Aggregate efficiently

GROUP BY queries get expensive at high cardinality. Two things matter most: how many distinct grouping-key values you have, and whether the engine can read those keys directly from indexed fast fields. Narrow the input before grouping. Add a time range filter and any other indexable predicates so the group-by works over a smaller set of rows.
-- ❌ High-cardinality group-by on unbounded input
SELECT user_id, count(*)
FROM project_logs('proj-id')
GROUP BY user_id

-- ✅ Narrow the time range first, then group
SELECT user_id, count(*)
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
GROUP BY user_id
Group on raw fast fields, not computed expressions. Brainstore has two aggregation paths. The fast path reads grouping keys directly from indexed fast fields. The slow path walks each document and extracts keys at query time. The engine takes the slow path when grouping on:
  • Computed expressions: GROUP BY lower(name), GROUP BY model || '-' || status.
  • Fields that aren’t fast fields, like arbitrary metadata.* keys.
  • Multi-key GROUP BY where any key is not a fast field.
-- ❌ Computed expression takes the slow path
SELECT lower(name) AS grp, count(*)
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
GROUP BY lower(name)

-- ✅ Group on the raw column
SELECT name, count(*)
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
GROUP BY name
-- ❌ Arbitrary metadata key is not a fast field
SELECT metadata.tenant_id, count(*)
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
GROUP BY metadata.tenant_id

-- ✅ Filter on metadata, group on a fast field
SELECT status, count(*)
FROM project_logs('proj-id')
WHERE created >= now() - INTERVAL 1 DAY
  AND metadata.tenant_id = 'acme'
GROUP BY status
Use two-step aggregation for trace-level metrics. When you want one row per trace (count of traces, average score per trace), use the two-step GROUP BY pattern from Aggregate span data across a trace: aggregate per trace with GROUP BY root_span_id in a spans-shape subquery, then aggregate across traces in the outer query. The inner GROUP BY root_span_id stays on Brainstore's fast index-based aggregation path.

Avoid GROUP BY on the summary shape for the same result. The summary shape pre-aggregates each trace server-side, and stacking another GROUP BY on top forces the engine off the fast path and doesn't scale at high trace volumes.

Cost also depends on the complexity of aggregate expressions: percentile() and count_distinct() cost more than count().
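As a sketch, the two-step pattern applied to a trace-level score (using scores.accuracy as in earlier examples):

```sql
-- Average per-trace accuracy over the last day
SELECT avg(trace_accuracy) AS avg_trace_accuracy
FROM (
  SELECT root_span_id, avg(scores.accuracy) AS trace_accuracy
  FROM project_logs('proj-id')
  WHERE created >= now() - INTERVAL 1 DAY
  GROUP BY root_span_id
) AS per_trace
```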

SQL sandbox warnings

The SQL sandbox surfaces these warnings when it detects a pattern from the practices above. Warnings are advisory. The query still runs and returns correct results.
| Warning | See section |
| --- | --- |
| "This query may scan too much data and run slowly. Add a range filter on created, _xact_id, or _pagination_key, or scope to a specific root_span_id or id." | Add a time range filter |
| "Query has no LIMIT clause, which may scan and return a large number of rows." | Add a LIMIT |
| "Query retrieves full span documents, which can be slow." | Project only the fields you need |
| "Query plan includes a standalone Filter above Index search, so filter pushdown may have failed." | Keep filters indexable |
| "Query plan includes a standalone Project above Index search, so sort pushdown may have failed." | Sort on indexed columns |
| "Query plan includes a GroupBy node, which can be expensive at high cardinality." | Aggregate efficiently |
| "Query plan uses Dynamic segment collector (dynamic group-by), which is typically slower than columnar group-by." | Aggregate efficiently |
The last two warnings are related. "Query plan includes a GroupBy node, which can be expensive at high cardinality" fires for any GROUP BY clause. "Query plan uses Dynamic segment collector (dynamic group-by), which is typically slower than columnar group-by" fires only when the engine takes the slow path. Seeing both together is the strongest signal that you need indexable predicates to narrow the input.