Skip to main content
Applies to:


Summary When using COUNT(DISTINCT field) in a BTQL query with GROUP BY, the result returns the total row count for each group instead of returning 1 per distinct group, causing apparent duplicate user IDs in logs aggregation queries. This is the current expected behavior of how BTQL processes COUNT(DISTINCT) aggregations within GROUP BY clauses. Use count_distinct(field) function instead of COUNT(DISTINCT field), or restructure the query to avoid COUNT(DISTINCT) in GROUP BY contexts.

Resolution Steps

If you need distinct counts per group

Step 1: Replace COUNT(DISTINCT) with count_distinct()

Use the count_distinct() function in your query instead of COUNT(DISTINCT field).
SELECT
  metadata.user,
  count_distinct(metadata.user) as distinct_users,
  COUNT(*) as total_rows
FROM project_logs('PROJECT_ID', shape => 'spans')
WHERE created > NOW() - INTERVAL 20 day
  AND span_attributes.name = 'streamChat'
  AND metadata.user IS NOT NULL
GROUP BY metadata.user

Step 2: Verify results in SQL sandbox

Test the query in the Braintrust SQL sandbox to confirm it returns the expected distinct counts.

If you need time-series distinct counts for monitoring dashboards

Step 1: Group by time period and user

Add date grouping to get distinct user counts distributed by day.
SELECT
  DATE(created) as date,
  metadata.user,
  COUNT(*) as event_count
FROM project_logs('PROJECT_ID', shape => 'spans')
WHERE created > NOW() - INTERVAL 3 day
  AND span_attributes.name = 'streamChat'
  AND metadata.user IS NOT NULL
GROUP BY DATE(created), metadata.user
ORDER BY date DESC, event_count DESC

Step 2: Note monitoring dashboard limitations

The count_distinct() function may not populate data in monitoring dashboards due to current platform limitations.