Discover insights with Topics

Topics automatically analyze and classify your logs, discovering patterns and insights across all your traces. Topics help you:

Surface user intents: Discover what users are trying to accomplish.
Identify friction patterns: Find where users struggle, get confused, or encounter errors.
Track sentiment trends: Monitor user satisfaction and frustration across interactions.
Analyze performance: Understand how your AI responds, where it fails, and what issues recur.

Topics is in beta for Pro and Enterprise plans. Free plan users can select Request access to join the beta.

How it works

You create topic maps, automated analysis workflows that extract summaries from your traces and cluster them into named topics. Braintrust provides three built-in topic maps:

The Task topic map analyzes user intents or goals.
The Sentiment topic map analyzes emotional tone.
The Issues topic map identifies problems with agent behavior or responses.

Each topic map processes your traces through a multi-stage pipeline:

Preprocessing

A preprocessor (a JavaScript function) transforms the raw trace data into text that AI models can analyze. The built-in thread preprocessor takes your conversation messages, tool calls, and nested spans, and formats them into a readable narrative. For example, a complex trace with multiple LLM calls and tool invocations becomes: “User asked about creating datasets. Assistant explained the process and called the create_dataset tool…”
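To make the idea concrete, here is a minimal sketch of the kind of transformation a thread-style preprocessor performs. It is not the built-in thread preprocessor; the `ChatMessage` shape, the `toolName` field, and the `formatThread` function are assumptions made for illustration.

```typescript
// Hypothetical message shape; real trace spans may carry more fields.
interface ChatMessage {
  role: "user" | "assistant" | "tool";
  content: string;
  toolName?: string; // assumed field for tool results, for illustration only
}

// Turn a list of messages into a plain-text narrative, one line per turn.
function formatThread(messages: ChatMessage[]): string {
  return messages
    .map((m) => {
      if (m.role === "tool") {
        return `Tool ${m.toolName ?? "(unknown)"} returned: ${m.content}`;
      }
      const speaker = m.role === "user" ? "User" : "Assistant";
      return `${speaker}: ${m.content}`;
    })
    .join("\n");
}

const narrative = formatThread([
  { role: "user", content: "How do I create a dataset?" },
  { role: "assistant", content: "I'll create one for you." },
  { role: "tool", toolName: "create_dataset", content: "ok" },
]);
console.log(narrative);
```

The output is a single string, which is the form a facet prompt can read directly.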

Summary extraction

An AI prompt (called a facet) analyzes the preprocessed text and extracts a concise summary. The built-in facets include Task (extracts user intents like “Creating a dataset”), Sentiment (extracts emotional tone like “POSITIVE” or “FRUSTRATED”), and Issues (identifies problems like “API timeout error”). Each facet runs on sampled traces according to your sampling rate, building up a collection of summaries over time.

Topic generation

Once enough summaries are collected (typically 100+), you generate topics by clustering similar summaries together. A clustering algorithm groups summaries like “Requested refund,” “Frustrated with checkout,” and “Coupon code issues” into a single cluster, which then gets named by an AI model. You can review these generated topics and save them.
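The sketch below illustrates how a minimum cluster size separates named clusters from unclustered summaries. It is a toy stand-in, not the algorithm Braintrust uses: it groups summaries greedily by word overlap (Jaccard similarity), whereas a production system would typically compare semantic embeddings. The function names, the threshold, and the sample summaries are all assumptions.

```typescript
// Lowercased set of words in a summary.
function tokens(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
}

// Jaccard similarity: size of the intersection divided by size of the union.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

interface ClusterResult {
  clusters: string[][]; // groups that reached the minimum size
  noise: string[]; // summaries left unclustered
}

// Greedy grouping: each summary joins the first group whose first member is
// similar enough, otherwise it starts a new group.
function clusterSummaries(
  summaries: string[],
  threshold: number,
  minClusterSize: number
): ClusterResult {
  const groups: string[][] = [];
  for (const s of summaries) {
    const home = groups.find(
      (g) => jaccard(tokens(g[0]), tokens(s)) >= threshold
    );
    if (home) home.push(s);
    else groups.push([s]);
  }
  const clusters: string[][] = [];
  const noise: string[] = [];
  for (const g of groups) {
    if (g.length >= minClusterSize) clusters.push(g);
    else noise.push(...g);
  }
  return { clusters, noise };
}

const result = clusterSummaries(
  [
    "Requested refund for order",
    "Requested refund for subscription",
    "Coupon code not working",
    "Coupon code rejected at checkout",
    "Asked about API rate limits",
  ],
  0.25,
  2
);
console.log(result.clusters.length, result.noise);
```

With these sample inputs, the two refund summaries and the two coupon summaries each form a cluster, and the rate-limit summary falls below the minimum size and is reported as noise.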

Classification

When you save a topic map, you can choose to automatically classify new and existing traces with the topics in the map. In this case, the system determines which saved topic best matches the extracted summary and adds a classification label to the trace (e.g., classifications.Task.label = "Dataset creation"). These classifications appear in your logs table and can be queried with SQL.

The entire pipeline runs automatically based on your sampling rate and can process both new and historical logs. You can also create custom topic maps to analyze your traces from different angles.

Create your first topic maps

Topics requires at least 400 traces in your project to get started.

In your project, go to Topics.
Select which built-in topic maps to create:
- Task: Categorize user intents and goals
- Sentiment: Classify emotional tone (POSITIVE, NEGATIVE, NEUTRAL)
- Issues: Identify agent problems and errors
You can create all three or choose specific ones based on your needs.
(Optional) Expand Settings to customize preprocessing, sampling rate, or idle timeout.
(Optional) Use Test on a sample log to verify extraction quality before enabling.
Click Create topic maps.

Braintrust begins processing traces and extracting summaries at the configured sampling rate (100% by default).

Generate topics

Once a topic map has extracted enough trace summaries, you can generate topics by clustering similar summaries together.

You need at least 100 trace summaries to generate topics.

Select a topic map from the sidebar. The UI shows how many traces from the last 3 days have been summarized for the topic map.
(Optional) Click Generate > Advanced clustering options to customize the clustering algorithm, minimum cluster size, and other advanced settings. Most users can use the defaults.
Click Generate topics. Braintrust clusters similar summaries and displays the results with auto-generated topic names and descriptions.
Review the results to ensure they accurately reflect your data patterns.
- Open the topics visualization to see how summaries are grouped.
- Review sample summaries in each topic to verify they’re semantically similar.
- Look for topics with unclear or overlapping names.
- Check how many traces don’t match any topic (shown in the “Noise/Unclustered” section).
If the generated topics don’t meet your needs, see Refine your topics.
Click Save topics > Save.
- To automatically classify new traces with these topics, enable Classify incoming traces with these topics before saving.
- To apply classifications to existing traces, enable Process existing traces.
Classifications only appear on traces when the corresponding toggle is enabled before saving.

Each generation creates a new version, so you can iterate and improve your categorization over time.

Refine your topics

If generated topics don’t meet your needs, you have two options:

Regenerate with different settings
- Expand Settings under Configuration and adjust the time range or sample rate for processing existing traces.
- Click Generate > Advanced clustering options and adjust the minimum cluster size (increase for broader topics, decrease for more granular topics), try different clustering algorithms, or change other advanced options.
Click Generate topics again to see new results. Each generation is saved as a version, so you can experiment freely.
Create a custom topic map
- Create a custom topic map when the built-in ones (Task, Sentiment, Issues) don’t capture the patterns you need. See Create custom topic maps for more details.

Use your topics

Visualize topic distributions

View all your topic maps and their distributions in one place.

Go to Logs.
Select Display > Layout > Topics.

Alternatively, from the Topics page, click Go to logs topic view to access this layout. The Topics layout displays each topic map as a card showing its topics with percentages and trace counts.

To filter the logs table to traces in a specific topic, click on the topic card.
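If you export classification labels and want to reproduce the percentages and trace counts shown on a topic card, the arithmetic can be sketched as follows. The `topicDistribution` function and its input (a flat list of labels, one per classified trace) are assumptions for illustration, not part of the Braintrust SDK.

```typescript
interface TopicShare {
  label: string;
  count: number;
  percent: number; // share of all labeled traces, rounded to one decimal
}

// Count how often each label occurs and express it as a percentage.
function topicDistribution(labels: string[]): TopicShare[] {
  const counts = new Map<string, number>();
  for (const label of labels) {
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  const shares: TopicShare[] = [];
  counts.forEach((count, label) => {
    shares.push({
      label,
      count,
      percent: Math.round((count / labels.length) * 1000) / 10,
    });
  });
  // Largest topics first, as on a summary card.
  return shares.sort((a, b) => b.count - a.count);
}

const distribution = topicDistribution([
  "Dataset creation",
  "Dataset creation",
  "Pricing questions",
  "Dataset creation",
]);
console.log(distribution);
```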

Build datasets from topics

Extract logs matching specific topics to create focused evaluation datasets.

Go to Logs and click Filter.
Select Classifications and choose the classification you want to filter by. Alternatively, click SQL and enter a SQL filter clause. See the SQL reference for more query patterns.
classifications.Task.label = "Dataset creation"
Review the filtered logs to ensure they match your criteria.
Select the logs you want to include.
Click + Dataset.
Choose an existing dataset or create a new one.

Common use cases:

Create a dataset of “Error Investigation” tasks to test your error handling
Build a dataset of negative sentiment interactions to improve responses
Extract “Pricing Questions” to evaluate your pricing explanations

See Build datasets for more on working with datasets.

Score logs based on topics

Use topic classifications to automatically score logs and identify issues. You can create scorers that:

Flag all logs with negative sentiment for review
Score logs with specific issue types lower
Trigger alerts when certain topics appear together

Example scorer that flags negative checkout experiences:

import braintrust from "braintrust";
import { z } from "zod";

const project = braintrust.projects.create({ name: "my-project" });

project.scorers.create({
  name: "Checkout experience",
  slug: "checkout-experience",
  description: "Flag traces with negative checkout experiences",
  parameters: z.object({
    trace: z.any(),
  }),
  handler: async ({ trace }) => {
    if (!trace) return { score: null };

    // Get all spans from the trace
    const spans = await trace.getSpans();

    // Find the root span which contains classifications
    const rootSpan = spans.find(
      (s) => s.span_id === s.root_span_id
    );

    if (!rootSpan) return { score: null };

    // Access classifications from the root span
    const classifications = rootSpan.classifications || {};
    const taskClassification = (classifications.Task || [{}])[0];
    const sentimentClassification = (classifications.Sentiment || [{}])[0];

    // Check if this is a negative checkout experience
    if (
      taskClassification.label === "Checkout Flow" &&
      sentimentClassification.label === "NEGATIVE"
    ) {
      return {
        score: 0,
        metadata: { reason: "Negative sentiment during checkout" },
      };
    }

    return { score: 1 };
  },
});

To use this scorer:

Save the code to a file (for example, topic_scorer.ts).
Push it to Braintrust:
npx braintrust push topic_scorer.ts
Go to Settings > Project > Automations and click + Rule.
Select your “Checkout experience” scorer.
Set Scope to Trace (required for accessing classifications).
Configure your sampling rate and, optionally, a SQL filter.
Click Create rule.

See Score online and Trace-level scorers for more details.
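Before pushing a scorer, it can help to check its decision logic locally against hand-built inputs. The sketch below pulls the label check out of the handler above into a pure function. The `Classifications` type is inferred from how that handler reads the root span (a list of entries per topic map, using the first entry's label); it is an assumption, not a documented schema, and `firstLabel` and `scoreCheckout` are hypothetical helper names.

```typescript
// Assumed shape: each topic map name maps to a list of classification entries.
type Classifications = Record<string, Array<{ label?: string }> | undefined>;

// Return the label of the first entry for a topic map, if there is one.
function firstLabel(c: Classifications, topicMap: string): string | undefined {
  const entries = c[topicMap];
  return entries && entries.length > 0 ? entries[0].label : undefined;
}

interface ScoreResult {
  score: number;
  metadata?: { reason: string };
}

// Same rule as the handler: score 0 for a negative checkout, otherwise 1.
function scoreCheckout(c: Classifications): ScoreResult {
  if (
    firstLabel(c, "Task") === "Checkout Flow" &&
    firstLabel(c, "Sentiment") === "NEGATIVE"
  ) {
    return {
      score: 0,
      metadata: { reason: "Negative sentiment during checkout" },
    };
  }
  return { score: 1 };
}

const flagged = scoreCheckout({
  Task: [{ label: "Checkout Flow" }],
  Sentiment: [{ label: "NEGATIVE" }],
});
const passed = scoreCheckout({ Task: [{ label: "Checkout Flow" }] });
console.log(flagged.score, passed.score);
```

Keeping the rule in a pure function means the handler only has to fetch the root span and pass its classifications through.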

Assign topics for review

Assign logs matching specific topics for human review.

Go to Logs and click Filter.
Select Classifications and choose the classification you want to filter by. Alternatively, click SQL and enter a SQL filter clause. See the SQL reference for more query patterns.
classifications.Task.label = "Dataset creation"
Review the filtered logs to ensure they match your criteria.
Select the logs you want to include.
Select Assign.
Choose a member to assign.

Team members receive email notifications when rows are assigned to them.

See Add human feedback for more on human review.

Version topic maps

Every time you generate topics and save, a new version of the topic map is created automatically. This lets you compare topics across topic map versions.

To automatically classify new traces with this new version, enable Classify incoming traces with these topics before saving.
To classify existing traces with this new version, enable Process existing traces. If traces have classifications from a previous version, those classifications are overwritten.

Create custom topic maps

Create a custom topic map when the built-in ones (Task, Sentiment, Issues) don’t capture the patterns you need:

Domain-specific categories: Your logs have patterns that built-in topic maps don’t capture.
Too many uncategorized traces: The built-in topic maps aren’t extracting relevant summaries.
Wrong level of detail: You need more specific categorization (e.g., distinguish between different API endpoints instead of just “API request”).
Business-specific needs: Track patterns unique to your product (e.g., “Feature requests,” “Pricing questions,” “Integration issues”).

For example, if you’re trying to assess customer churn risk, create a custom “Churn Risk” topic map that analyzes conversation summaries to determine whether a user is at low, medium, high, or critical risk of churning, based on their satisfaction, language, and outcome.

Go to Topics and click + Topic map.
Give your topic map a name and description. Example:
- “Churn risk”
- “Topic map for assessing customer churn risk based on conversations”
Select Facet > + Custom facet.
Enter a descriptive name for the facet.

Choose a preprocessor to transform your trace data.

Select Preprocessor > thread (default) to format traces as conversation threads.

Or select + Custom preprocessor to write a JavaScript function that filters or transforms your data. Common patterns:

// Extract only user messages from a conversation
function userMessagesOnly(span) {
  const messages = span.input?.messages || [];
  const userMessages = messages
    .filter(m => m.role === 'user')
    .map(m => m.content)
    .join('\n\n');

  return userMessages || '(No user messages found)';
}

Select an AI model for summary extraction.

Enter a prompt with clear instructions for what to extract. Example:

Based on this conversation, assess the churn risk for this customer.

Consider:

- Frustration level and language used (complaints, strong negative words)
- Whether their issue was resolved satisfactorily
- Mentions of competitors, alternatives, or cancellation
- Overall satisfaction signals (thanks, happy, vs angry, disappointed)
- Severity and recurrence of issues

Classify as:
- LOW_RISK: Satisfied customer, issue resolved, positive interaction
- MEDIUM_RISK: Some frustration but issue handled, no major red flags
- HIGH_RISK: Frustrated customer, unresolved issues, or mentions of dissatisfaction
- CRITICAL: Explicitly mentioned canceling, switching to competitor, or very angry

Respond with the label followed by a colon and the key risk indicators (one sentence).

Examples:
- "LOW_RISK: User thanked the agent and confirmed their billing question was answered."
- "HIGH_RISK: User expressed frustration about repeated API errors and said this is unacceptable."
- "CRITICAL: User stated they are considering switching to a competitor if issues persist."

Click Test to verify extraction quality on sample traces.
Click Create.
Click Create topic map to save your custom topic map.

Once created, your custom topic map will begin processing traces. Follow the Generate topics steps to extract summaries, cluster them, and save your topics.
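A prompt that asks for "label, colon, one sentence" produces summaries that are easy to split apart if you later consume them in your own code, for example inside a scorer. The parser below is a minimal sketch under that assumption; `parseRiskSummary` and the `RiskLabel` type are hypothetical names, and the parser accepts labels written with either spaces or underscores.

```typescript
type RiskLabel = "LOW_RISK" | "MEDIUM_RISK" | "HIGH_RISK" | "CRITICAL";

const RISK_LABELS: RiskLabel[] = [
  "LOW_RISK",
  "MEDIUM_RISK",
  "HIGH_RISK",
  "CRITICAL",
];

interface ParsedSummary {
  label: RiskLabel;
  indicators: string; // the sentence that follows the colon
}

// Split "LABEL: indicators" into its parts; return null if the text does not
// start with one of the known labels followed by a colon.
function parseRiskSummary(summary: string): ParsedSummary | null {
  const idx = summary.indexOf(":");
  if (idx === -1) return null;
  // Normalize "low risk" or "LOW RISK" to "LOW_RISK".
  const raw = summary.slice(0, idx).trim().toUpperCase().replace(/\s+/g, "_");
  const label = RISK_LABELS.find((l) => l === raw);
  if (!label) return null;
  return { label, indicators: summary.slice(idx + 1).trim() };
}

const parsed = parseRiskSummary(
  "HIGH_RISK: User expressed frustration about repeated API errors."
);
console.log(parsed);
```

Returning null for unrecognized text lets calling code decide how to treat summaries that drift from the requested format.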

Troubleshooting

Can't generate topics

Insufficient summaries: Topic generation requires at least 100 unique summaries from the selected time range.

Select a longer time range.
Increase sampling rate to process more traces.

Poor topic quality

Too many uncategorized summaries: If more than 20% of summaries are uncategorized, your topics may be too specific.

Regenerate with a larger minimum cluster size (e.g., 10-15).
Try a broader time range to capture more pattern variation.

Topics too generic: If topics are too broad (e.g., all “User questions”), increase granularity.

Regenerate with a smaller minimum cluster size (e.g., 3-5).
Refine your facet prompt to extract more specific summaries.

Summaries too similar: If the facet extracts nearly identical summaries, clustering can’t differentiate.

Review sample summaries in the topic map.
Refine your facet prompt to capture more distinguishing details.

Missing classifications on logs

Classification not enabled: You enable classification on existing and new traces when saving a topic map.

Select your topic map and click Save topics.
Enable Classify incoming traces with these topics to automatically classify new traces.
Enable Process existing traces to apply classifications to existing traces.

Next steps

View classifications in your logs table and trace details
Query with SQL for advanced analysis
Monitor trends with dashboards and charts
Create evaluation datasets from specific topics
Set up online scoring based on classifications
