
Summary

Goal: Stitch multi-turn chat into a single Braintrust trace with a clean Thread view using the Vercel AI SDK. Features: traced(), Span.export(), wrapAISDK, session-root span pattern, full message history per turn.

Configuration steps

Step 1: Avoid previous_response_id for tracing

When you use previous_response_id, OpenAI manages conversation history server-side and only sends the current turn’s input to the API. Braintrust traces what it receives per call, so only the current turn appears in the trace. Instead, maintain a messages array in your application and pass the full history to each streamText call.
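The full-history approach can be sketched as a small pure helper (`withTurn` is a hypothetical name, not part of either SDK): each turn appends both the user message and the assistant reply to an array, and the whole array is what you pass to the next `streamText` call.

```typescript
// Hypothetical helper for keeping the full conversation in application
// state. Because the entire array is sent on every model call, each
// Braintrust span captures the complete history, not just the latest turn.
type ChatMessage = { role: "user" | "assistant"; content: string };

function withTurn(
  history: ChatMessage[],
  userContent: string,
  assistantContent: string
): ChatMessage[] {
  return [
    ...history,
    { role: "user", content: userContent },
    { role: "assistant", content: assistantContent },
  ];
}
```

This is the inverse of the `previous_response_id` model: your application, not OpenAI, owns the history, so every traced call is self-describing.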

Step 2: Create a session-root span on the first request

Mint a dedicated root span at the start of the session. Call export() immediately and persist the handle — an httpOnly cookie works well for Next.js. End the span after exporting.
import { traced } from "braintrust";

let sessionRootId: string | undefined;

await traced(async (span) => {
  sessionRootId = await span.export();
}, { name: "chat.session" });

// Persist sessionRootId to an httpOnly cookie
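The cookie persistence might look like the following framework-agnostic sketch (`sessionCookieHeader` and the cookie name `bt_session_root` are hypothetical). In Next.js you would set the same value via `cookies().set()` in a route handler rather than building the header by hand.

```typescript
// Hypothetical sketch: serialize the exported span handle into an
// httpOnly session cookie header. The handle is an opaque string, so
// it only needs URL-encoding before being stored.
function sessionCookieHeader(sessionRootId: string): string {
  return [
    `bt_session_root=${encodeURIComponent(sessionRootId)}`,
    "HttpOnly",
    "Path=/",
    "SameSite=Lax",
  ].join("; ");
}
```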
Ending the span before children attach is supported. The SDK resolves parent relationships from the IDs encoded in the export() string (span_id, root_span_id, object_id), not from a live span reference, so children attach correctly whether or not the parent span is still open.

Cosmetic side effect: because the session span ends immediately after export(), it records a near-zero duration in the trace waterfall. This does not affect correctness — the backend assembles the tree from IDs, not time ordering.

Workaround for clean timestamps: if your application has a deterministic session-end event (an explicit close or a timeout), you can fix the session span's end time with updateSpan():
import { updateSpan, flush } from "braintrust";

// On the final turn or at session close:
updateSpan({
  exported: sessionRootId, // from the cookie
  metrics: { end: Date.now() / 1000 }, // unix timestamp in seconds
});
await flush();

Step 3: Attach every turn as a child of the session root

Pass the persisted sessionRootId as the parent option on every subsequent traced() call, including turn 1.
import { traced } from "braintrust";
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai"; // or the wrapAISDK-wrapped version (see Step 4)

const messages = [];

// Turn 1
await traced(async (span) => {
  messages.push({ role: "user", content: userMessage });

  const result = streamText({
    model: openai("gpt-4.1"),
    messages,
  });

  messages.push({ role: "assistant", content: await result.text });
}, { name: "chat.request", parent: sessionRootId });

// Turn 2 — same pattern, same parent
await traced(async (span) => {
  messages.push({ role: "user", content: followUpMessage });

  const result = streamText({
    model: openai("gpt-4.1"),
    messages,
  });

  messages.push({ role: "assistant", content: await result.text });
}, { name: "chat.request", parent: sessionRootId });
All turns land under one session root. Thread view renders them in order.

Step 4: Keep wrapAISDK — two spans per tool-using turn is expected

With tool calling enabled, streamText makes multiple LLM roundtrips: one to decide on the tool call, one to respond after. This produces two doStream spans per tool-using turn. This is expected behavior, not a duplication bug. Thread view shows each LLM call. Use custom views if you want finer control over which spans are displayed.
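For reference, wiring up wrapAISDK looks roughly like the following; check the Braintrust SDK docs for the exact signature in your version. The wrapped functions are drop-in replacements, which is how the doStream spans above get recorded.

```typescript
// Wrap the Vercel AI SDK module so every generateText/streamText call
// emits Braintrust spans automatically, including per-roundtrip
// doStream spans when tool calling triggers multiple LLM calls.
import * as ai from "ai";
import { wrapAISDK } from "braintrust";

const { streamText, generateText } = wrapAISDK(ai);

// Use streamText/generateText exactly as you would the unwrapped
// versions — tracing is transparent to the call sites.
```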

Benefits of the session-root pattern

| Property | Session-root pattern | Linear chaining |
| --- | --- | --- |
| Session in Logs view | First-class row with conversation-level metadata | Requires grouping by metadata |
| Turn structure | Every turn is structurally identical | Each turn references the prior turn |
| Tree depth | Constant | Grows with conversation length |
| Thread view | Correct | Correct |