Summary
Issue: When using setup_pydantic_ai(), Braintrust incorrectly marks internal wrapper spans as type “llm”, so a single API call appears as four separate LLM calls in metrics and dashboards.
Cause: PydanticAI creates nested spans for streaming, agent execution, fallback handling, and the actual API call. All four are marked as type “llm”, with nothing distinguishing the wrapper spans from the actual API call.
Resolution: Filter queries by model name pattern so that only actual LLM API calls are counted and internal wrapper spans are excluded.
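The filtering idea in the resolution can be sketched in plain Python over exported span records. This is a minimal illustration under stated assumptions, not Braintrust's or PydanticAI's own API: the field names `span_attributes.type` and `metadata.model`, the model name prefixes, and the sample spans are all hypothetical and should be adjusted to match what your spans actually contain.

```python
import re

# Hypothetical model name prefixes; replace with the models you actually call.
MODEL_NAME_PATTERN = re.compile(r"^(gpt-|claude-|gemini-)", re.IGNORECASE)


def is_actual_llm_call(span: dict) -> bool:
    """Return True only for spans typed "llm" whose model name matches the pattern.

    Assumes (hypothetically) that the span type lives under
    span["span_attributes"]["type"] and the model name under
    span["metadata"]["model"].
    """
    attrs = span.get("span_attributes") or {}
    metadata = span.get("metadata") or {}
    model = metadata.get("model")
    return (
        attrs.get("type") == "llm"
        and isinstance(model, str)
        and MODEL_NAME_PATTERN.match(model) is not None
    )


def count_actual_llm_calls(spans: list) -> int:
    """Count spans that look like real LLM API calls, skipping wrapper spans."""
    return sum(1 for span in spans if is_actual_llm_call(span))


# Made-up sample: one API call wrapped in three internal spans, all typed "llm".
example_spans = [
    {"span_attributes": {"name": "stream wrapper", "type": "llm"}, "metadata": {}},
    {"span_attributes": {"name": "agent run", "type": "llm"}, "metadata": {}},
    {"span_attributes": {"name": "fallback wrapper", "type": "llm"}, "metadata": {}},
    {"span_attributes": {"name": "chat completion", "type": "llm"},
     "metadata": {"model": "gpt-4o"}},
]

if __name__ == "__main__":
    print(count_actual_llm_calls(example_spans))  # 1
```

Counting on the span type alone would report four calls for this sample. Adding the model name check brings the count back to one, provided the wrapper spans do not carry a matching model name. The same condition can be expressed in whatever query language your dashboard uses.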