
Documentation Index

Fetch the complete documentation index at: https://braintrust.dev/docs/llms.txt

Use this file to discover all available pages before exploring further.

Summary

Issue: Requests to a model with a large context window return a context_length_exceeded error that reports a smaller token limit than expected.
Cause: Braintrust load balances across all providers that support a matching model name. If multiple providers match, requests can silently route to a secondary provider whose underlying deployment has a smaller context window.
Resolution: Disable or remove secondary providers that don’t have the correct model deployed or that have different limits.
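The mismatch described above can be spotted mechanically by comparing the limit reported in the error against the limit you expect for the model. This is a minimal sketch: the error-message format is modeled on OpenAI-style context errors, and the limit numbers are illustrative assumptions, not values from Braintrust.

```python
import re

# Expected context windows per model name. The numbers here are
# illustrative assumptions — substitute your providers' real limits.
EXPECTED_LIMITS = {"gpt-4o": 128_000}

def routed_to_smaller_deployment(model: str, error_message: str) -> bool:
    """True if a context_length_exceeded error reports a smaller limit
    than expected for this model, which suggests the request was
    silently routed to a provider with a different deployment."""
    match = re.search(r"maximum context length is (\d+)", error_message)
    if not match:
        return False
    reported_limit = int(match.group(1))
    return reported_limit < EXPECTED_LIMITS.get(model, 0)

# An 8192-token limit in the error, but we expected a 128k model:
err = "This model's maximum context length is 8192 tokens."
print(routed_to_smaller_deployment("gpt-4o", err))  # True
```

If the check returns True, follow the resolution steps below to find the provider that actually handled the request.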

Resolution steps

Step 1: Confirm which provider handled the request

Open the failed run’s trace in Braintrust and check:
  • The purple chat completion span (not the root span)
  • metadata.model — confirms which model ID was sent to the provider
If the model ID matches what you expected but the error shows a smaller context limit, the request may have been routed to a provider with a different underlying deployment.
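If you export the span for inspection, the check amounts to reading the model ID out of its metadata. The structure below is a hypothetical, trimmed example modeled on what the trace view displays; the field names in your actual export may differ.

```python
import json

# A trimmed, hypothetical span export — field names modeled on the
# trace view; verify against your real export before relying on them.
span_json = """
{
  "span_attributes": {"type": "llm", "name": "chat completion"},
  "metadata": {"model": "gpt-4o", "provider": "azure"}
}
"""

span = json.loads(span_json)
model_sent = span["metadata"]["model"]
provider_used = span["metadata"].get("provider", "unknown")
print(f"model: {model_sent}, provider: {provider_used}")
```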

Step 2: Fix provider configuration

Go to Settings > AI providers and do one of the following:
  • Option A — Fix the primary provider: Update the configuration for your providers to match their actual limits.
  • Option B — Remove/disable additional providers with matching models: Remove or disable any providers that don’t have the correct model deployed, or that map to older/smaller versions of the model.
  • Option C — Check custom model mappings: If using custom endpoints or Azure deployments, verify that the deployment name actually corresponds to the model you expect.
Note: Braintrust automatically load balances across all providers where a model name matches. Any provider with a matching model name will receive requests, even if the underlying deployment differs.
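The routing behavior in the note can be sketched as name-based matching followed by a random pick. The provider table below is hypothetical (names and limits are assumptions), but it shows why a secondary provider with a smaller deployment can silently receive a share of requests.

```python
import random

# Hypothetical provider table: both providers match the name "gpt-4o",
# but the secondary maps to an older deployment with an 8k window.
PROVIDERS = [
    {"name": "openai-primary", "models": {"gpt-4o": 128_000}},
    {"name": "azure-secondary", "models": {"gpt-4o": 8_192}},
]

def matching_providers(model: str) -> list[dict]:
    # Name-based matching: every provider listing this model name
    # is eligible to receive the request.
    return [p for p in PROVIDERS if model in p["models"]]

def route(model: str) -> dict:
    # Load balancing then picks among all matching providers, so a
    # request can land on the smaller deployment.
    return random.choice(matching_providers(model))

print([p["name"] for p in matching_providers("gpt-4o")])
```

Removing or disabling `azure-secondary` (Option B above) shrinks the candidate list to one, so every request reaches the correct deployment.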

Step 3: Verify the fix

Re-run your prompt with cache disabled to confirm requests now route to the correct provider and the context length error is resolved.
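As a sketch of the verification request: if you call models through the Braintrust AI proxy, caching can be bypassed per request with an `x-bt-use-cache` header. The endpoint path and header name here are assumptions drawn from the proxy's documented usage — confirm them in the proxy docs before relying on this.

```python
import json

# Sketch of a cache-bypassing re-run. The header forces a fresh
# completion instead of a cached one, so the request actually
# exercises provider routing. URL and header name are assumptions.
request = {
    "url": "https://api.braintrust.dev/v1/proxy/chat/completions",
    "headers": {
        "Authorization": "Bearer $BRAINTRUST_API_KEY",
        "x-bt-use-cache": "never",  # skip the cache for this run
    },
    "body": {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "<your long prompt>"}],
    },
}
print(json.dumps(request, indent=2))
```

If the re-run succeeds with the full expected context window, the request is reaching the correct provider.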