Advanced topics related to self-hosting

This guide covers advanced topics related to self-hosting.

Data plane vs. Control plane

Braintrust's architecture has two main components: the data plane and the control plane. The data plane is the component that handles the actual data, while the control plane is the component that serves the UI along with metadata.

API vs. Full configuration

Braintrust offers two modes for self-hosting: API and Full. In API mode, you host the data plane (API) in your own environment, while the control plane (web app and metadata database) is hosted by Braintrust. In Full mode, you host both the data plane and the control plane in your own environment.

The primary difference is that the API configuration keeps the most sensitive data in your environment, behind an API that changes infrequently (you need to update it roughly 1-2 times per month), while the Braintrust team hosts the web app and metadata database, which are updated multiple times per day. The Full configuration, by contrast, has you host the web server and metadata database in your own environment as well, ensuring total isolation at the cost of additional maintenance overhead.

To clarify which data is stored in which location, here is a breakdown of the data stored in each place:

| Data | API | Full |
| --- | --- | --- |
| Experiment records (input, output, expected, scores, metadata, traces, spans) | Your env | Your env |
| Log records (input, output, expected, scores, metadata, traces, spans) | Your env | Your env |
| Dataset records (input, output, metadata) | Your env | Your env |
| Prompt playground prompts | Your env | Your env |
| Prompt playground completions | Your env | Your env |
| Human review scores | Your env | Your env |
| Experiment and dataset names | Global | Your env |
| Project names | Global | Your env |
| Project settings | Global | Your env |
| Git metadata about experiments | Global | Your env |
| Organization info (name, settings) | Global | Your env |
| Login info (name, email, avatar URL) | Global | Your env |
| Auth credentials | AWS Cognito (SSO, passwords) | Your env (password) |
| API keys (hashed) | Global | Your env |
| LLM provider secrets (encrypted) | Global | Your env |

Customizing the webapp URL

The SDKs guide users to the Braintrust web app (or to the URL set in the BRAINTRUST_APP_URL variable) to view their experiments. However, in certain advanced configurations, you may want the SDKs to send traffic to BRAINTRUST_APP_URL through a reverse proxy while pointing users to a different URL.

To do this, you can set the BRAINTRUST_APP_PUBLIC_URL environment variable to the URL of your webapp. By default, this variable is set to the value of BRAINTRUST_APP_URL, but you can customize it as you wish. This variable is only used to display information, so even its destination does not need to be accessible from the SDK.
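The fallback described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the SDK's actual code: the function name display_url, the example hosts, and the example path are all hypothetical.

```python
def display_url(env, path):
    """Build the link shown to users (illustrative sketch, not SDK code).

    BRAINTRUST_APP_PUBLIC_URL falls back to BRAINTRUST_APP_URL when unset,
    mirroring the default described in this guide.
    """
    app_url = env["BRAINTRUST_APP_URL"]  # where the SDK sends its requests
    public_url = env.get("BRAINTRUST_APP_PUBLIC_URL", app_url)  # display only
    return public_url.rstrip("/") + "/" + path.lstrip("/")


if __name__ == "__main__":
    # Hypothetical hosts: the SDK talks to an internal reverse proxy,
    # while users are pointed at a different, browser-reachable URL.
    env = {
        "BRAINTRUST_APP_URL": "https://proxy.internal.example",
        "BRAINTRUST_APP_PUBLIC_URL": "https://braintrust.example.com",
    }
    print(display_url(env, "/app/my-org"))
```

Because the public URL is only ever formatted into output, nothing in this sketch attempts to connect to it, which is why it does not need to be reachable from the machine running the SDK.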

Constraining the SDKs to the data plane

If you're self-hosting the data plane, it may also be advantageous to constrain the SDKs to only communicate with your data plane. Normally, they communicate with the control plane to:

  • Get your data plane's URL
  • Register and retrieve metadata (e.g. about experiments)
  • Print URLs to the webapp

The data plane can proxy the endpoints that the SDKs use to communicate with the control plane, allowing your SDKs to only communicate with the data plane directly. Simply set the BRAINTRUST_APP_URL environment variable to the URL of your data plane and BRAINTRUST_APP_PUBLIC_URL to "" (or the URL of your webapp).
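As a rough sketch, the two variables could be prepared in Python before launching a process that uses the SDK. The helper name data_plane_only_env and both URLs are hypothetical placeholders; only the two environment variable names come from this guide.

```python
import os


def data_plane_only_env(data_plane_url, webapp_url, base_env=None):
    """Return a copy of the environment that points the SDK at the data plane.

    Illustrative helper only: it sets the two variables named in this guide
    and leaves everything else untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["BRAINTRUST_APP_URL"] = data_plane_url    # SDK traffic goes here
    env["BRAINTRUST_APP_PUBLIC_URL"] = webapp_url  # only used in printed links
    return env


if __name__ == "__main__":
    # Both URLs below are assumptions for the sake of the example.
    env = data_plane_only_env(
        "https://braintrust-api.internal.example",
        "https://braintrust.example.com",
    )
    print(env["BRAINTRUST_APP_URL"], env["BRAINTRUST_APP_PUBLIC_URL"])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting a script that uses the SDK, so the setting applies to that process only rather than to the whole shell session.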

Allow-list URLs

In some cases, you may want to restrict the URLs that the SDKs or API server can communicate with. If so, you should include the following URLs:

Configuring rate limits

By default, the Braintrust API server rate-limits its requests to any external domains it contacts, such as the BRAINTRUST_APP_URL. The purpose of rate limiting is to avoid unintentionally overloading those domains, which might otherwise block the API server's IP address in response.

By default, the rate limit is 100 requests per minute per user auth token. The API server exposes the following variables to configure the rate limits:

  • OUTBOUND_RATE_LIMIT_MAX_REQUESTS: The number of requests allowed per time window. Set it to 0 to disable rate limiting. You can set this variable with the --outbound-rate-limit-max-requests flag of the braintrust CLI, or with the OutboundRateLimitMaxRequests CloudFormation template parameter.
  • OUTBOUND_RATE_LIMIT_WINDOW_MINUTES: The length of the time window, in minutes, before the rate limit resets. You can set this variable with the --outbound-rate-limit-window-minutes flag of the braintrust CLI, or with the OutboundRateLimitWindowMinutes CloudFormation template parameter.
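
To make the interaction between the two variables concrete, here is a minimal fixed-window limiter in Python. It is a sketch of the semantics described above (a per-token counter that resets after each window, with 0 disabling the limit), not the API server's actual implementation. The class and function names are hypothetical, and the fallback values of 100 requests and 1 minute are assumptions inferred from the stated default of 100 requests per minute.

```python
import os
import time


class FixedWindowLimiter:
    """Per-token fixed-window rate limiter (illustrative sketch)."""

    def __init__(self, max_requests, window_minutes, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_minutes * 60
        self.clock = clock
        self._windows = {}  # token -> (window_start, request_count)

    def allow(self, token):
        if self.max_requests == 0:
            return True  # 0 disables rate limiting
        now = self.clock()
        start, count = self._windows.get(token, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window elapsed: reset the counter
        if count >= self.max_requests:
            self._windows[token] = (start, count)
            return False
        self._windows[token] = (start, count + 1)
        return True


def limiter_from_env(env=os.environ):
    # Fallbacks are assumed from the documented default (100 per minute).
    max_requests = int(env.get("OUTBOUND_RATE_LIMIT_MAX_REQUESTS", "100"))
    window_minutes = float(env.get("OUTBOUND_RATE_LIMIT_WINDOW_MINUTES", "1"))
    return FixedWindowLimiter(max_requests, window_minutes)


if __name__ == "__main__":
    limiter = limiter_from_env({"OUTBOUND_RATE_LIMIT_MAX_REQUESTS": "2"})
    print([limiter.allow("token-a") for _ in range(3)])
```

With a limit of 2, the third request from the same token inside one window is rejected, while a different token keeps its own independent counter.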
