Loop
Loop is an AI assistant in Braintrust playgrounds, experiments, datasets, and logs.
In playgrounds, it helps you generate and optimize prompts, datasets, and evals. On the experiments page, it helps you read and interpret a project's experiments. In datasets, it helps you generate and edit rows at scale. In logs, it helps you find analytical insights about your project.
Loop is in public beta and is off by default. To turn it on, enable the feature flag in your settings. If you are on a hybrid deployment, Loop is available starting with v0.0.74.
Selecting a model
Loop uses the AI models available in your Braintrust account via the Braintrust AI Proxy. We currently support the following models:
- claude-4-sonnet
- claude-4.1-opus
- gpt-5
- gpt-4.1
- o3
- o4-mini
- claude-3-5-sonnet
To choose a model, navigate to the gear icon in the Loop chat window and select from the list of available models.
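Loop talks to these models through the Braintrust AI Proxy, which exposes an OpenAI-compatible API. As a point of reference, here is a minimal sketch of calling the proxy directly with one of the models above, assuming a `BRAINTRUST_API_KEY` environment variable is set (the prompt content is illustrative):

```python
import os
from openai import OpenAI

# The Braintrust AI Proxy is OpenAI-compatible, so the standard OpenAI
# client works against it; authenticate with your Braintrust API key.
client = OpenAI(
    base_url="https://api.braintrust.dev/v1/proxy",
    api_key=os.environ["BRAINTRUST_API_KEY"],
)

# The model name must be one of the models listed above.
response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```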
Available tools
Loop currently has the following tools. Tool availability changes based on the page you are viewing:
- Get summarized results: Fetch summarized data for the current page's contents
- Get detailed results: Retrieve detailed data for the current page's contents (evaluation results, dataset rows, etc.)
- Edit prompt: Generate and modify prompts in the playground
- Run eval: Execute evaluations in the playground
- Edit data: Generate and modify datasets
- Get scorers: Get all available scorers in the project
- Edit scorers: Edit the scorer selection in the playground
- Create code scorer: Create or edit a code-based scorer
- Create LLM judge scorer: Create or edit an LLM judge scorer
- BTQL query: Generate and run BTQL queries against project logs
- Infer schema: Inspect project logs to build an understanding of the shape of your data
- Continue execution: Resume tasks after Loop has run out of iterations
You can remove any of these tools from your Loop workflow by selecting the gear icon and deselecting a tool from the available list.
Generating and optimizing prompts
Loop can help you generate a prompt from scratch. To do so, make sure you have an empty task open, then ask Loop to generate one in the chat window.
If you have existing prompts, you can optimize them using Loop.
To optimize a prompt, ask Loop in the chat window, or select the Loop icon in the top bar of any existing task. From there, you can add the prompt to your chat or select Quick optimize.
After Loop provides a suggested optimization, you can review and accept the suggestion or keep iterating.
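For instance, asking Loop to "write a prompt that summarizes customer support tickets" might produce a template like the one below. The wording is hypothetical; the `{{input}}` placeholder is Braintrust's mustache syntax for substituting dataset fields at run time:

```
You are a support analyst. Summarize the following customer ticket in
two sentences, noting the customer's issue and requested resolution.

Ticket: {{input}}
```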
Generating and optimizing datasets
If no dataset exists, Loop can create one automatically. You must have a task defined so that Loop can generate a dataset tailored to the evaluation.
You can review the dataset and further refine it as needed.
After you run your playground, you can also ask Loop to optimize your dataset. The agent will suggest areas for improvement based on an analysis of your current dataset.
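Dataset rows in Braintrust have an `input`, an optional `expected` output, and optional `metadata`. A hypothetical example of the shape of a row Loop might generate (all field values here are illustrative):

```json
{
  "input": "My order arrived damaged. Can I get a refund?",
  "expected": "Apologize, confirm the order number, and offer a refund or replacement.",
  "metadata": { "category": "refunds", "source": "loop-generated" }
}
```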
Analyze project logs
Loop can understand the shape of your project's log data and run arbitrary queries to answer questions about it. You can use this ability on its own to find analytical insights, or in conjunction with Loop's other abilities.
For analytical insights, you can ask things like "What are the most common errors?", "What are the most common inputs from users?", or "What user retention trends do you see?" Loop will gather the necessary data from your logs to answer your question.
To combine this with Loop's other abilities, you might navigate to the dataset page and ask Loop, "Can you find the most common errors users face and generate dataset rows based on the findings? Follow the formatting of the existing rows in this dataset." Loop will gather the necessary context from your logs and generate the dataset rows based on its findings.
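Under the hood, the BTQL query tool generates and runs queries like the following against your project logs. This is a rough sketch only: the aggregation clauses reflect typical BTQL, but field names such as `error` depend entirely on your log schema, which Loop discovers via the Infer schema tool:

```
dimensions: error
measures: count(1) as occurrences
from: project_logs('<your project id>')
filter: error is not null
sort: occurrences desc
limit: 10
```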
Generating and editing scorers
If no scorers exist, Loop can create one for you. You must have a dataset and a task in order for Loop to generate a scorer specific to your use case. The agent begins by checking what data you have and which scorers already exist, then fetches some sample results to understand the data structure.
If you select Accept, the new scorer will be added to the playground.
Loop can also help you improve and edit existing scorers.
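As a concrete illustration, a code-based scorer that Loop scaffolds might look something like the sketch below. This assumes a Python scorer with a `handler` entry point returning a score between 0 and 1; the exact signature and logic Loop generates will depend on your task and data:

```python
def handler(input, output, expected):
    # Hypothetical exact-match scorer: full credit if the model's output
    # equals the expected value, no credit otherwise.
    return 1.0 if output == expected else 0.0
```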
Tune scorers based on target classification
Loop can take manually labeled target classifications from evaluations in the playground and adjust a scorer's classification behavior to match.
Select the rows where the scorer did not perform as expected, then select Tune scorer.
Select the desired classification, optionally provide additional instructions, and submit. Loop will adjust the scorer based on the provided context.
Run and assess evals
After your tasks, dataset, and scorers are set up, Loop can run an evaluation for you, analyze it, and suggest further improvements.
Analyze and interpret your experiments
Loop can read the results of your experiments, summarize them, and help you discover new insights.
Mode
By default, Loop will ask you for confirmation before executing certain tool calls, like running an evaluation. If you'd like Loop to run evaluations without confirmation, you can turn off this setting in the agent mode menu.
Continuous agent
In continuous agent mode, Loop will execute tools and make edit suggestions one after the other.