
# loadPrompt caching behavior in TypeScript SDK

export const plans_0 = "Any"

export const deployments_0 = "Any"

export const data_plane_version_0 = undefined

export const use_case_0 = "Use case - Applications calling loadPrompt with an environment parameter that expect caching to prevent outages or rate limit errors"

<Note>
  **Applies to:**

  * Plan - {plans_0}
  * Deployment - {deployments_0}
  * {data_plane_version_0}
  * {use_case_0}
</Note>


## Summary

**Goal:** Understand how `loadPrompt` caching works and when the SDK falls back to the local cache.

**Features:** Local prompt caching, network-first loading, environment/version-aware fallback behavior, configurable cache location and size.

## How `loadPrompt` caching works

On every call, the TypeScript SDK:

1. Calls the API over the network first (`/v1/prompt` for slug-based loads, `/v1/prompt/{id}` for id-based loads).
2. On success, parses the response and **writes** it to a local cache.
3. On failure, **reads** from the local cache as a fallback, but only when neither `version` nor `environment` was passed. Pinned calls surface the API error instead.

The cache is purely a **resilience mechanism**, not a request-deduplication mechanism. Every call goes to the network first, even when a cached copy exists; the local cache is read only when the API call throws.
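The three steps above can be sketched as follows. This is a minimal illustration of the network-first pattern, not the SDK's actual code: `PromptKey`, `Fetcher`, `loadNetworkFirst`, and the cache-key format are hypothetical names introduced here, and a plain `Map` stands in for the real cache.

```typescript
// Hypothetical types for illustration; these are not the SDK's own names.
type PromptKey = { slug: string; version?: string; environment?: string };
type Fetcher<T> = (key: PromptKey) => Promise<T>;

// Assumed cache-key format; the SDK's real key layout may differ.
function cacheKeyFor(key: PromptKey): string {
  return `${key.slug}@${key.version ?? key.environment ?? "latest"}`;
}

async function loadNetworkFirst<T>(
  key: PromptKey,
  fetchFromApi: Fetcher<T>,
  cache: Map<string, T>,
): Promise<T> {
  const pinned = key.version !== undefined || key.environment !== undefined;
  try {
    // Step 1: always try the network first, even if a cached copy exists.
    const prompt = await fetchFromApi(key);
    // Step 2: on success, write the result to the cache.
    cache.set(cacheKeyFor(key), prompt);
    return prompt;
  } catch (err) {
    // Step 3: read the cache only for unpinned loads; pinned loads rethrow.
    const cached = cache.get(cacheKeyFor(key));
    if (!pinned && cached !== undefined) {
      return cached;
    }
    throw err;
  }
}
```

Note that the cache is never consulted on the success path, which is why it does not reduce request volume or help with rate limits.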

> The cache is local to the machine running the SDK. It is not a server-side cache, and it is not shared between processes via Braintrust.

## Cache layout

The cache has two layers:

| Layer        | Lifetime                                                     | Notes                                                                      |
| ------------ | ------------------------------------------------------------ | -------------------------------------------------------------------------- |
| Memory (LRU) | Lives inside the SDK's `BraintrustState` object              | Wiped when the process exits. Not shared between processes.                |
| Disk         | Survives process restarts as long as the filesystem persists | One gzip-compressed JSON file per entry, named by a hash of the cache key. |
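To make the two layers concrete, here is a rough sketch of a memory-plus-disk cache with the properties in the table. It is an assumption-laden illustration rather than the SDK's implementation: the `TwoLayerCache` class, its method names, and the choice of SHA-256 for file names are all hypothetical, and disk-side eviction is omitted for brevity.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";
import { gunzipSync, gzipSync } from "node:zlib";

// Hypothetical two-layer cache; class and method names are illustrative only.
class TwoLayerCache<T> {
  private readonly memory = new Map<string, T>();
  private readonly dir: string;
  private readonly memoryMax: number;

  constructor(dir: string, memoryMax: number) {
    this.dir = dir;
    this.memoryMax = memoryMax;
    mkdirSync(dir, { recursive: true });
  }

  // One file per entry, named by a hash of the cache key (SHA-256 is an assumption).
  private pathFor(key: string): string {
    return join(this.dir, createHash("sha256").update(key).digest("hex"));
  }

  // Map keeps insertion order, so the first key is the least recently used.
  private touch(key: string, value: T): void {
    this.memory.delete(key);
    this.memory.set(key, value);
    if (this.memory.size > this.memoryMax) {
      const oldest = this.memory.keys().next().value;
      if (oldest !== undefined) this.memory.delete(oldest);
    }
  }

  set(key: string, value: T): void {
    this.touch(key, value);
    // Disk layer: gzip-compressed JSON that survives process restarts.
    writeFileSync(this.pathFor(key), gzipSync(JSON.stringify(value)));
  }

  get(key: string): T | undefined {
    const hit = this.memory.get(key);
    if (hit !== undefined) {
      this.touch(key, hit); // refresh recency
      return hit;
    }
    const file = this.pathFor(key);
    if (!existsSync(file)) return undefined;
    const value = JSON.parse(gunzipSync(readFileSync(file)).toString("utf8")) as T;
    this.touch(key, value); // promote the disk entry back into memory
    return value;
  }
}
```

A second `TwoLayerCache` instance pointed at the same directory can read entries written by the first, which mirrors how the disk layer outlives a single process while the memory layer does not.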

**Default disk location:** `~/.braintrust/prompt_cache/`

**Configuration (environment variables):**

* `BRAINTRUST_PROMPT_CACHE_DIR` — directory for the disk cache. Default `~/.braintrust/prompt_cache`.
* `BRAINTRUST_PROMPT_CACHE_MEMORY_MAX` — maximum entries in the in-memory LRU. Default `1024`.
* `BRAINTRUST_PROMPT_CACHE_DISK_MAX` — maximum entries on disk before LRU eviction by mtime. Default `1048576`.

The disk layer is automatically disabled in runtimes without a usable filesystem API (browsers, Cloudflare Workers, Vercel Edge, and similar sandboxes). In those environments, caching is memory-only and effectively scoped to a single request handler / cold-start lifetime.
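As an illustration of how the three environment variables and their documented defaults fit together, the sketch below resolves them into a single config object. `resolveCacheConfig` and `CacheConfig` are hypothetical names, and falling back to the default on a missing or non-numeric value is an assumption; the SDK's own parsing may behave differently.

```typescript
// Hypothetical helper for illustration; the SDK's own parsing may differ.
type CacheConfig = { dir: string; memoryMax: number; diskMax: number };

function resolveCacheConfig(
  env: Record<string, string | undefined>,
  homeDir: string,
): CacheConfig {
  // Assumption: missing or non-numeric values fall back to the documented default.
  const toInt = (raw: string | undefined, fallback: number): number => {
    const n = Number.parseInt(raw ?? "", 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    dir: env.BRAINTRUST_PROMPT_CACHE_DIR ?? `${homeDir}/.braintrust/prompt_cache`,
    memoryMax: toInt(env.BRAINTRUST_PROMPT_CACHE_MEMORY_MAX, 1024),
    diskMax: toInt(env.BRAINTRUST_PROMPT_CACHE_DISK_MAX, 1048576),
  };
}
```

In a Node.js process this could be called as `resolveCacheConfig(process.env, os.homedir())`. In a container with an ephemeral home directory, pointing `BRAINTRUST_PROMPT_CACHE_DIR` at a persistent volume is what lets the disk layer survive restarts.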

### Why pinned reads bypass the cache

When you pass `version` or `environment`, you are asserting *"give me exactly this revision."* If the API is unreachable, the SDK cannot verify that the cached entry still corresponds to that pin. For example, the environment may have been repointed to a different version on the server since the entry was cached. Rather than silently substituting a possibly different prompt, the SDK fails loudly and surfaces the API error to the caller.
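Because pinned loads surface the error, an application that needs to tolerate outages has to supply its own fallback. One possible shape for that is sketched below. `loadPinnedOrDefault` is a hypothetical wrapper, not part of the SDK; its `load` argument stands in for whatever pinned `loadPrompt` call the application makes, and `bundledDefault` is a prompt the application ships with itself.

```typescript
// Hypothetical wrapper; `load` stands in for a pinned loadPrompt call.
async function loadPinnedOrDefault<T>(
  load: () => Promise<T>,
  bundledDefault: T,
  onFailure: (err: unknown) => void = () => {},
): Promise<{ value: T; source: "api" | "bundled" }> {
  try {
    return { value: await load(), source: "api" };
  } catch (err) {
    // The SDK does not fall back for pinned loads, so the application decides here.
    onFailure(err);
    return { value: bundledDefault, source: "bundled" };
  }
}
```

Returning `source` alongside the value keeps the substitution explicit, so callers can log or alert when the bundled prompt is served instead of the pinned one.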
