GCS Brainstore rate limit errors and retries

Applies to:

Plan -
Deployment -

Summary

Issue: Self-hosted Brainstore pods restart or fail, and cascading rate limit failures follow, with 429 errors on .managed.json objects in GCS. Multiple retry attempts per item exceed the GCS write limit of 1 write per second per object.

Cause: Excessive retry attempts on individual objects trigger GCS rate limits, creating a cascading failure loop.

Resolution: Deploy the fix from commit SHA 40fda545aed99a12b1d04064c6205f8450d209ec, which prevents multiple retry attempts per item.
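
To make the cause concrete, here is a minimal Python sketch of a per-object write limit. It is a simplified model, not Brainstore's code and not GCS's actual throttling algorithm: it assumes a write is rejected whenever it arrives less than one second after the last accepted write to the same object. The function name and timestamps are hypothetical.

```python
WRITE_INTERVAL_S = 1.0  # assumed model of the 1 write/second per-object limit


def simulate_writes(attempt_times, interval=WRITE_INTERVAL_S):
    """Return (accepted, rejected) counts for write attempts to ONE object.

    attempt_times: timestamps (seconds) at which a write is attempted.
    An attempt counts as rejected (a 429 in practice) if it arrives less
    than `interval` seconds after the last accepted write.
    """
    accepted = 0
    rejected = 0
    last_ok = None
    for t in sorted(attempt_times):
        if last_ok is None or t - last_ok >= interval:
            accepted += 1
            last_ok = t
        else:
            rejected += 1
    return accepted, rejected


# One item retried five more times within the same second.
eager = simulate_writes([0.0, 0.1, 0.2, 0.3, 0.4, 0.5])
# The same object written at most once per second.
paced = simulate_writes([0.0, 1.0, 2.0])
print(eager)  # (1, 5)
print(paced)  # (3, 0)
```

In this model every extra retry inside the one-second window turns into another rejection, which is the loop the fix is meant to avoid.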

Resolution Steps

Deploy the fix using commit SHA

Step 1: Update Helm configuration

Set the image.tag value in your Helm values.yaml to the commit SHA that contains the fix. (The fix will also be included in a future official release.)

image:
    tag: "40fda545aed99a12b1d04064c6205f8450d209ec"

Step 2: Re-apply deployment

Apply the updated Helm configuration using Terraform or directly with helm upgrade.
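
If you apply the change directly with helm upgrade, the command could be assembled as in the sketch below. The release name, chart path, and namespace are placeholders (assumptions, not values from this article); substitute the ones used by your deployment. The snippet only builds and prints the command; it does not run it.

```python
import shlex


def helm_upgrade_command(release, chart, values_file, namespace):
    """Build the argv list for a `helm upgrade` using an updated values file."""
    return [
        "helm", "upgrade", release, chart,
        "--namespace", namespace,
        "--values", values_file,
    ]


# Placeholder names: replace with your own release, chart, and namespace.
cmd = helm_upgrade_command("my-release", "./chart", "values.yaml", "my-namespace")
print(shlex.join(cmd))
```

To execute the command rather than print it, pass the list to subprocess.run(cmd, check=True).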

Step 3: Monitor for rate limit errors

Check logs to verify 429 errors on .managed.json objects no longer appear.
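
One way to run this check is a small log filter such as the sketch below. It assumes only that an affected log line contains both the status code 429 and the string .managed.json; the exact Brainstore log format is not specified here, and the sample lines are invented for illustration.

```python
import re

# Match 429 as a standalone number, so values such as 1429 are ignored.
STATUS_429 = re.compile(r"\b429\b")


def find_rate_limit_errors(lines):
    """Return log lines that mention both a 429 status and a .managed.json object."""
    return [
        line.rstrip("\n")
        for line in lines
        if STATUS_429.search(line) and ".managed.json" in line
    ]


# Invented sample lines, for illustration only.
sample = [
    "ERROR status=429 object=segments/abc.managed.json\n",
    "INFO wrote object=segments/abc.managed.json\n",
    "ERROR status=429 object=other/file.bin\n",
]
hits = find_rate_limit_errors(sample)
print(len(hits))  # 1
```

After the fix is deployed, running the filter over recent pod logs should return an empty list.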

Alternative: Wait for official release

Step 1: Monitor changelog

The fix will be included in a future official data plane release. Watch the changelog for the release that contains it.

Step 2: Update to latest version

Once that release is available, update your image.tag from the commit SHA to the latest semantic version.
