Applies to:
Summary
Issue: Self-hosted brainstore pods restart or fail, followed by cascading rate limit failures with429 errors on .managed.json objects in GCS. Multiple retry attempts per item exhaust the GCS write limit of 1 write/second per object.
Cause: Excessive retry attempts on individual objects trigger GCS rate limits, creating a cascading failure loop.
Resolution: Deploy the fix from commit SHA 40fda545aed99a12b1d04064c6205f8450d209ec which prevents multiple per-item retry attempts.
Resolution Steps
Deploy the fix using commit SHA
Step 1: Update Helm configuration
Set theimage.tag value in your Helm values.yaml to the fix commit SHA where a fix has been applied (alternatively, this will be included in a future official release):
Step 2: Re-apply deployment
Apply the updated Helm configuration using Terraform or directly with helm upgrade.Step 3: Monitor for rate limit errors
Check logs to verify429 errors on .managed.json objects no longer appear.
Alternative: Wait for official release
Step 1: Monitor changelog
The fix will be included in a future official data plane release.Step 2: Update to latest version
Update yourimage.tag to the latest semantic version when available.