If you searched "anthropic api 529 error," there's a good chance you're staring at something like this in your terminal right now:
⎿ API Error (529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}})
· Retrying in 1 seconds… (attempt 1/10)
⎿ API Error (529 {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}})
· Retrying in 2 seconds… (attempt 2/10)
Before you touch your code, your API key, your network, or anything else — read this first. Almost every frustrated developer who spends an hour debugging this error is debugging the wrong thing.
What Is Anthropic's API 529 Error?
According to Anthropic's official API documentation, HTTP 529 means one thing exactly: overloaded_error — Anthropic's API is temporarily overloaded.
The full error shape is:
json
{
  "type": "error",
  "error": {
    "type": "overloaded_error",
    "message": "Overloaded"
  }
}
Newer versions also append a documentation link directly in the message:
json
{
  "type": "error",
  "error": {
    "type": "overloaded_error",
    "message": "Overloaded. https://docs.claude.com/en/api/errors"
  },
  "request_id": "req_011CZTPSK4MR5c9fkRz13TTk"
}
Note the request_id. Save it. If you ever need to escalate to Anthropic support, this is the single most useful piece of information you can provide.
The core meaning of 529 is this: Anthropic's infrastructure is temporarily out of capacity for the model you requested, across all users, regardless of your account tier or billing status. It is not about you. It is not about your code. It is not about your API key.
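The error shapes above can be parsed into the handful of fields worth logging. This is a minimal sketch that uses only the standard library; `parse_error_body` is a hypothetical helper name, and it assumes the body arrives as a JSON string in one of the two shapes shown above.

```python
import json


def parse_error_body(raw: str) -> dict:
    """Pull the loggable fields out of an Anthropic-style error body."""
    payload = json.loads(raw)
    error = payload.get("error") or {}
    return {
        "error_type": error.get("type"),
        "message": error.get("message"),
        # The older body shape has no request_id, so this may be None.
        "request_id": payload.get("request_id"),
    }


raw_body = (
    '{"type":"error",'
    '"error":{"type":"overloaded_error","message":"Overloaded"},'
    '"request_id":"req_011CZTPSK4MR5c9fkRz13TTk"}'
)
info = parse_error_body(raw_body)
print(info["error_type"], info["request_id"])
```

Logging the parsed `request_id` next to a timestamp keeps the escalation detail on hand without storing the whole response.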
The Real-World Developer Experience (From GitHub & the Community)
Before diving into causes and fixes, it helps to understand how widespread this is — and how consistently developers misdiagnose it.
"I never faced this for the past 4 months and suddenly started getting this." — GitHub issue #39784 on the anthropics/claude-code repo, March 2026
This is one of the most common reports: working fine for months, then 529 appears out of nowhere. The developer hadn't changed anything. Their code was identical. This pattern is the clearest signal that the problem is external — a sudden shift in platform-wide load, not something in your application.
"I am getting this error on a daily basis. It looks a whole lot like I am paying $200 dollars a month for you all not to upgrade your server stacks." — GitHub issue #4145, July 2025
This captures the real frustration: 529 hits hardest on paid, heavy users — the exact developers whose workflows depend on reliability. The anger is understandable, but the technical reality is that 529 is a capacity-protection mechanism, not neglect.
"API Error: 529 Overloaded despite low session usage — I use session 32% (1h27m), week 4% (4d3h)." — GitHub issue #61368, May 2026
This is the most instructive example. A developer was well within their usage limits, burning only a fraction of their quota, and still hitting 529. This definitively proves: 529 is not a quota error. Your remaining sessions and tokens are irrelevant to whether 529 appears.
A production SRE who documented the error in detail put it plainly: "A 429 means you are sending too much. A 529 means the model's capacity pool is too small right now. Two failure modes, two very different retry policies."
How 529 Fits Into Anthropic's Full Error System
Understanding 529 requires knowing where it sits relative to Anthropic's other HTTP errors:
| Code | Type | Who Owns It | What It Means |
| --- | --- | --- | --- |
| 400 | invalid_request_error | You | Request format or content problem |
| 401 | authentication_error | You | API key issue |
| 403 | permission_error | You | Key lacks permission for resource |
| 404 | not_found_error | You | Resource doesn't exist |
| 413 | request_too_large | You | Request body exceeds 32 MB |
| 429 | rate_limit_error | You | Exceeded your org's rate quota |
| 500 | api_error | Anthropic | Unexpected internal server failure |
| 529 | overloaded_error | Anthropic | Capacity saturated across all users |
The 4xx range is your responsibility. The 5xx range — including the non-standard 529 — is Anthropic's. Error handling logic that conflates these two categories will produce incorrect behavior: either retrying too aggressively on things you can't fix (client errors), or not retrying at all on things that are temporary and recoverable (server overload).
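One way to keep the two categories apart in code is a small classifier that maps a status code to an owner and a retry policy. This is an illustrative sketch rather than anything from the SDK: `classify_status` and the policy strings are hypothetical, and the retry decisions follow the split described above.

```python
def classify_status(status_code: int) -> dict:
    """Map an HTTP status code to who owns the failure and how to react."""
    if status_code == 429:
        # Your quota: retryable, but only after slowing down.
        return {"owner": "you", "retry": True, "policy": "honor retry-after, reduce rate"}
    if 400 <= status_code < 500:
        # Other client errors will fail the same way every time.
        return {"owner": "you", "retry": False, "policy": "fix the request"}
    if status_code == 529:
        # Checked before the generic 5xx branch so it gets its own policy.
        return {"owner": "anthropic", "retry": True, "policy": "exponential backoff with jitter"}
    if 500 <= status_code < 600:
        return {"owner": "anthropic", "retry": True, "policy": "short retry budget"}
    return {"owner": "unknown", "retry": False, "policy": "inspect manually"}


for code in (400, 429, 500, 529):
    print(code, classify_status(code))
```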
What Causes Anthropic API 529 Errors?
Global Traffic Spikes
Anthropic's API serves tens of thousands of developers simultaneously. When usage spikes globally — often correlated with new model releases, major AI news cycles, or product launches — capacity across the entire fleet becomes constrained. Anthropic's status page documents multiple such incidents across 2024–2026, affecting claude.ai, the API, the Console, and Claude Code simultaneously.
Real documented incidents include:
- July 2025: Elevated 529 errors specifically on Claude Sonnet 4, affecting api.anthropic.com and Claude Code
- November 2024: Two separate overload incidents within 48 hours, impacting Claude 3.5 Sonnet
- October 2024: Elevated error rates on Claude 3.5 Sonnet across the API, Claude.ai, and the Console, lasting roughly 70 minutes
These events aren't bugs — they're capacity moments that Anthropic detects, investigates, and resolves, usually within minutes to an hour.
Model-Specific Capacity Constraints
Capacity is tracked and enforced per model. Claude Opus 4, being the most computationally expensive model in the lineup, is far more susceptible to 529s than Claude Haiku, which has broader capacity headroom. When Opus hits its ceiling, Sonnet and Haiku may continue operating normally, which is why model fallback is a valid mitigation strategy.
Request Bursting From Your Own Application
If your app sends a sudden burst of parallel requests — even if each one is small — you can contribute to and personally experience 529 responses. This happens in CI/CD pipelines, batch processing scripts, and agentic workflows where many tool calls are triggered simultaneously. The API sees a spike and throttles accordingly.
As one community deep-dive summarized: "The 529 status code signals that the server is overwhelmed and cannot handle the incoming requests due to excessive load or insufficient resources. This can be triggered by traffic spikes, inadequate server resources, inefficient code, or the absence of proper load balancing."
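If bursts from your own application are a suspect, capping the number of requests in flight is a simple first step. The sketch below uses `asyncio.Semaphore` from the standard library; `run_bounded` is a hypothetical helper, `fake_api_call` stands in for a real API call, and the limit of 4 is an arbitrary illustration rather than a recommended value.

```python
import asyncio


async def run_bounded(factories, max_concurrency=4):
    """Run coroutine factories with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(factory):
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in factories))


# Demo: a stand-in call that records the peak number of concurrent requests.
stats = {"in_flight": 0, "peak": 0}


async def fake_api_call():
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.01)  # simulated network latency
    stats["in_flight"] -= 1
    return "ok"


results = asyncio.run(run_bounded([fake_api_call] * 20, max_concurrency=4))
print(len(results), "calls finished; peak concurrency:", stats["peak"])
```

The same pattern applies to CI jobs and agent loops: all 20 calls still complete, but never more than four at once.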
Third-Party Gateway Masking
If you're using a proxy, LLM gateway, or automation platform (Make, Zapier, n8n) between your code and the Anthropic API, you may see 529 translated into a generic failure or a different status code. Always test directly against api.anthropic.com before concluding the issue is in your integration layer.
How to Correctly Detect Anthropic's 529 in Code
Checking only the HTTP status code is fragile. The more reliable approach is to check both the status code and the error.type field, because a proxy or gateway in the request path can return its own 5xx response, and only the overloaded_error type in the body confirms that the overload was reported by Anthropic.
Python (using the official Anthropic SDK):
python
import anthropic

client = anthropic.Anthropic()

try:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    )
except anthropic.APIStatusError as e:
    if e.status_code == 529:
        # This is specifically an overload event
        print(f"Anthropic API overloaded. Request ID: {e.request_id}")
        print("Check status.anthropic.com for active incidents.")
    elif e.status_code == 429:
        # This is a rate limit: different cause, different fix
        print("Rate limit hit. Read retry-after header.")
    elif e.status_code >= 500:
        # Other server-side errors
        print(f"Anthropic server error: {e.status_code}")
    else:
        # Client-side error: something in your request
        raise
TypeScript:
typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function callAPI() {
  try {
    return await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      messages: [{ role: "user", content: "Hello" }],
    });
  } catch (error: any) {
    if (error.status === 529) {
      // Platform-level overload, not your fault
      console.error(`Overloaded. Request ID: ${error.headers?.["request-id"]}`);
    } else if (error.status === 429) {
      // Your account's rate limit
      const retryAfter = error.headers?.["retry-after"];
      console.error(`Rate limited. Retry after: ${retryAfter}s`);
    } else {
      throw error;
    }
  }
}
The most important detection nuance: If your requests-remaining rate limit header shows a non-zero value and you're still getting 529, this definitively confirms the issue is server-side capacity — not your quota. Stop looking at your account settings.
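The "check both" rule from this section can be isolated into a helper that takes the status code and the decoded body. This is a sketch under assumptions: `is_overloaded` is a hypothetical name, and it assumes you can obtain the decoded JSON body as a dict (in the Python SDK this is believed to be available as `e.body` on the exception, which is worth confirming against your installed version).

```python
def is_overloaded(status_code, body) -> bool:
    """True only when the HTTP status and the body's error.type agree."""
    if status_code != 529 or not isinstance(body, dict):
        return False
    error = body.get("error")
    return isinstance(error, dict) and error.get("type") == "overloaded_error"


overloaded_body = {
    "type": "error",
    "error": {"type": "overloaded_error", "message": "Overloaded"},
}
print(is_overloaded(529, overloaded_body))     # status and type agree
print(is_overloaded(529, {"error": "bad gw"})) # 529 from something else
```

A 529 whose body lacks the expected type is worth logging separately, since it may have been produced by an intermediary rather than by Anthropic.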
Reading the Response Headers During a 529
When you receive a 529, the response headers carry diagnostic information that most developers ignore:
anthropic-ratelimit-requests-limit: 4000
anthropic-ratelimit-requests-remaining: 3987
anthropic-ratelimit-requests-reset: 2025-11-03T12:34:56Z
anthropic-ratelimit-tokens-limit: 400000
anthropic-ratelimit-tokens-remaining: 398420
retry-after: 12
If requests-remaining is near zero: it might be a quota issue crossing into 529 territory. Slow down.
If requests-remaining is large (close to your limit): the problem is infrastructure capacity, not your quota. No account action will help.
If retry-after is present: use it literally. Many retry libraries ignore this header and fall back to their own delay logic — which may be far too short.
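These three checks can be turned into a small function that reads the headers and reports the likely cause. This is a sketch with assumptions stated up front: `interpret_headers` is a hypothetical name, the headers are assumed to be a plain dict with lowercase keys, `retry-after` is assumed to be a number of seconds, and the 5% threshold for "near zero" is an arbitrary illustration.

```python
def interpret_headers(headers: dict, low_quota_fraction: float = 0.05) -> dict:
    """Guess whether a failure looks like quota exhaustion or platform capacity."""
    limit = headers.get("anthropic-ratelimit-requests-limit")
    remaining = headers.get("anthropic-ratelimit-requests-remaining")
    retry_after = headers.get("retry-after")

    fraction_left = None
    if limit is not None and remaining is not None and int(limit) > 0:
        fraction_left = int(remaining) / int(limit)

    if fraction_left is None:
        likely_cause = "unknown"
    elif fraction_left <= low_quota_fraction:
        likely_cause = "quota"      # almost nothing left: slow down
    else:
        likely_cause = "capacity"   # plenty left: the platform is the bottleneck

    return {
        "fraction_left": fraction_left,
        "likely_cause": likely_cause,
        "retry_after_seconds": float(retry_after) if retry_after is not None else None,
    }


sample = {
    "anthropic-ratelimit-requests-limit": "4000",
    "anthropic-ratelimit-requests-remaining": "3987",
    "retry-after": "12",
}
diagnosis = interpret_headers(sample)
print(diagnosis["likely_cause"], diagnosis["retry_after_seconds"])
```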
Checking Anthropic's Status Page (Do This First, Every Time)
Go to status.anthropic.com before anything else.
The status page shows real-time and historical component health for:
- claude.ai — the web chat interface
- api.anthropic.com — the API endpoint your code calls
- console.anthropic.com — the developer console
- Claude Code — the CLI tool
If there's an active incident marked as "Investigating" or "Identified," that's your answer. No code change will fix a platform-wide outage. The page also shows the resolution timeline for past incidents, which gives you a realistic sense of how long to expect to wait.
A key pattern from documented incidents: most 529 events resolve within 20–70 minutes once Anthropic identifies the issue. In the July 2025 Sonnet 4 incident, for example, Anthropic moved from "Investigating" to "Monitoring" to "Resolved" in under 45 minutes.
The Correct Retry Strategy for Anthropic 529 Errors
Once you've confirmed there's no active incident (or decided to keep retrying through one), here's the retry logic that actually works:
The pattern: exponential backoff with jitter
python
import random
import time

import anthropic


def anthropic_call_with_backoff(client, max_retries=5, **kwargs):
    """
    Retry Anthropic API calls on 529 overloaded_error.
    Uses exponential backoff with jitter to avoid thundering herd.
    """
    for attempt in range(max_retries):
        try:
            return client.messages.create(**kwargs)
        except anthropic.APIStatusError as e:
            if e.status_code != 529:
                raise  # Don't retry non-overload errors this way
            if attempt == max_retries - 1:
                raise  # Final attempt: surface the error
            # Exponential backoff: 1s, 2s, 4s, 8s... capped at 60s
            base_delay = min(60, 1 * (2 ** attempt))
            # Jitter adds up to 75% on top of the base delay
            jitter = random.uniform(0, 0.75 * base_delay)
            wait_time = base_delay + jitter
            # Log the request ID alongside the attempt for the audit trail
            print(
                f"Anthropic 529 overloaded (request ID: {e.request_id}). "
                f"Attempt {attempt + 1}/{max_retries}. "
                f"Waiting {wait_time:.1f}s before retry."
            )
            time.sleep(wait_time)
The rules this implements:
- Only retry 529 errors — not 4xx client errors, not 500 internal errors (retry those separately with shorter budgets)
- Start small (1 second), double each attempt, cap at 60 seconds
- Add random jitter to prevent every failed client from retrying at the same instant
- Set a hard retry ceiling — don't retry indefinitely or you'll burn your entire compute budget waiting for an outage to end
- Log the attempt number and request ID so you have an audit trail
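The rules above can also be expressed as a pure delay calculation, which is easier to unit test than a loop that sleeps. The sketch below reuses the 1-second base, 60-second cap, and 0.75 jitter factor from the retry function; the `retry_after` floor and the `budget_seconds` ceiling are illustrative additions, and `next_delay` and `plan_delays` are hypothetical names.

```python
import random


def next_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based)."""
    backoff = min(cap, base * (2 ** attempt))
    delay = backoff + rng() * 0.75 * backoff  # jitter of up to 75% of the base
    if retry_after is not None:
        # Never retry sooner than the server asked for.
        delay = max(delay, retry_after)
    return delay


def plan_delays(max_retries, budget_seconds, rng=random.random):
    """List the waits that fit inside a total time budget."""
    delays, total = [], 0.0
    for attempt in range(max_retries):
        delay = next_delay(attempt, rng=rng)
        if total + delay > budget_seconds:
            break  # hard ceiling: stop instead of waiting out the outage
        delays.append(delay)
        total += delay
    return delays


# With jitter disabled, the schedule is the plain doubling sequence.
no_jitter = lambda: 0.0
print(plan_delays(max_retries=10, budget_seconds=20, rng=no_jitter))
```

Note that the official Python SDK also retries some failures on its own through the client's `max_retries` option, so a hand-written loop stacks on top of those attempts unless that option is lowered.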
Model Fallback: The Production-Grade Mitigation
For applications where continuity matters more than consistent model quality, implementing a model fallback chain is the most resilient approach. When your primary model returns 529, automatically retry the same request against a smaller, less-loaded model.
python
import time

import anthropic

FALLBACK_CHAIN = [
    "claude-opus-4-6",            # Primary: most capable, most load
    "claude-sonnet-4-6",          # First fallback: strong balance
    "claude-haiku-4-5-20251001",  # Last resort: fastest, most capacity
]


def call_with_model_fallback(client, prompt: str):
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return client.messages.create(
                model=model,
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}],
            )
        except anthropic.APIStatusError as e:
            if e.status_code == 529:
                last_error = e
                print(f"{model} is overloaded, trying next model in chain...")
                time.sleep(1)  # Brief pause before trying next model
                continue
            raise  # Non-529 errors shouldn't trigger model fallback
    raise last_error  # All models exhausted
Important caveats:
- Never silently downgrade models in quality-sensitive tasks (legal analysis, medical content, creative work with specific quality requirements)
- Log every fallback so you can audit how often it happens and which models bear the load
- If you're using this in production, your users should ideally know they may get Sonnet responses instead of Opus during high-load periods
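The caveats above suggest two additions to a fallback chain: an audit log and an opt-out for quality-sensitive calls. The following is a sketch of that idea, not the SDK's behavior: `Overloaded` is a stand-in exception for a 529, `send` is any callable that takes a model name, and `call_with_audited_fallback` and `allow_downgrade` are hypothetical names.

```python
class Overloaded(Exception):
    """Stand-in for a 529 response (hypothetical, not an SDK class)."""


def call_with_audited_fallback(send, models, allow_downgrade=True, log=None):
    """Try models in order, recording every attempt in `log`."""
    log = log if log is not None else []
    # Quality-sensitive callers pass allow_downgrade=False to pin the primary.
    candidates = models if allow_downgrade else models[:1]
    for model in candidates:
        try:
            response = send(model)
        except Overloaded:
            log.append({"model": model, "outcome": "overloaded"})
            continue
        log.append({"model": model, "outcome": "served"})
        return {
            "response": response,
            "served_by": model,
            "downgraded": model != models[0],  # surface this to the user
        }
    raise Overloaded(f"all candidate models overloaded: {candidates}")


MODELS = ["claude-opus-4-6", "claude-sonnet-4-6"]


def fake_send(model):
    # Simulate the primary model being overloaded.
    if model == "claude-opus-4-6":
        raise Overloaded(model)
    return f"reply from {model}"


audit_log = []
result = call_with_audited_fallback(fake_send, MODELS, log=audit_log)
print(result["served_by"], result["downgraded"], audit_log)
```

Returning `served_by` and `downgraded` with the response lets the caller decide whether to tell the user which model answered.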
Anthropic 529 in Context: How Often Does It Actually Happen?
Based on Anthropic's public status page history, documented incidents affecting the API occur roughly once to several times per month, with most resolving within an hour. The frequency has increased somewhat as the platform has grown and newer models (particularly Opus 4) have driven higher per-request compute loads.
The pattern matters for how you architect your application:
- Infrequent, short incidents (under 30 minutes): Simple exponential backoff handles this without user impact
- Longer incidents (30–90 minutes): Queuing with retry persistence, model fallback, or user-facing status messaging becomes necessary
- Concurrent model-specific incidents: Fallback chains handle this well, since typically not all models are affected simultaneously
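For the longer incidents in the list above, "queuing with retry persistence" can be as simple as writing pending jobs to disk so they survive a restart. This is a deliberately small sketch using a JSON file; `PersistentRetryQueue` is a hypothetical class, `send` is any callable that returns True on success and False on overload, and a real system would more likely use a database or a message broker.

```python
import json
import os
import tempfile


class PersistentRetryQueue:
    """Keep pending jobs in a JSON file so they survive a process restart."""

    def __init__(self, path):
        self.path = path
        self.jobs = []
        if os.path.exists(path):
            with open(path) as f:
                self.jobs = json.load(f)

    def _save(self):
        with open(self.path, "w") as f:
            json.dump(self.jobs, f)

    def enqueue(self, job):
        self.jobs.append(job)
        self._save()

    def drain(self, send):
        """Try each job once; keep the ones that still hit overload."""
        still_pending = [job for job in self.jobs if not send(job)]
        self.jobs = still_pending
        self._save()
        return len(still_pending)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pending.json")
    queue = PersistentRetryQueue(path)
    queue.enqueue({"prompt": "first"})
    queue.enqueue({"prompt": "second"})

    # Simulate a restart: a new instance reloads the jobs from disk.
    reloaded = PersistentRetryQueue(path)
    reloaded_count = len(reloaded.jobs)

    # Pretend only the first job succeeds on this pass.
    remaining = reloaded.drain(lambda job: job["prompt"] == "first")
    survivors = PersistentRetryQueue(path).jobs

print(reloaded_count, remaining, survivors)
```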
For enterprise teams running high-volume workloads, Claude models are also available through Amazon Bedrock and Google Cloud's Vertex AI. These routes add a layer of provider redundancy and may draw on different capacity pools than Anthropic's direct API.
What Anthropic 529 Is NOT
Given how much time developers waste chasing the wrong cause, it's worth being explicit:
529 is NOT caused by:
- Your API key being invalid or revoked (that's 401)
- Your account being suspended (that's 403)
- Exceeding your rate limit (that's 429)
- Your request being too large (that's 413)
- A bug in your code or prompt
- Your internet connection or local network
- A problem with your Claude Code installation
- VPN or firewall settings
- An outdated or broken SDK or CLI installation
Changing any of these will not resolve a 529, because 529 originates at Anthropic's infrastructure layer, past all of these variables.
Quick Diagnostics: Is Your 529 Platform-Wide or Self-Inflicted?
Run this two-step test:
Step 1: Open status.anthropic.com. If there's an active incident, wait. You're done.
Step 2: If status shows green, send a single minimal request using curl (bypassing all your application code):
bash
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-haiku-4-5-20251001",
    "max_tokens": 10,
    "messages": [{"role": "user", "content": "Hi"}]
  }'
- If this succeeds: The problem is your application's traffic volume or concurrency. Reduce burst size and add backoff.
- If this also returns 529: The platform is under load even for small requests. Wait and retry after a few minutes.
This two-step test takes under 60 seconds and tells you definitively where the problem lives.
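The decision logic of the two-step test fits in a few lines, which can be handy in a runbook script. In this sketch, `diagnose` is a hypothetical function; it assumes you have already checked the status page by hand and captured the HTTP status of the curl probe.

```python
def diagnose(active_incident: bool, probe_status: int) -> str:
    """Combine the status-page check and the curl probe into one verdict."""
    if active_incident:
        return "platform incident: wait for resolution"
    if probe_status == 200:
        return "self-inflicted: reduce burst size and add backoff"
    if probe_status == 529:
        return "platform under load: wait a few minutes and retry"
    return f"different problem: investigate HTTP {probe_status}"


print(diagnose(active_incident=False, probe_status=200))
print(diagnose(active_incident=False, probe_status=529))
```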
Summary
The Anthropic API 529 overloaded_error is a temporary, server-side capacity event — not an account, code, or network problem. It's one of the most commonly misdiagnosed errors in the Anthropic ecosystem, because developers reflexively look inward (at their code, keys, and billing) when the cause is entirely external.
The correct response is:
- Check status.anthropic.com first
- Detect it correctly in code by checking both status code and error type
- Read rate limit headers to confirm it's not a quota issue masquerading as overload
- Retry with exponential backoff and jitter — not tight loops
- Fall back to smaller models when continuity matters more than consistency
- Log request_id values for every 529 occurrence
For most developers, most of the time, the fix is simply waiting a few minutes. The architecture work — proper retry logic, model fallback, request queuing — is what separates a production system from one that pages you at 3 AM when Sonnet has a bad afternoon.