When traffic spikes hit your app, you basically have two choices: slow down gracefully or crash in style. Load shedding and backpressure are the two patterns that stop things from blowing up. Most teams ignore them… until the first outage.
Let’s break them down in a practical, dev-friendly way.
Why systems fall over under load
Every app has limits: CPU, RAM, DB connections, thread pools, queue sizes. When requests come in faster than you can process them, you get:
- Latency spikes
- Timeout storms
- Cascading failures
- DB overload and connection exhaustion
- Full-on meltdown
Load shedding and backpressure help you stay in control instead of drowning.
What is Load Shedding?
Load shedding = dropping some requests on purpose when the system is overloaded.
It sounds harsh, but it’s way better than letting everything fail.
Common strategies
- Reject early (HTTP 503, gRPC UNAVAILABLE)
- Rate-limit per user/IP
- Shed non-critical traffic (search suggestions, analytics, email sending)
- Timeout fast for slow downstream calls
- Use circuit breakers to avoid calling services that are already on fire
In practice, this isolates the important stuff: checkout flows, core APIs, auth, etc.
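Here's a rough sketch of what "reject early and protect the golden paths" can look like as Express middleware. Everything here is illustrative: the event-loop-lag threshold, the route prefixes, and the Retry-After value are stand-ins you'd tune for your own service.

```typescript
import express, { Request, Response, NextFunction } from "express";
import { monitorEventLoopDelay } from "node:perf_hooks";

const app = express();

// Event-loop lag is one cheap overload signal; the 200ms threshold is made up.
const lag = monitorEventLoopDelay({ resolution: 20 });
lag.enable();
setInterval(() => lag.reset(), 5_000); // track recent load, not the whole process lifetime

const OVERLOADED_LAG_MS = 200;
const CRITICAL_PREFIXES = ["/checkout", "/auth"]; // golden paths, never shed

function shedLoad(req: Request, res: Response, next: NextFunction) {
  const lagMs = lag.mean / 1e6; // histogram reports nanoseconds
  const isCritical = CRITICAL_PREFIXES.some((p) => req.path.startsWith(p));

  // Overloaded: drop non-critical requests early instead of queueing them.
  if (lagMs > OVERLOADED_LAG_MS && !isCritical) {
    res.status(503).set("Retry-After", "2").send("Shedding load, try again shortly");
    return;
  }
  next();
}

app.use(shedLoad);
app.get("/checkout", (_req, res) => res.send("always served"));
app.get("/search/suggestions", (_req, res) => res.send("shed first under load"));
app.listen(3000);
```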
When to use it
- Traffic spikes (campaigns, holidays, bot attacks)
- Limited resources (small DB pools, expensive external APIs)
- You must protect “golden paths”
Load shedding is defensive. You’re choosing which requests die so the rest can live.
What is Backpressure?
Backpressure = telling the caller to slow down when you can’t keep up.
Think of it as a polite “bro, chill” message from your service.
How it works
- Your service notices its queues/buffers are growing.
- It stops accepting new work or responds with a “retry later” signal.
- Callers back off instead of hammering you harder.
This prevents runaway queue growth and latency explosions.
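In HTTP land, the "retry later" signal is usually a 429 (or 503) with a Retry-After header. A toy sketch, assuming an Express endpoint and an in-memory queue that a worker drains elsewhere; the queue depth and retry hint are made-up numbers.

```typescript
import express from "express";

const app = express();

// Toy in-memory job buffer; in a real service this is a worker pool or broker.
const queue: string[] = [];
const MAX_QUEUE_DEPTH = 1000; // illustrative threshold, tune per service

app.post("/jobs", (_req, res) => {
  // Queue is backing up: tell the caller to slow down instead of accepting more work.
  if (queue.length >= MAX_QUEUE_DEPTH) {
    res.status(429).set("Retry-After", "5").send("Queue full, retry later");
    return;
  }
  queue.push(`job-${Date.now()}`); // a worker would drain this elsewhere
  res.status(202).send("accepted");
});

app.listen(3000);
```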
Examples in real systems
- TCP has built-in backpressure via flow control.
- Kafka slows producers when brokers lag.
- Node.js streams: pause() and the 'drain' event kick in when buffers fill.
- gRPC applies backpressure via stream flow control.
Backpressure works best when both sides “speak the same protocol” and respect the signal.
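The Node.js version of that signal is easy to see in code. A minimal sketch (file paths are placeholders): write() returns false when the writable buffer is full, so you stop pushing and wait for 'drain'.

```typescript
import { createReadStream, createWriteStream } from "node:fs";
import { once } from "node:events";

// Copy a file while respecting the writable stream's buffer.
async function copyWithBackpressure(src: string, dest: string) {
  const input = createReadStream(src);
  const output = createWriteStream(dest);

  for await (const chunk of input) {
    // write() returns false when the internal buffer is full...
    if (!output.write(chunk)) {
      // ...so we stop pushing and wait for 'drain' before continuing.
      await once(output, "drain");
    }
  }
  output.end();
}
```

In real code, stream.pipeline() (or .pipe()) wires this up for you; the manual loop just makes the signal visible.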
Load Shedding vs Backpressure
| Concept | Purpose | Behavior |
|---|---|---|
| Load Shedding | Protect the system by dropping requests | “I can’t do this. Goodbye.” |
| Backpressure | Slow the sender before overload happens | “Wait a sec, I need to catch up.” |
Load shedding is reactive. Backpressure is proactive.
Most production systems use both.
Practical patterns you should actually implement
1. Incoming request queue limits
Never let your server accept infinite in-flight requests. Cap it based on CPU/threads.
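A minimal sketch with plain node:http, assuming the cap is derived from CPU count; the multipliers are illustrative and the real numbers come from load testing.

```typescript
import http from "node:http";
import os from "node:os";

// Rough cap tied to available parallelism (illustrative, not a recommendation).
const MAX_IN_FLIGHT = os.cpus().length * 4;
let inFlight = 0;

const server = http.createServer((_req, res) => {
  // Over the cap: reject instead of letting work pile up in memory.
  if (inFlight >= MAX_IN_FLIGHT) {
    res.writeHead(503, { "Retry-After": "1" });
    res.end("overloaded");
    return;
  }

  inFlight++;
  res.once("close", () => inFlight--); // fires when the response is done or the socket dies

  // ...actual request handling goes here...
  res.end("ok");
});

// Connections beyond this count are dropped by Node before they reach the handler.
server.maxConnections = MAX_IN_FLIGHT * 2;
server.listen(3000);
```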
2. Timeout every external call
A slow dependency is worse than a dead one.
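For example, with fetch and AbortSignal.timeout (the URL and the 300ms budget are made up):

```typescript
// Hypothetical downstream call with a hard time budget.
async function getRecommendations(userId: string): Promise<unknown[]> {
  try {
    const res = await fetch(`https://recs.internal/users/${userId}`, {
      signal: AbortSignal.timeout(300), // fail fast instead of waiting on a slow dependency
    });
    if (!res.ok) return []; // degraded but alive
    return (await res.json()) as unknown[];
  } catch {
    // Timeout or network error: fall back instead of holding the request open.
    return [];
  }
}
```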
3. Connection pool limits
Protect your DB by rejecting requests before the pool starves.
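With node-postgres that might look like this; the pool size, timeouts, and queue-depth threshold are placeholders.

```typescript
import { Pool } from "pg";

const pool = new Pool({
  max: 20,                      // hard cap on DB connections from this instance
  connectionTimeoutMillis: 500, // fail fast if no connection frees up, don't queue forever
  idleTimeoutMillis: 30_000,
});

export async function getUser(id: string) {
  // waitingCount = callers already queued for a connection; shed before it grows unbounded.
  if (pool.waitingCount > 50) {
    throw new Error("DB pool saturated, shedding request");
  }
  const { rows } = await pool.query("SELECT * FROM users WHERE id = $1", [id]);
  return rows[0];
}
```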
4. Rate limiting
Use token bucket or leaky bucket. Rate-limit per user/session if needed.
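A bare-bones in-memory token bucket, one per user (the capacity and refill rate are illustrative):

```typescript
// Minimal token bucket: refill continuously, spend one token per request.
class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
  }

  tryConsume(): boolean {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;

    if (this.tokens < 1) return false; // over the limit: reject (or queue) this request
    this.tokens -= 1;
    return true;
  }
}

const buckets = new Map<string, TokenBucket>();

function allowRequest(userId: string): boolean {
  let bucket = buckets.get(userId);
  if (!bucket) {
    bucket = new TokenBucket(10, 5); // burst of 10, refills 5 tokens/sec
    buckets.set(userId, bucket);
  }
  return bucket.tryConsume();
}
```

For anything multi-instance you'd back this with Redis or your gateway's built-in limiter instead of process memory.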
5. Circuit breakers
Stop retry storms. If something is failing, stop calling it for a bit.
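A stripped-down breaker to show the shape of it; production-grade ones add rolling windows, half-open probing, and metrics.

```typescript
// Opens after N consecutive failures, then fails fast until a cool-down elapses.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(
    private readonly maxFailures = 5,      // illustrative threshold
    private readonly resetAfterMs = 10_000 // illustrative cool-down
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    const isOpen = this.failures >= this.maxFailures;
    const coolingDown = Date.now() - this.openedAt < this.resetAfterMs;

    // Open circuit: fail immediately instead of hammering a dying dependency.
    if (isOpen && coolingDown) throw new Error("circuit open");

    try {
      const result = await fn();
      this.failures = 0; // success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.maxFailures) this.openedAt = Date.now();
      throw err;
    }
  }
}

const paymentsBreaker = new CircuitBreaker();
// usage: await paymentsBreaker.call(() => callPaymentsService());
```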
6. Queue-based workloads
If you’re processing heavy jobs, buffer them in a message broker and let backpressure regulate the producers.
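A sketch of the idea with a bounded in-memory buffer where the producer awaits when it's full; real broker clients expose similar signals through their own buffering and acknowledgements.

```typescript
// Bounded queue: put() parks the producer until a consumer makes room.
class BoundedQueue<T> {
  private items: T[] = [];
  private notFull: Array<() => void> = [];

  constructor(private readonly capacity: number) {}

  async put(item: T): Promise<void> {
    while (this.items.length >= this.capacity) {
      // Backpressure: producer waits here instead of growing the queue unbounded.
      await new Promise<void>((resolve) => this.notFull.push(resolve));
    }
    this.items.push(item);
  }

  take(): T | undefined {
    const item = this.items.shift();
    const waiter = this.notFull.shift();
    if (waiter) waiter(); // wake one parked producer
    return item;
  }
}
```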
What engineers usually get wrong
- Trying to “optimize” their way out of overload instead of limiting it
- Allowing unbounded queues (hello memory leak)
- Adding retries without backoff (DDoS'ing themselves; see the backoff sketch after this list)
- Treating all traffic as equally important
- Assuming autoscaling solves everything (it doesn’t, especially with DB bottlenecks)
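The fix for the retry mistake above is exponential backoff with jitter, something like this (attempt count and base delay are illustrative):

```typescript
// Retries with exponential backoff and jitter, so a blip doesn't become a self-inflicted DDoS.
async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts) throw err;
      const backoffMs = 2 ** attempt * 100;       // 200, 400, 800...
      const jitterMs = Math.random() * backoffMs; // spread retries out over time
      await new Promise((r) => setTimeout(r, backoffMs + jitterMs));
    }
  }
}
```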
Final Thoughts
Load shedding and backpressure are boring until they’re not. They’re the difference between:
“We handled the spike.” and “We were down for 45 minutes and nobody knows why.”
If you want predictable, stable performance under real-world traffic, build these safety valves early. Your future self will thank you.
Album of the day: