Caching is supposed to make systems faster. Add Redis, add memory, add a layer — done, right?
Not always.
In real systems, bad caching makes things worse, not better: sometimes slower, sometimes more fragile, sometimes both. This post is about when and why that happens.
Cache Is Not a Free Optimization
Caching trades:
- CPU / IO → Memory
- Freshness → Speed
- Simplicity → Complexity
If you don’t need that trade, caching is pure overhead.
Before caching, ask one question:
What bottleneck am I actually removing?
If the answer isn’t clear, stop.
1. Cache Misses Are Slower Than No Cache
A cache miss often means:
- Request hits cache
- Cache lookup fails
- Fallback to database
- Cache population
- Return response
That’s extra network hops + serialization + locking.
If your data:
- Has low reuse
- Is request-specific
- Changes frequently
You pay the cache cost on every request and almost never get a hit.
Result: slower than direct DB access.
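To make the cost concrete, here is a minimal cache-aside read in Python, assuming the redis-py client and a placeholder fetch_from_db function. On a miss, every one of the steps above runs before the caller sees anything:

```python
import json

import redis  # assumes the redis-py client is available

r = redis.Redis()

def get_user(user_id, fetch_from_db):
    key = f"user:{user_id}"

    # Steps 1-2: cache lookup (a network round trip even when it misses)
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # Step 3: fall back to the query you were trying to avoid
    row = fetch_from_db(user_id)  # placeholder for the real DB call

    # Step 4: populate the cache (another round trip plus serialization)
    r.set(key, json.dumps(row), ex=300)

    # Step 5: finally return the response
    return row
```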
2. Hot Keys Create Bottlenecks
Some keys are accessed far more than others:
- “homepage”
- “user:me”
- “config”
- “feature_flags”
Effects:
- Single Redis shard melts
- Lock contention
- CPU spikes
- Tail latency explodes
This is worse than a DB index scan because:
- DBs are built for contention
- Caches often aren’t
Caching concentrates load. Databases spread it.
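One common mitigation is a tiny, short-TTL in-process cache in front of the shared cache for known hot keys, so most reads never leave the process. A rough sketch, where fetch_from_redis is a placeholder for however you read the shared cache:

```python
import time

_local = {}          # key -> (value, expires_at)
_LOCAL_TTL = 2.0     # seconds; short enough to keep staleness bounded

def get_hot_key(key, fetch_from_redis):
    now = time.monotonic()
    entry = _local.get(key)
    if entry is not None and entry[1] > now:
        return entry[0]                      # served from process memory, no network
    value = fetch_from_redis(key)            # only a trickle of calls reaches Redis
    _local[key] = (value, now + _LOCAL_TTL)
    return value
```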
3. Cache Stampede (Thundering Herd)
Classic failure mode:
- Cache entry expires
- 1,000 requests arrive at once
- All miss cache
- All hit database
- Database dies
Congratulations, your cache caused an outage.
Fixes exist (locks, request coalescing, stale-while-revalidate), but:
If you don’t design for stampedes, caching increases blast radius.
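For illustration, here is one minimal version of the lock approach using redis-py: only one caller recomputes while the others back off briefly. Key names and timings are illustrative, not a hardened recipe:

```python
import json
import time

import redis

r = redis.Redis()

def get_with_stampede_guard(key, ttl, recompute):
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # Only one caller wins the lock and recomputes; SET NX is atomic.
    if r.set(f"lock:{key}", "1", nx=True, ex=10):
        try:
            value = recompute()                       # the expensive work, done once
            r.set(key, json.dumps(value), ex=ttl)
            return value
        finally:
            r.delete(f"lock:{key}")

    # Everyone else backs off and retries instead of piling onto the database.
    time.sleep(0.05)
    cached = r.get(key)
    return json.loads(cached) if cached is not None else recompute()
```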
4. Invalidation Is Hard (And Slow)
Everyone knows the quote:
There are only two hard things in Computer Science: cache invalidation…
What’s often missed:
- Invalidation is work
- Work costs time
- Time costs latency
Common problems:
- Over-invalidating → cache useless
- Under-invalidating → stale data bugs
- Distributed invalidation → network storms
If correctness matters, cache logic can be more expensive than recomputing.
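A sketch of what "invalidation is work" looks like in practice: every write path now has to know every key derived from the data it touched. Helper names and key layouts here are hypothetical:

```python
import redis

r = redis.Redis()

def update_profile(db, user_id, fields):
    db.update_user(user_id, fields)          # hypothetical write helper

    # Miss one of these and readers see stale data (under-invalidation).
    # Wipe too broadly and the cache never warms up (over-invalidation).
    r.delete(f"user:{user_id}")              # the row itself
    r.delete(f"profile_page:{user_id}")      # every derived view needs tracking too
    r.delete(f"team_roster:{fields.get('team_id', '')}")  # and cross-entity views
```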
5. Serialization Overhead Eats the Gains
Caching doesn’t give you free reads.
You pay for:
- JSON / MsgPack / Protobuf encode
- Decode on read
- Memory allocation
- GC pressure
For:
- Small objects
- Cheap queries
- Indexed lookups
Serialization cost can exceed query cost.
This is especially true in Go, Node.js, and Python.
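A quick way to see this is to time the serialization alone, with no network involved. A toy measurement with a made-up payload; compare the numbers to your own indexed query latency:

```python
import json
import timeit

payload = {"id": 42, "name": "ada", "roles": ["admin", "dev"], "active": True}

encode_s = timeit.timeit(lambda: json.dumps(payload), number=100_000)
encoded = json.dumps(payload)
decode_s = timeit.timeit(lambda: json.loads(encoded), number=100_000)

# Total seconds for 100,000 iterations -> microseconds per operation
print(f"encode: {encode_s * 10:.2f} µs/op, decode: {decode_s * 10:.2f} µs/op")
# For a sub-millisecond primary-key lookup, encode + decode + the extra
# network hop can come surprisingly close to the cost of the query itself.
```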
6. Cache Miss Latency Affects Tail Latency
Even if:
- 95% of requests are fast
- Cache hit ratio looks “good”
The slowest 5% define:
- P95
- P99
- User perception
Caches increase latency variance:
- Hits are very fast
- Misses are much slower
Users feel spikes, not averages.
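A toy simulation makes the point: even with a 90% hit ratio, the tail belongs to the misses. The numbers below are invented for illustration, not measurements:

```python
import random

# 90% hits around 1 ms, 10% misses around 50 ms
samples = [
    random.gauss(1, 0.2) if random.random() < 0.9 else random.gauss(50, 5)
    for _ in range(100_000)
]
samples.sort()

def percentile(p):
    return samples[int(len(samples) * p) - 1]

print(f"mean ~ {sum(samples) / len(samples):.1f} ms")   # looks healthy (~6 ms)
print(f"p95  ~ {percentile(0.95):.1f} ms")               # already in miss territory
print(f"p99  ~ {percentile(0.99):.1f} ms")
```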
7. Wrong Cache Layer Choice
Caching at the wrong layer hurts:
- Caching DB rows instead of computed results
- Caching API responses instead of internal calls
- Caching per-user data globally
You end up with:
- Low hit ratio
- High memory usage
- Complex invalidation
Good cache rule:
Cache the most expensive thing you can safely recompute.
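In code, the difference between caching the inputs and caching the result looks like this. cache.get_or_set, db.fetch_events, and aggregate are all placeholders for whatever you actually use:

```python
def dashboard_stats_cached_at_the_wrong_layer(cache, db, org_id):
    # Caches the raw rows: big values, fiddly invalidation, and every
    # reader still pays for the aggregation.
    rows = cache.get_or_set(f"rows:{org_id}", lambda: db.fetch_events(org_id))
    return aggregate(rows)

def dashboard_stats_cached_at_the_right_layer(cache, db, org_id):
    # Caches the expensive, recomputable result: small value, one key,
    # and the work runs only on a miss.
    return cache.get_or_set(
        f"stats:{org_id}",
        lambda: aggregate(db.fetch_events(org_id)),
    )
```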
8. Memory Pressure Causes System-Wide Slowdowns
Caches grow. Memory fills. Kernel starts reclaiming. GC works harder. Latency jumps.
Worst part? You don’t see it in CPU metrics.
Symptoms:
- Random slow requests
- Sudden timeouts
- “Nothing changed” incidents
Caching without memory limits is a slow-motion failure.
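The fix is boring: cap the cache. In Redis that means setting maxmemory and an eviction policy such as allkeys-lru; in-process, it means a bounded structure instead of a dict that grows forever. A minimal in-process example:

```python
from functools import lru_cache

def expensive_render(fragment_id: str) -> str:
    # Stand-in for whatever costly work you are actually caching.
    return f"<div>{fragment_id}</div>"

# Past 10,000 entries the least recently used ones are evicted,
# instead of the cache quietly eating the process's memory.
@lru_cache(maxsize=10_000)
def render_fragment(fragment_id: str) -> str:
    return expensive_render(fragment_id)
```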
When Caching Actually Helps
Caching is great when:
- Reads heavily outweigh writes
- Data changes slowly
- Reuse is high
- Latency matters more than freshness
- You can tolerate stale data
Examples:
- Auth tokens
- Feature flags
- Config
- Rate limits
- Aggregated results
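Feature flags and config are the textbook case: read on every request, written rarely, and a few seconds of staleness is harmless. A short-TTL wrapper is often all you need; fetch_flags is a placeholder for wherever the flags actually live:

```python
import time

_flags = {"value": None, "expires_at": 0.0}

def get_flags(fetch_flags, ttl=30.0):
    now = time.monotonic()
    if _flags["value"] is None or now >= _flags["expires_at"]:
        _flags["value"] = fetch_flags()      # DB, API, file: whatever the source is
        _flags["expires_at"] = now + ttl
    return _flags["value"]
```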
Final Rule of Thumb
If you don’t know why you’re caching, don’t cache.
Measure first. Cache second. Design for failure. And remember: Caching is an optimization, not a requirement.
Sometimes the fastest system is the one with fewer layers.