Caching is supposed to make systems faster. Add Redis, add memory, add a layer — done, right?
Not always.
In real systems, bad caching makes things worse, not better: sometimes slower, sometimes more fragile, sometimes both. This post is about when and why that happens.
Cache Is Not a Free Optimization
Caching trades:
- CPU / IO → Memory
- Freshness → Speed
- Simplicity → Complexity
If you don’t need that trade, caching is pure overhead.
Before caching, ask one question:
What bottleneck am I actually removing?
If the answer isn’t clear, stop.
1. Cache Misses Are Slower Than No Cache
A cache miss often means:
- Request hits cache
- Cache lookup fails
- Fallback to database
- Cache population
- Return response
That’s extra network hops + serialization + locking.
If your data:
- Has low reuse
- Is request-specific
- Changes frequently
You pay the cache cost on every request and almost never get a hit.
Result: slower than direct DB access.
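To make the cost concrete, here is a minimal cache-aside read in Python, assuming the redis-py client and a placeholder fetch_from_db function. On a miss, every one of the steps above runs before the caller sees anything:

```python
import json

import redis  # assumes the redis-py client is available

r = redis.Redis()

def get_user(user_id, fetch_from_db):
    key = f"user:{user_id}"

    # Steps 1-2: cache lookup (a network round trip even when it misses)
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # Step 3: fall back to the query you were trying to avoid
    row = fetch_from_db(user_id)  # placeholder for the real DB call

    # Step 4: populate the cache (another round trip plus serialization)
    r.set(key, json.dumps(row), ex=300)

    # Step 5: finally return the response
    return row
```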
2. Hot Keys Create Bottlenecks
Some keys are accessed far more than others:
- “homepage”
- “user:me”
- “config”
- “feature_flags”
Effects:
- Single Redis shard melts
- Lock contention
- CPU spikes
- Tail latency explodes
This is worse than a DB index scan because:
- DBs are built for contention
- Caches often aren’t
Caching concentrates load. Databases spread it.
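One common mitigation is a tiny, short-TTL in-process cache in front of the shared cache for known hot keys, so most reads never leave the process. A rough sketch, where fetch_from_redis is a placeholder for however you read the shared cache:

```python
import time

_local = {}          # key -> (value, expires_at)
_LOCAL_TTL = 2.0     # seconds; short enough to keep staleness bounded

def get_hot_key(key, fetch_from_redis):
    now = time.monotonic()
    entry = _local.get(key)
    if entry is not None and entry[1] > now:
        return entry[0]                      # served from process memory, no network
    value = fetch_from_redis(key)            # only a trickle of calls reaches Redis
    _local[key] = (value, now + _LOCAL_TTL)
    return value
```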
3. Cache Stampede (Thundering Herd)
Classic failure mode:
- Cache entry expires
- 1,000 requests arrive at once
- All miss cache
- All hit database
- Database dies
Congratulations, your cache caused an outage.
Fixes exist (locks, request coalescing, stale-while-revalidate), but:
If you don’t design for stampedes, caching increases blast radius.
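For illustration, here is one minimal version of the lock approach using redis-py: only one caller recomputes while the others back off briefly. Key names and timings are illustrative, not a hardened recipe:

```python
import json
import time

import redis

r = redis.Redis()

def get_with_stampede_guard(key, ttl, recompute):
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)

    # Only one caller wins the lock and recomputes; SET NX is atomic.
    if r.set(f"lock:{key}", "1", nx=True, ex=10):
        try:
            value = recompute()                       # the expensive work, done once
            r.set(key, json.dumps(value), ex=ttl)
            return value
        finally:
            r.delete(f"lock:{key}")

    # Everyone else backs off and retries instead of piling onto the database.
    time.sleep(0.05)
    cached = r.get(key)
    return json.loads(cached) if cached is not None else recompute()
```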
4. Invalidation Is Hard (And Slow)
Everyone knows the quote:
There are only two hard things in Computer Science: cache invalidation…
What’s often missed:
- Invalidation is work
- Work costs time
- Time costs latency
Common problems:
- Over-invalidating → cache useless
- Under-invalidating → stale data bugs
- Distributed invalidation → network storms
If correctness matters, cache logic can be more expensive than recomputing.
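A sketch of what "invalidation is work" looks like in practice: every write path now has to know every key derived from the data it touched. Helper names and key layouts here are hypothetical:

```python
import redis

r = redis.Redis()

def update_profile(db, user_id, fields):
    db.update_user(user_id, fields)          # hypothetical write helper

    # Miss one of these and readers see stale data (under-invalidation).
    # Wipe too broadly and the cache never warms up (over-invalidation).
    r.delete(f"user:{user_id}")              # the row itself
    r.delete(f"profile_page:{user_id}")      # every derived view needs tracking too
    r.delete(f"team_roster:{fields.get('team_id', '')}")  # and cross-entity views
```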
5. Serialization Overhead Eats the Gains
Caching doesn’t give you free reads.
You pay for:
- JSON / MsgPack / Protobuf encode
- Decode on read
- Memory allocation
- GC pressure
For:
- Small objects
- Cheap queries
- Indexed lookups
Serialization cost can exceed query cost.
This is especially true in Go, Node.js, and Python.
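A quick way to see this is to time the serialization alone, with no network involved. A toy measurement with a made-up payload; compare the numbers to your own indexed query latency:

```python
import json
import timeit

payload = {"id": 42, "name": "ada", "roles": ["admin", "dev"], "active": True}

encode_s = timeit.timeit(lambda: json.dumps(payload), number=100_000)
encoded = json.dumps(payload)
decode_s = timeit.timeit(lambda: json.loads(encoded), number=100_000)

# Total seconds for 100,000 iterations -> microseconds per operation
print(f"encode: {encode_s * 10:.2f} µs/op, decode: {decode_s * 10:.2f} µs/op")
# For a sub-millisecond primary-key lookup, encode + decode + the extra
# network hop can come surprisingly close to the cost of the query itself.
```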
6. Cache Miss Latency Affects Tail Latency
Even if:
- 95% of requests are fast
- Cache hit ratio looks “good”
The slowest 5% define:
- P95
- P99
- User perception
Caches increase latency variance:
- Hits are very fast
- Misses are much slower
Users feel spikes, not averages.
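A toy simulation makes the point: even with a 90% hit ratio, the tail belongs to the misses. The numbers below are invented for illustration, not measurements:

```python
import random

# 90% hits around 1 ms, 10% misses around 50 ms
samples = [
    random.gauss(1, 0.2) if random.random() < 0.9 else random.gauss(50, 5)
    for _ in range(100_000)
]
samples.sort()

def percentile(p):
    return samples[int(len(samples) * p) - 1]

print(f"mean ~ {sum(samples) / len(samples):.1f} ms")   # looks healthy (~6 ms)
print(f"p95  ~ {percentile(0.95):.1f} ms")               # already in miss territory
print(f"p99  ~ {percentile(0.99):.1f} ms")
```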
7. Wrong Cache Layer Choice
Caching at the wrong layer hurts:
- Caching DB rows instead of computed results
- Caching API responses instead of internal calls
- Caching per-user data globally
You end up with:
- Low hit ratio
- High memory usage
- Complex invalidation
Good cache rule:
Cache the most expensive thing you can safely recompute.
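In code, the difference between caching the inputs and caching the result looks like this. cache.get_or_set, db.fetch_events, and aggregate are all placeholders for whatever you actually use:

```python
def dashboard_stats_cached_at_the_wrong_layer(cache, db, org_id):
    # Caches the raw rows: big values, fiddly invalidation, and every
    # reader still pays for the aggregation.
    rows = cache.get_or_set(f"rows:{org_id}", lambda: db.fetch_events(org_id))
    return aggregate(rows)

def dashboard_stats_cached_at_the_right_layer(cache, db, org_id):
    # Caches the expensive, recomputable result: small value, one key,
    # and the work runs only on a miss.
    return cache.get_or_set(
        f"stats:{org_id}",
        lambda: aggregate(db.fetch_events(org_id)),
    )
```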
8. Memory Pressure Causes System-Wide Slowdowns
Caches grow. Memory fills. Kernel starts reclaiming. GC works harder. Latency jumps.
Worst part? You don’t see it in CPU metrics.
Symptoms:
- Random slow requests
- Sudden timeouts
- “Nothing changed” incidents
Caching without memory limits is a slow-motion failure.
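The fix is boring: cap the cache. In Redis that means setting maxmemory and an eviction policy such as allkeys-lru; in-process, it means a bounded structure instead of a dict that grows forever. A minimal in-process example:

```python
from functools import lru_cache

def expensive_render(fragment_id: str) -> str:
    # Stand-in for whatever costly work you are actually caching.
    return f"<div>{fragment_id}</div>"

# Past 10,000 entries the least recently used ones are evicted,
# instead of the cache quietly eating the process's memory.
@lru_cache(maxsize=10_000)
def render_fragment(fragment_id: str) -> str:
    return expensive_render(fragment_id)
```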
When Caching Actually Helps
Caching is great when:
- Reads heavily outweigh writes
- Data changes slowly
- Reuse is high
- Latency matters more than freshness
- You can tolerate stale data
Examples:
- Auth tokens
- Feature flags
- Config
- Rate limits
- Aggregated results
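Feature flags and config are the textbook case: read on every request, written rarely, and a few seconds of staleness is harmless. A short-TTL wrapper is often all you need; fetch_flags is a placeholder for wherever the flags actually live:

```python
import time

_flags = {"value": None, "expires_at": 0.0}

def get_flags(fetch_flags, ttl=30.0):
    now = time.monotonic()
    if _flags["value"] is None or now >= _flags["expires_at"]:
        _flags["value"] = fetch_flags()      # DB, API, file: whatever the source is
        _flags["expires_at"] = now + ttl
    return _flags["value"]
```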
Final Rule of Thumb
If you don’t know why you’re caching, don’t cache.
Measure first. Cache second. Design for failure. And remember: Caching is an optimization, not a requirement.
Sometimes the fastest system is the one with fewer layers.