When Caching Makes Performance Worse

Common pitfalls like cache stampedes, hot keys, and wrong invalidation strategies that slow down your system.

Caching is supposed to make systems faster. Add Redis, add memory, add a layer — done, right?

Not always.

In real systems, bad caching makes performance worse, not better. Sometimes slower, sometimes more fragile, sometimes both. This post is about when and why that happens.


Cache Is Not a Free Optimization

Caching trades:

  • CPU / IO ↔ Memory
  • Freshness ↔ Speed
  • Simplicity ↔ Complexity

If you don’t need that trade, caching is pure overhead.

Before caching, ask one question:

What bottleneck am I actually removing?

If the answer isn’t clear, stop.


1. Cache Misses Are Slower Than No Cache

A cache miss often means:

  1. Request hits cache
  2. Cache lookup fails
  3. Fallback to database
  4. Cache population
  5. Return response

That’s extra network hops + serialization + locking.
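To make the cost concrete, here is a minimal sketch of that read-through path, assuming a redis-py client; `query_db` is a hypothetical database helper:

```python
import json
import redis

r = redis.Redis()  # assumes a Redis instance on localhost

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)                  # hop 1: cache lookup
    if cached is not None:
        return json.loads(cached)        # decode cost on every hit
    row = query_db(user_id)              # hop 2: fallback to the database (hypothetical)
    r.set(key, json.dumps(row), ex=300)  # hop 3: cache population
    return row                           # a miss paid for all three hops
```

On a workload with low reuse, almost every call takes the three-hop path.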

If your data:

  • Has low reuse
  • Is request-specific
  • Changes frequently

You pay the cache cost on every request and get almost no hits.

Result: slower than direct DB access.


2. Hot Keys Create Bottlenecks

Some keys are accessed far more than others:

  • “homepage”
  • “user:me”
  • “config”
  • “feature_flags”

Effects:

  • Single Redis shard melts
  • Lock contention
  • CPU spikes
  • Tail latency explodes

This is worse than a DB index scan because:

  • DBs are built for contention
  • Caches often aren’t

Caching concentrates load. Databases spread it.
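One standard mitigation (not covered in this post) is replicating the hot key under suffixed names so reads spread across shards. A minimal sketch with redis-py, where the replica count and key scheme are illustrative:

```python
import random
import redis

r = redis.Redis()
REPLICAS = 8  # illustrative replica count

def read_hot(key: str) -> bytes | None:
    # Each read picks one replica at random, spreading load across shards.
    return r.get(f"{key}:{random.randrange(REPLICAS)}")

def write_hot(key: str, value: bytes, ttl: int = 60) -> None:
    # Every write must fan out to all replicas: read spread costs write work.
    for i in range(REPLICAS):
        r.set(f"{key}:{i}", value, ex=ttl)
```

Note the trade: reads spread out, but writes multiply. That extra complexity is the point of this section.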


3. Cache Stampede (Thundering Herd)

Classic failure mode:

  1. Cache entry expires
  2. 1,000 requests arrive at once
  3. All miss cache
  4. All hit database
  5. Database dies

Congratulations, your cache caused an outage.

Fixes exist (locks, request coalescing, stale-while-revalidate), but:

If you don’t design for stampedes, caching increases blast radius.
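For illustration, a minimal per-key lock sketch using Redis SET NX via redis-py; `recompute` is a hypothetical expensive rebuild:

```python
import json
import time
import redis

r = redis.Redis()

def get_with_lock(key: str, ttl: int = 300, retries: int = 20) -> dict:
    for _ in range(retries):
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        # SET NX: exactly one caller wins the right to rebuild the entry.
        if r.set(f"{key}:lock", "1", nx=True, ex=10):
            try:
                value = recompute(key)  # hypothetical expensive rebuild
                r.set(key, json.dumps(value), ex=ttl)
                return value
            finally:
                r.delete(f"{key}:lock")
        time.sleep(0.05)  # losers back off instead of stampeding the database
    return recompute(key)  # after too many retries, fall through to the source
```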


4. Invalidation Is Hard (And Slow)

Everyone knows the quote:

There are only two hard things in Computer Science: cache invalidation…

What’s often missed:

  • Invalidation is work
  • Work costs time
  • Time costs latency

Common problems:

  • Over-invalidating → cache useless
  • Under-invalidating → stale data bugs
  • Distributed invalidation → network storms

If correctness matters, cache logic can be more expensive than recomputing.
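Even the simplest strategy, deleting dependent keys on every write, adds work to the hot write path. A minimal cache-aside sketch, assuming redis-py; `write_db` and the key names are hypothetical:

```python
import redis

r = redis.Redis()

def update_profile(user_id: int, fields: dict) -> None:
    write_db(user_id, fields)  # hypothetical database write
    # Invalidation is work: every dependent key must be known and deleted,
    # and the next reader pays a full miss for each one.
    r.delete(f"user:{user_id}", f"profile_page:{user_id}")
```

Miss one dependent key and you ship stale data; delete too many and the cache never warms up.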


5. Serialization Overhead Eats the Gains

Caching doesn’t give you free reads.

You pay for:

  • JSON / MsgPack / Protobuf encode
  • Decode on read
  • Memory allocation
  • GC pressure

For:

  • Small objects
  • Cheap queries
  • Indexed lookups

Serialization cost can exceed query cost.

This is especially true in Go, Node.js, and Python.
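A quick way to check is to time the round trip yourself. A self-contained micro-benchmark sketch (the sample row is made up):

```python
import json
import timeit

row = {"id": 42, "name": "ada", "flags": list(range(50))}

encode = timeit.timeit(lambda: json.dumps(row), number=100_000)
both = timeit.timeit(lambda: json.loads(json.dumps(row)), number=100_000)
print(f"encode: {encode:.3f}s  encode+decode: {both:.3f}s per 100k ops")
```

If an indexed primary-key lookup returns in tens of microseconds, the encode/decode round trip can already cost more than the query it replaces.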


6. Cache Miss Latency Affects Tail Latency

Even if:

  • 95% of requests are fast
  • Cache hit ratio looks “good”

The slowest 5% define:

  • P95
  • P99
  • User perception

Caches increase latency variance:

  • Hits are very fast
  • Misses are much slower

Users feel spikes, not averages.
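A worked example with made-up numbers shows how misses own the tail:

```python
# Hypothetical workload: 95% of requests hit the cache at 1 ms,
# 5% miss and take 50 ms (DB fallback + cache population).
hit_ratio, hit_ms, miss_ms = 0.95, 1.0, 50.0

mean = hit_ratio * hit_ms + (1 - hit_ratio) * miss_ms
print(f"mean latency: {mean:.2f} ms")  # 3.45 ms -- the average looks healthy

# P99 falls inside the slowest 5%, so P99 equals the full miss latency.
print(f"P99 latency: {miss_ms:.2f} ms")  # 50 ms -- this is what users feel
```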


7. Wrong Cache Layer Choice

Caching at the wrong layer hurts:

  • Caching DB rows instead of computed results
  • Caching API responses instead of internal calls
  • Caching per-user data globally

You end up with:

  • Low hit ratio
  • High memory usage
  • Complex invalidation

Good cache rule:

Cache the most expensive thing you can safely recompute.
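A minimal sketch of that rule with redis-py: cache the computed aggregate rather than the raw rows behind it; `aggregate_orders` is a hypothetical expensive query:

```python
import json
import redis

r = redis.Redis()

def homepage_stats() -> dict:
    cached = r.get("stats:homepage")
    if cached is not None:
        return json.loads(cached)
    # Cache the expensive, widely shared aggregate -- not the thousands of
    # cheap raw rows behind it, which would bloat memory for a low hit ratio.
    stats = aggregate_orders()  # hypothetical expensive aggregation query
    r.set("stats:homepage", json.dumps(stats), ex=60)
    return stats
```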


8. Memory Pressure Causes System-Wide Slowdowns

Caches grow. Memory fills. Kernel starts reclaiming. GC works harder. Latency jumps.

Worst part? You don’t see it in CPU metrics.

Symptoms:

  • Random slow requests
  • Sudden timeouts
  • “Nothing changed” incidents

Caching without memory limits is a slow failure.
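If you do cache, bound it. A minimal guardrail sketch, assuming redis-py and a Redis instance you control:

```python
import redis

r = redis.Redis()

# Cap cache memory and evict least-recently-used keys under pressure,
# instead of letting the cache grow until the kernel starts reclaiming.
r.config_set("maxmemory", "2gb")
r.config_set("maxmemory-policy", "allkeys-lru")
```

And give every key a TTL; unbounded keys are how “nothing changed” incidents begin.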


When Caching Actually Helps

Caching is great when:

  • Reads heavily outweigh writes
  • Data changes slowly
  • Reuse is high
  • Latency matters more than freshness
  • You can tolerate stale data

Examples:

  • Auth tokens
  • Feature flags
  • Config
  • Rate limits
  • Aggregated results

Final Rule of Thumb

If you don’t know why you’re caching, don’t cache.

Measure first. Cache second. Design for failure. And remember: Caching is an optimization, not a requirement.

Sometimes the fastest system is the one with fewer layers.

