Caching Strategies
Purpose
Reference for caching patterns, eviction policies, invalidation strategies, and distributed caching considerations. Caching is one of the highest-leverage performance tools but also a common source of subtle bugs when invalidation is wrong.
Architecture
Cache-Aside (Lazy Loading)
The most common pattern. Application code manages the cache explicitly.
def get_user(user_id: int) -> dict:
key = f"user:{user_id}"
cached = redis.get(key)
if cached:
return json.loads(cached) # cache hit
user = db.query("SELECT * FROM users WHERE id = %s", user_id) # cache miss
redis.setex(key, 300, json.dumps(user)) # store with 5-min TTL
return user
def update_user(user_id: int, data: dict):
db.execute("UPDATE users SET ... WHERE id = %s", user_id)
redis.delete(f"user:{user_id}") # invalidate on writeCharacteristics:
- Cache is only populated on demand — no wasted cache space for cold data.
- First request after expiry is always slow (cache miss penalty).
- Application code must handle cache-miss path; adds complexity.
- Risk: stale data if write path forgets to invalidate.
Read-Through
Cache sits between application and database; cache fetches from DB on miss automatically. Common in ORM-level caches (Hibernate second-level cache, ActiveRecord cache stores).
App → Cache → DB (on miss: Cache fetches from DB, stores, returns)
App → Cache (on hit: Cache returns directly)
- Transparent to application; cache logic centralised.
- First-time fetch is still slow (same cold-start as cache-aside).
- Cache must understand the data source — tighter coupling.
Write-Through
Every write goes to cache AND database synchronously before acknowledging the client.
def update_product(product_id, price):
db.execute("UPDATE products SET price=%s WHERE id=%s", price, product_id)
redis.setex(f"product:{product_id}:price", 3600, price) # write to cache too
return "ok"- Cache is always consistent with DB at the cost of write latency (two writes).
- Works best when read/write ratio is high and cache hit rate matters.
- Wastes memory: items are cached even if never read again.
Write-Back (Write-Behind)
Application writes to cache only; a background process asynchronously flushes to DB.
App → Cache (fast ack)
Background worker → DB (async, e.g., every 500ms or on eviction)
- Lowest write latency of all patterns.
- Risk: data loss if cache crashes before flush — acceptable for analytics, not for financial transactions.
- Useful for high-frequency counters (page views, likes) — batch flush to DB.
Write-Around
Writes go directly to DB, bypassing cache. Cache is populated only on subsequent reads.
- Good when written data is unlikely to be read soon (bulk imports, log data).
- Avoids cache pollution but guarantees cache miss on first read.
Implementation Notes
Eviction Policies
When the cache is full, Redis must evict entries. Set via maxmemory-policy in redis.conf.
| Policy | Behaviour | Best for |
|---|---|---|
noeviction | Return error on write when full | Never use in cache role |
allkeys-lru | Evict least-recently-used across all keys | General-purpose cache |
volatile-lru | LRU among keys with TTL set | Mixed TTL / permanent data |
allkeys-lfu | Evict least-frequently-used across all keys | Skewed access distribution |
volatile-lfu | LFU among keys with TTL | Mixed workloads |
volatile-ttl | Evict key closest to expiry | Prefer removing stale data first |
allkeys-random | Random eviction | Rarely useful |
# redis.conf
maxmemory 2gb
maxmemory-policy allkeys-lru
LRU vs LFU:
- LRU is better when access recency predicts future access (most web caches).
- LFU is better when certain items are accessed frequently regardless of recency (viral content, popular products).
TTL as an eviction strategy:
# Always set a TTL — prevents unbounded growth and acts as a safety net
# for stale data even if explicit invalidation is missed
redis.setex(key, ttl_seconds, value)Cache Invalidation Strategies
“There are only two hard things in computer science: cache invalidation and naming things.”
1. TTL-based (time-to-live):
Simplest. Set an expiry; accept bounded staleness. Good for data that changes slowly
or where slightly stale data is acceptable (product catalog, exchange rates).
2. Event-driven invalidation:
Write path publishes an event; cache subscribers delete or update affected keys.
# Publisher (in write path)
redis.publish("invalidate:product", product_id)
# Subscriber (cache invalidation worker)
p = redis.pubsub()
p.subscribe("invalidate:product")
for msg in p.listen():
redis.delete(f"product:{msg['data']}")3. Versioned keys (cache busting):
Embed a version in the key. Changing the version effectively invalidates all old keys.
CACHE_VERSION = "v3"
def cache_key(resource_id):
return f"{CACHE_VERSION}:product:{resource_id}"Old keys are left to expire via TTL. Simple, no invalidation coordination needed. Drawback: old keys accumulate until TTL expires — monitor memory.
4. Write-through + delete-on-write (most reliable):
Update DB, then DEL cache key. Next read repopulates. Simpler than updating the
cached value (avoids race between concurrent writers).
Thundering Herd Problem
When a popular cached item expires, many concurrent requests miss the cache simultaneously and all hit the database at once.
Solution 1: Mutex / distributed lock
import redis_lock
def get_expensive(key):
val = redis.get(key)
if val:
return val
with redis_lock.Lock(redis, f"lock:{key}", expire=5):
# Double-check after acquiring lock
val = redis.get(key)
if val:
return val
val = expensive_db_query()
redis.setex(key, 300, val)
return valSolution 2: Probabilistic Early Expiration (PER)
Jitter the effective TTL so different instances re-fetch at slightly different times.
import math, random, time
def fetch_with_per(key, ttl, beta=1.0):
result = redis.get(key) # returns (value, remaining_ttl) if stored with expiry
if result is None:
return recompute_and_store(key, ttl)
value, remaining = result
# Recompute early with probability proportional to recomputation time
if -beta * recompute_time * math.log(random.random()) >= remaining:
return recompute_and_store(key, ttl)
return valueSolution 3: Stale-while-revalidate
Serve stale data while refreshing asynchronously in the background.
Common in HTTP caches (Cache-Control: stale-while-revalidate=60).
Redis Data Structures Quick Reference
# String — scalar values, counters, serialised JSON
redis.set("key", "value", ex=60)
redis.incr("counter")
redis.incrby("counter", 5)
# Hash — object fields; memory-efficient for small objects
redis.hset("user:1", mapping={"name": "Ada", "score": 100})
redis.hincrby("user:1", "score", 10)
redis.hgetall("user:1")
# List — queues, stacks, recent activity log
redis.rpush("queue", "task1", "task2") # enqueue
redis.blpop("queue", timeout=0) # blocking dequeue (worker)
redis.lrange("feed:user:1", 0, 49) # last 50 feed items
# Set — tags, unique visitors, deduplication
redis.sadd("online_users", "user:1", "user:2")
redis.scard("online_users") # count
redis.srandmember("online_users", 3) # random sample
# Sorted Set — leaderboards, rate limiting, priority queues
redis.zadd("scores", {"alice": 9800.0, "bob": 9200.0})
redis.zrevrangebyscore("scores", "+inf", "-inf", start=0, num=10)
redis.zincrby("scores", 100, "alice")
# Stream — append-only event log; consumer groups
redis.xadd("events:clicks", {"user_id": "42", "page": "/home"})
redis.xreadgroup("mygroup", "worker-1", {"events:clicks": ">"}, count=100)Distributed Caching Considerations
Consistent Hashing:
Partitions key space across nodes; adding/removing a node only remaps ~1/N keys.
Libraries: uhashring (Python), native in Redis Cluster.
Redis Cluster:
- Automatically shards across 16384 hash slots.
- Multi-key operations (MGET, pipeline) must target the same slot — use hash tags
{user:42}. - No cross-slot transactions.
Read replicas:
redis.StrictRedis(host="replica-1") # direct replica reads for non-critical paths
Cache stampede across services: In microservice architectures, different services may cache the same backing data. Use a shared cache tier (Redis) and a consistent invalidation event bus (Kafka, Redis Pub/Sub).
Cache warming:
# Pre-populate cache before traffic hits — prevents cold-start stampede
def warm_cache():
popular_ids = db.query("SELECT id FROM products ORDER BY views DESC LIMIT 1000")
for pid in popular_ids:
get_product(pid) # normal read-through warms cache as side effectTrade-offs
| Strategy | Consistency | Write Latency | Complexity |
|---|---|---|---|
| Cache-aside | Eventual | Normal | Low |
| Read-through | Eventual | Normal | Low–Medium |
| Write-through | Strong | Higher (+cache write) | Medium |
| Write-back | Weak (data loss risk) | Lowest | High |
| Write-around | Eventual | Normal | Low |
Key decisions:
- TTL length: short TTL → more DB load, less stale data; long TTL → inverse.
- In-process vs external cache: in-process (e.g.,
functools.lru_cache) is fastest but not shared across instances; Redis is shared but adds network latency (~0.5 ms). - Serialisation format: MessagePack / protobuf over JSON for high-throughput caches (faster ser/de, smaller payloads).
References
- Redis documentation — Data types
- Redis documentation — Eviction policies
- Thundering Herd — optimal probabilistic cache stampede prevention
- Designing Data-Intensive Applications — Martin Kleppmann (Ch. 5)
- AWS ElastiCache Caching Strategies