← System Design Data & Messaging Systems
System Design

Read/Write Patterns at Scale

Scale read capacity by routing queries to read replicas while managing replication lag expectations.

TL;DR
  • Scale read capacity by routing queries to read replicas while managing replication lag expectations.
  • Absorb massive write spikes using write-behind caching backed by durable message queues.
  • Mitigate hot-key bottlenecks by implementing cache leases or local memory caching.
  • Guarantee read-your-writes consistency by routing post-write reads to the primary database.

The Problem

As application traffic scales, databases become the ultimate bottleneck. Simply adding more application servers does nothing to solve database CPU exhaustion caused by millions of read queries, nor does it prevent write lock contention when thousands of users update the same database rows simultaneously.

Furthermore, naive caching strategies introduce race conditions, stale data, and cache stampedes that crash downstream databases the moment a popular cache key expires.

Core System Idea

Scaling reads and writes requires decoupling the data access patterns using specialized architectural patterns.

For read-heavy workloads, we implement Read Replicas. Writes go to a single primary database, which asynchronously replicates changes to multiple read-only instances. To prevent users from seeing stale data immediately after they make an update (due to replication lag), we use "read-your-writes" consistency routing: reads occurring within a short window after a write are routed directly to the primary database, while all other reads go to replicas.

For write-heavy workloads, we use Write-Behind (Write-Back) Caching. The application writes directly to a high-speed cache layer and immediately returns success to the user. An asynchronous worker pool then pulls these write events from a queue and batches them into the primary database.

To handle "hot keys" (highly popular items like viral posts), we use Cache Leases (or single-flight mutexes) to ensure only one application thread queries the database to populate the cache, while other threads wait for the cached result.

System Flow

flowchart TD A["Client"] -->|"Write Request"| B["Primary DB"] B -->|"Async Replication"| C["Read Replica"] A -->|"Read within 5s of Write"| B A -->|"Standard Read Request"| C D["Client Write Spike"] -->|"Write"| E["Write-Behind Cache"] E -->|"Enqueue Mutation"| F["Durable Queue"] F -->|"Batch Write"| B

Read-replica routing with write-behind caching to decouple read and write paths under high load.

Real-World Examples Indicative

Twitter hybrid fan-out — write for normal users, read for celebrities

Twitter's home timeline uses fan-out-on-write for accounts with fewer than ~20M followers: when a user tweets, the tweet ID is pre-written to each follower's Redis timeline cache. For high-follower accounts (>20M), Twitter switches to fan-out-on-read: the celebrity's tweets are not pre-fanned; instead, a timeline read merges the pre-computed cache with a real-time pull from the celebrity's tweet list. Without this hybrid model, a single tweet from a 100M-follower account would trigger 100M Redis writes within seconds — a write amplification Twitter measured as 100x the infrastructure cost of a normal user tweet.

Facebook TAO + mcrouter lease tokens

Facebook's TAO (The Associations and Objects) cache layer sits in front of MySQL for social graph data. To prevent stampedes on hot nodes (e.g., a viral post), TAO's mcrouter implements a lease mechanism: when a cache miss occurs, the first requesting client receives a 10ms exclusive lease token. All other clients requesting the same key during that window receive a "wait" response and retry after the lease holder has re-populated the cache from MySQL. This single lease-per-key guarantee collapses thousands of parallel DB fan-out queries into one, protecting MySQL from the thundering herd that a standard TTL expiry would cause on hot social objects.

Instagram write-behind for like counts — 10K likes/sec

Instagram processes like counts using a write-behind pattern. When a user likes a photo, the like count is incremented atomically in Redis with INCR photo:{id}:likes. An async worker pool reads deltas from Redis every 30 seconds and bulk-writes them to PostgreSQL with UPDATE photos SET likes_count = likes_count + $delta WHERE id = $id. This allows Instagram to absorb 10K+ likes/sec on viral posts without hitting PostgreSQL directly — a rate that would exhaust PostgreSQL row-lock capacity on a single high-traffic photo row.

Anti-Patterns

Querying Read Replicas for Immediate Post-Write Verification

Directing a user to a page that reads from a replica immediately after they submitted a form leads to confusing "missing data" bugs due to replication lag.

Write-Behind Caching Without a Durable Queue

Writing directly to an in-memory cache and relying on an in-memory background thread to update the database risks total data loss if the application container crashes.

Cache Invalidation Without Versioning

Deleting cache keys without using version tokens or CAS (Compare-And-Swap) operations leads to race conditions where stale database values overwrite fresh cache data.

Unbounded Fan-Out on Write

Implementing fan-out-on-write for users with millions of followers (the "celebrity problem") causes massive write spikes and system-wide latency.

Design Tradeoffs

DimensionFan-Out on Write (Push)Fan-Out on Read (Pull)
Read latencyO(1) lookup from a pre-computed per-user cache; feeds are served without any query-time computationSlower; requires fetching and merging data from multiple sources at query time
Write amplificationHigh; a single write is duplicated across every follower's cache — catastrophic at 100M+ followersLow; writes touch a single location; fan-out cost is deferred to read time
Storage and consistencyHigh storage and complex cache invalidation for inactive users who may never read their pre-computed feedSimple storage; data is computed and cached only when actually requested by an active user

Best Practices

Implement Session-Based RoutingTrack the timestamp of a user's last write in their session cookie, and route all reads to the primary database if the replica's replication lag is greater than the time elapsed since the write.
Use Single-Flight / Cache LeasesWhen a cache miss occurs for a hot key, use a distributed lock or local mutex so only one worker queries the database, preventing database collapse.
Batch Writes in Write-Behind QueuesGroup individual write events into multi-row inserts (e.g., INSERT INTO ... VALUES ...) to maximize database write throughput.
Add Jitter to Cache TTLsRandomize cache expiration times (e.g., 1 hour +/- 5 minutes) to prevent all cache keys from expiring simultaneously and creating a cache stampede.
Monitor Replication Lag MetricsSet up automated alerts that trigger when database replication lag exceeds acceptable thresholds (e.g., greater than 5 seconds).

When to Use / Avoid

Use WhenAvoid When
The application has a high read-to-write ratio (e.g., e-commerce catalogs, social media feeds).The application requires strict, immediate ACID consistency across all read operations.
Handling massive, unpredictable spikes in writes (e.g., IoT sensor data, real-time gaming telemetry).Writes must be immediately validated against complex database constraints before returning success.
Serving highly repetitive queries that can be naturally cached near the user.The dataset is highly dynamic, unique per request, and rarely read more than once.