Consistent Hashing
Minimize data movement during cluster scaling by using consistent hashing instead of modulo hashing.
- Minimize data movement during cluster scaling by using consistent hashing instead of modulo hashing.
- Prevent load imbalance across physical nodes by mapping them to multiple virtual nodes on the hash ring.
- Route requests with O(log N) lookup complexity using client-side or proxy-side ring routing.
- Mitigate cache stampedes and hot spots by dynamically replicating highly requested keys across adjacent ring nodes.
The Problem
When scaling a cluster of caching servers (like Memcached) or distributed databases, traditional modulo hashing (hash(key) mod N) fails catastrophically when the number of nodes (N) changes. If a single node crashes or a new node is added, almost every key hashes to a different node.
This triggers a massive cache wipeout, forcing downstream databases to handle a sudden flood of read requests, which often leads to database exhaustion and system-wide cascading failures.
Core System Idea
Consistent hashing solves this by mapping both the keys and the physical nodes to a circular hash ring with a fixed hash space (e.g., 0 to 2^32 - 1). To find which node stores a key, the system hashes the key and traverses the ring clockwise until it encounters the first node whose hash is greater than or equal to the key's hash.
When a node is added or removed, only 1/N of the keys need to be remapped to different physical servers.
To prevent uneven distribution of keys (load imbalance) caused by non-uniform node distribution on the ring, consistent hashing introduces "virtual nodes" (vnodes). Each physical node is assigned multiple virtual nodes spread across the ring. This ensures that if a physical node fails, its load is distributed evenly among all remaining nodes rather than overloading a single clockwise neighbor.
System Flow
Consistent hashing ring routing a key to its clockwise virtual node, minimizing data movement during node additions.
Real-World Examples Indicative
Apache Cassandra uses MurmurHash3 to place 128 virtual tokens per node on a hash ring ranging from -(2^63) to +(2^63)-1. When a new node joins a 10-node cluster, it claims ~128 of the 1,280 total tokens and receives only the data for those token ranges rather than a full copy. Ring convergence — all nodes agreeing on the new topology via gossip — completes in ~10 seconds for a 100-node cluster. DataStax benchmarks confirm that adding one node to a 10-node cluster transfers exactly ~10% of total data, matching the theoretical 1/N prediction and avoiding the full-cluster reshuffles that modulo hashing would require.
Discord's Memcached tier uses a consistent hashing client with 150 virtual nodes per physical server. Before switching from modulo hashing in 2017, a single server failure invalidated ~92% of cached game session data as nearly every key remapped to a different node. With 150 vnodes, a node failure now remaps only the 1/N keys owned by that node to its clockwise neighbors. During a 2022 major game-launch peak, this prevented the cascading database overload that the old modulo scheme would have triggered — only the affected node's key range hit the database as a cold-start miss.
DynamoDB uses consistent hashing internally to distribute partition keys across storage nodes as virtual node ranges. When partition monitoring detects a hot partition receiving disproportionate traffic, adaptive capacity automatically remaps that virtual node range to a dedicated isolated node — no customer action required. During the 2020 AWS re:Invent workload surge, DynamoDB performed 3,000+ automatic hot partition splits per hour, redistributing read/write load without any customer-visible changes or key migrations, because each virtual node can be reassigned to a new physical node via a ring metadata update alone.
Anti-Patterns
Configuring a low number of virtual nodes (e.g., fewer than 50 per physical node) leads to severe load imbalance, where some servers run out of memory while others sit idle.
Treating servers with different CPU/RAM specifications equally on the ring causes weaker servers to crash under standard load.
Hardcoding the list of nodes in client applications prevents dynamic scaling and requires full application redeployments to add or remove servers.
Allowing split-brain scenarios where different clients have different views of the hash ring leads to inconsistent writes and cache misses.
Design Tradeoffs
| Dimension | Consistent Hashing | Modulo Hashing |
|---|---|---|
| Key remapping on resize | Only 1/N keys remapped to clockwise neighbors; remaining cache stays valid across the cluster | Nearly 100% of keys remap to different nodes on any cluster size change, causing a full cache wipeout |
| Lookup complexity | O(log N) binary search on a sorted ring; slight overhead in client or proxy ring traversal | O(1); compute hash mod N — no ring data structure or sorted search required |
| Load uniformity | With 128+ virtual nodes per server, load variance across physical nodes stays within ~5% of average | Uniform only when cluster size is fully static; any addition or removal destroys key distribution |
| Hot node failure blast radius | Failed node's load distributes across all servers proportionally via vnode spreading | All traffic for the failed node's keys hits a single clockwise neighbor, potentially causing cascading overload |
Best Practices
When to Use / Avoid
| Use When | Avoid When |
|---|---|
| Building large-scale distributed caching layers where node membership changes dynamically. | The cluster size is completely static and nodes are never added or removed. |
| Designing distributed, partition-tolerant databases that require horizontal scaling. | The dataset is small enough to fit on a single machine, making distribution unnecessary. |
| Implementing load balancers that require session affinity without storing session state. | The application requires strict, global ordering of keys across all physical nodes. |