CDN Architecture
- CDNs reduce latency by caching static and dynamic content at globally distributed edge servers close to users.
- Cache invalidation is a major operational challenge; use surrogate keys (Cache-Tags) for precise, real-time purging.
- An Origin Shield acts as a centralized caching layer to protect backend databases from "thundering herd" traffic spikes.
- Monitor Cache-Hit Ratio (CHR) as a primary metric; a low CHR indicates misconfigured TTLs or cache-busting queries.
The Problem
When a global user base accesses an application hosted in a single cloud region (e.g., US-East), users far from that region experience high latency: signals in fiber-optic cable are bounded by the speed of light, and every request has to cross the full distance to the origin and back. Static assets (images, JS, CSS) can take hundreds of milliseconds to load, degrading the user experience. Furthermore, during high-traffic events (like a product launch or breaking news), millions of concurrent requests for the same assets hit the origin servers simultaneously. This "thundering herd" problem quickly exhausts backend CPU, memory, and database connection pools, and can take the entire system down.
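The distance argument above can be made concrete with a little arithmetic. The sketch below assumes light travels through fiber at roughly two thirds of its vacuum speed (about 200,000 km/s); the distances are hypothetical, and real round trips are longer because of routing detours, queuing, and processing.

```python
# Assumption: light in optical fiber covers roughly 200,000 km per second.
FIBER_SPEED_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fiber, ignoring routing and processing."""
    return distance_km * 2 * 1000 / FIBER_SPEED_KM_PER_S


# Hypothetical fiber distances, for illustration only.
far_user_rtt_ms = min_round_trip_ms(15_000)  # user far from the single origin region
near_pop_rtt_ms = min_round_trip_ms(100)     # user close to an edge server

print(f"origin round trip: at least {far_user_rtt_ms:.0f} ms")
print(f"edge round trip:   at least {near_pop_rtt_ms:.0f} ms")
```

Loading a single asset usually needs several round trips (connection setup, TLS, then the request itself), so this lower bound is multiplied before the first byte arrives.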
Core System Idea
A Content Delivery Network (CDN) architecture solves this by placing a globally distributed network of proxy servers (Edge POPs) between the users and the origin server.
When a user requests an asset, the request is routed to the nearest edge server (nearest in network terms, which usually tracks geography) via Anycast or latency-based DNS routing. If the edge server has the asset cached (a cache hit), it returns it immediately, bypassing the origin entirely. If it is a cache miss, the edge server fetches the asset from the origin, caches it locally for future requests, and returns it to the user.
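The hit/miss flow described above can be sketched as a small in-memory cache. This is an illustrative model only, not how any particular CDN implements its edge: the `EdgeCache` class, the `fake_origin` function, and the fixed TTL are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, List, Tuple


class EdgeCache:
    """Toy edge server: serve from cache on a hit, fetch and store on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expires_at)
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            self.hits += 1           # cache hit: the origin is not contacted
            return entry[0]
        self.misses += 1             # cache miss (or expired entry): go to the origin
        body = self._fetch(path)
        self._store[path] = (body, now + self._ttl)
        return body


# Hypothetical origin that records every request it receives.
origin_calls: List[str] = []


def fake_origin(path: str) -> bytes:
    origin_calls.append(path)
    return b"contents of " + path.encode()


edge = EdgeCache(fake_origin, ttl_seconds=60)
edge.get("/static/app.js")  # miss: fetched from the origin and cached
edge.get("/static/app.js")  # hit: served from the edge
```

After these two calls the origin has been contacted once, which is the whole point of the design: repeated requests for the same asset stay at the edge until the TTL expires.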
To protect the origin from concurrent cache misses on different edge servers, an Origin Shield (an intermediate caching layer) is placed between the edge servers and the origin, consolidating duplicate requests into a single upstream call.
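The consolidation step is often called request coalescing. A minimal sketch follows, assuming a single-process shield with threads standing in for edge servers; the `OriginShield` class and `slow_origin` function are hypothetical, and the sketch omits TTLs and eviction for brevity.

```python
import threading
import time
from typing import Callable, Dict, List


class OriginShield:
    """Coalesces concurrent misses for the same key into one origin fetch."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._lock = threading.Lock()
        self._cache: Dict[str, bytes] = {}
        self._in_flight: Dict[str, threading.Event] = {}

    def get(self, key: str) -> bytes:
        while True:
            with self._lock:
                if key in self._cache:
                    return self._cache[key]
                event = self._in_flight.get(key)
                leader = event is None
                if leader:
                    # First caller for this key becomes the leader and fetches.
                    event = threading.Event()
                    self._in_flight[key] = event
            if leader:
                try:
                    body = self._fetch(key)
                    with self._lock:
                        self._cache[key] = body
                    return body
                finally:
                    with self._lock:
                        del self._in_flight[key]
                    event.set()  # wake followers whether the fetch succeeded or not
            else:
                # Follower: wait for the leader, then re-check the cache.
                # If the leader failed, one follower becomes the new leader.
                event.wait()


origin_calls: List[str] = []


def slow_origin(key: str) -> bytes:
    origin_calls.append(key)
    time.sleep(0.05)  # simulate a slow upstream fetch
    return key.encode()


shield = OriginShield(slow_origin)
edges = [threading.Thread(target=shield.get, args=("/video.mp4",)) for _ in range(20)]
for t in edges:
    t.start()
for t in edges:
    t.join()
```

Twenty simulated edge servers miss at the same moment, but only the leader reaches the origin; the others block until the result is cached.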
System Flow
The CDN routes client requests to the nearest Edge Server, utilizing an Origin Shield to consolidate cache misses and protect the Origin Server from traffic spikes.
Real-World Examples (Indicative)
Cloudflare allows up to 30 Cache-Tag values per response via an HTTP header. When a product record updates, a single POST /zones/{zone_id}/purge_cache API call with {"tags": ["product-123"]} propagates the purge to all 300+ PoPs globally within ~150ms. GitHub uses this mechanism for documentation pages: when a PR merges, a webhook triggers a tag purge that invalidates all cached pages under the affected repository path, so users do not see stale rendered content.
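The purge call described above might be assembled as follows. This sketch only builds the HTTP request and does not send it; the base URL and bearer-token authentication are assumptions not stated in the text, and the zone ID and token are placeholders.

```python
import json
import urllib.request
from typing import List

# Assumption: Cloudflare's v4 API base URL with bearer-token authentication.
CLOUDFLARE_API = "https://api.cloudflare.com/client/v4"


def build_tag_purge_request(zone_id: str, api_token: str,
                            tags: List[str]) -> urllib.request.Request:
    """Build (but do not send) a purge-by-tag request for one zone."""
    url = f"{CLOUDFLARE_API}/zones/{zone_id}/purge_cache"
    body = json.dumps({"tags": tags}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder credentials; substitute real values before sending.
request = build_tag_purge_request("ZONE_ID_PLACEHOLDER", "TOKEN_PLACEHOLDER",
                                  ["product-123"])
# To send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the purge step easy to unit-test inside a deployment or webhook pipeline.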
GitHub Pages serves static sites through Fastly. The GitHub Pages origin sets Surrogate-Key: repo-{id} user-{id} headers on every response. When a git push event occurs, GitHub's internal webhook service calls Fastly's instant purge API with the repo surrogate key, invalidating only that repository's pages globally in under 150ms. VCL (Varnish Configuration Language) enforces cache segmentation: authenticated GitHub.com requests carrying session cookies are kept out of the shared cache (Vary: Cookie keys cached responses by cookie value), while anonymous requests are cached with stale-while-revalidate=86400.
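To illustrate why surrogate keys make purging precise, here is a toy model of a cache that indexes stored objects by the space-separated keys in a Surrogate-Key header. The `SurrogateKeyCache` class, the URLs, and the key values are hypothetical; this is a sketch of the idea, not Fastly's implementation.

```python
from collections import defaultdict
from typing import Dict, Set


class SurrogateKeyCache:
    """Toy cache that supports purging every object tagged with a given key."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_key: Dict[str, Set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, surrogate_key_header: str) -> None:
        self._objects[url] = body
        # A Surrogate-Key header carries one or more space-separated keys.
        for key in surrogate_key_header.split():
            self._urls_by_key[key].add(url)

    def purge_key(self, key: str) -> int:
        """Remove all objects tagged with `key`; return how many were removed."""
        purged = 0
        for url in self._urls_by_key.pop(key, set()):
            if self._objects.pop(url, None) is not None:
                purged += 1
        return purged

    def has(self, url: str) -> bool:
        return url in self._objects


cache = SurrogateKeyCache()
cache.store("/site-a/index.html", b"<html>a</html>", "repo-1 user-7")
cache.store("/site-a/about.html", b"<html>about</html>", "repo-1 user-7")
cache.store("/site-b/index.html", b"<html>b</html>", "repo-2 user-7")

purged = cache.purge_key("repo-1")  # simulates the purge triggered by a push to repo 1
```

Only the two pages tagged `repo-1` are removed; the page for `repo-2` stays cached even though it shares the `user-7` key.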
Netflix delivers 15+ petabytes of content per day through its Open Connect CDN. Video files are stored on Amazon S3 in us-east-1. Instead of each of 1,000+ global PoPs hitting S3 directly on a cache miss, Netflix routes all PoP misses through a single Origin Shield cluster in us-east-1. This collapses N concurrent S3 requests into one per unique asset, reducing S3 GET request volume by ~70% and saving millions of dollars in egress costs annually. For major title releases, Netflix pre-positions content by proactively pushing files to PoPs before user traffic arrives, eliminating the cold-start thundering herd.
Anti-Patterns
- Misconfiguring cache headers such that private user data (e.g., /dashboard or user-specific JSON responses) is cached at the edge and served to other users.
- Setting extremely long TTL values on assets without implementing a programmatic cache invalidation strategy, leaving users stuck with stale content after deployments.
- Allowing arbitrary query parameters (like ?timestamp=12345) to bypass the cache, which forces the CDN to treat every request as a cache miss, destroying the Cache-Hit Ratio.
- Failing to configure an Origin Shield during high-traffic events, allowing hundreds of edge servers to query the origin simultaneously for the same expired asset.
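One common guard against the query-parameter anti-pattern is to normalize the cache key so that only parameters known to change the response are kept. The sketch below is one possible approach; the `ALLOWED_PARAMS` allowlist and the example URLs are hypothetical and would depend on the application.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical allowlist: only these parameters actually change the response.
ALLOWED_PARAMS = frozenset({"v", "lang"})


def normalize_cache_key(url: str, allowed=ALLOWED_PARAMS) -> str:
    """Drop unknown query parameters and sort the rest for a stable cache key."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in allowed
    )
    query = urlencode(kept)
    return parts.netloc + parts.path + ("?" + query if query else "")


key_a = normalize_cache_key("https://example.com/app.js?timestamp=12345")
key_b = normalize_cache_key("https://example.com/app.js?timestamp=67890")
key_c = normalize_cache_key("https://example.com/app.js?v=2&timestamp=1&lang=en")
```

Here `key_a` and `key_b` collapse to the same key, so the second request can be a cache hit, while `key_c` remains distinct because `v` and `lang` are on the allowlist.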
Design Tradeoffs
| Dimension | Edge CDN Caching | Direct Origin Delivery |
|---|---|---|
| Latency | Sub-5ms response from the nearest PoP for cached assets; reduces cross-continental round-trips from 150ms+ to under 10ms | Full RTT to the origin datacenter on every request; 150-300ms for users in distant regions |
| Freshness control | Stale content risk if TTLs are misconfigured or purge fails; requires Cache-Tag pipelines and surrogate key discipline | Always fresh; the origin response is authoritative with no intermediate caching layer between user and data |
| Origin cost | 70-95% reduction in origin request volume (Netflix ~70% via Origin Shield); major egress bandwidth savings | Full origin bandwidth and compute cost on every request; no caching benefit to amortize across users |
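The origin-cost row follows directly from the Cache-Hit Ratio: every request that is not a hit ends up at the origin. A rough sketch of that relationship, using made-up request counts and ignoring effects such as revalidation requests and uncacheable traffic:

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """CHR = hits / (hits + misses); defined as 0.0 when there is no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


def origin_requests(total_requests: int, chr_value: float) -> int:
    """Approximate number of requests that still reach the origin."""
    return round(total_requests * (1 - chr_value))


# Hypothetical counters, for illustration only.
chr_value = cache_hit_ratio(hits=900_000, misses=100_000)
to_origin = origin_requests(1_000_000, chr_value)

print(f"CHR: {chr_value:.0%}, requests reaching origin: {to_origin:,}")
```

Because origin load scales with (1 - CHR), small CHR improvements at the high end matter a lot: moving from 90% to 95% halves the requests that reach the origin.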
Best Practices
- Tag responses with surrogate keys (e.g., Cache-Tag: product-123). This allows you to purge thousands of related pages with a single API call when that entity updates.
- Set Cache-Control: max-age=600, stale-while-revalidate=30 to allow the CDN to serve stale content immediately while asynchronously fetching a fresh copy from the origin in the background.
When to Use / Avoid
| Use When | Avoid When |
|---|---|
| You have a globally distributed user base and serve static assets, media, or semi-static API responses. | Your application is used strictly within a single, localized corporate network or intranet. |
| You experience highly unpredictable traffic spikes (e.g., media sites, e-commerce, public APIs). | Your data is highly dynamic, personalized per user, and changes on every single request (e.g., real-time stock trading dashboards). |
| You want to reduce cloud egress costs by offloading bandwidth-heavy asset delivery to a CDN. | You do not have the operational capacity to manage cache invalidation pipelines and debug caching issues. |