
Authentication at Scale

Stateless JWTs eliminate database lookups for token verification but cannot be instantly revoked — the hybrid approach uses short-lived JWTs (5–15 min) verified in-process, combined with stateful refresh tokens rotated on every use to bound the compromise window.

TL;DR
  • Stateless JWTs eliminate database lookups for token verification but cannot be instantly revoked. The compromise window equals the token's TTL — keep access tokens at 5–15 minutes, never hours.
  • Auth0's refresh token rotation with reuse detection: when a refresh token is used, it is immediately invalidated and a new one issued. If the old (already-used) token is presented again, the entire token family is revoked — indicating the token was stolen and replayed.
  • Google's internal "Passport" token propagates user identity through the service call chain as a signed binary proto — each service validates in-process using a cached public key, with zero network calls to a central auth service.
  • Never store sensitive data (PII, permissions, role assignments) in the JWT payload. JWTs are base64url-encoded, not encrypted: anyone who intercepts the token can read the payload with a trivial decode, no key required.
  • The auth service sits on the critical path for login, token refresh, and signing-key distribution. Deploy it multi-zone HA and cache JWKS keys at the API gateway. If services instead call it synchronously on every request, even a 500ms auth service outage fails every API request for its duration.

The Problem

A company's monolith used server-side sessions stored in a single Redis instance. Migrating to microservices, they switched to stateless JWTs with 24-hour expiry to avoid Redis as a dependency. Three months later, a security incident requires revoking all active tokens for a compromised user account. There is no revocation mechanism, so the attacker retains access for up to 24 hours after the password is reset. The team builds an emergency Redis blacklist in production under incident pressure, discovers it adds 3ms of Redis latency to every request, and later finds that 30 services bypass it because they validate JWTs themselves and never reach the gateway's blacklist check.

Core System Idea

Authentication at scale uses a two-token hybrid:

(1) Short-lived access token (JWT): 5–15 minute TTL, validated in-process at every service using JWKS public keys cached in memory. No network call is required for verification. If the access token is stolen, the attacker has at most 15 minutes of access.

(2) Long-lived refresh token: stored server-side in Redis or a database, valid for days or weeks. It is used only to obtain a new access token when the current one expires. The refresh token is stateful and can be instantly revoked.

At verification time, the API gateway validates the JWT signature using the cached JWKS public key (in-process, <1ms). For instant revocation before expiry, the JWT's jti claim (unique token ID) is added to a Redis Bloom filter blacklist. A Bloom filter provides probabilistic membership testing: it never misses a revoked jti, and its false positive rate (legitimate tokens incorrectly flagged) is tunable through filter size, for example 0.1%. A positive hit can be confirmed against an exact revocation list so that legitimate tokens are not rejected.

Refresh token rotation: every time a refresh token is used, it is invalidated and a new one is issued. If a token that was already used is presented again (replay detection), the entire token family is immediately revoked, because a replay indicates the refresh token was stolen after initial issuance.
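The jti blacklist can be sketched with a small in-memory Bloom filter. This is an illustration under stated assumptions, not the Redis-backed structure itself: the class name, the SHA-256 double-hashing scheme, and the example capacity of 100,000 revoked tokens are all choices made for this sketch. The sizing uses the standard formulas m = -n·ln(p)/(ln 2)² bits and k = (m/n)·ln 2 hash functions.

```python
import hashlib
import math


class BloomFilter:
    """In-memory Bloom filter; a stand-in for a Redis-backed one (assumption)."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n*ln(p)/(ln 2)^2 bits, k = (m/n)*ln 2 hashes.
        self.num_bits = math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step, never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not revoked"; True means "possibly revoked".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical usage: 100k revoked tokens at a 0.1% false positive rate.
revoked = BloomFilter(expected_items=100_000, false_positive_rate=0.001)
revoked.add("jti-stolen-123")
print(revoked.might_contain("jti-stolen-123"))  # a revoked jti is never missed
print(revoked.num_bits // 8, "bytes,", revoked.num_hashes, "hashes")
```

Because a filter of this kind has no false negatives, a negative answer lets the request proceed without any further lookup; only the rare positive answer needs a second check against an exact store.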

System Flow

flowchart TD
    A["Client"] -- "1. Request with JWT" --> B["API Gateway"]
    B -- "2. Local Crypto Verify" --> B
    B -- "3. Check Blacklist" --> C[("Redis Bloom Filter")]
    C -- "4. If Valid" --> D["Backend Service"]
    A -- "5. Token Expired: Send Refresh" --> E["Auth Service"]
    E -- "6. Rotate and Issue New Pair" --> A

The API Gateway verifies the JWT signature in-process and then checks the Bloom filter blacklist; refresh token rotation happens out-of-band at the Auth Service, only when access tokens expire.

Real-World Examples (Indicative)

Google's Passport token for service identity

Google's internal authentication uses a "Passport" — a short-lived (~10-minute) signed binary proto that carries user identity, service identity (which binary is making the call, at which version), and a chain of custody recording which services have handled the request. Passport propagates through the entire service call chain: Service A's Passport is attached to the call to Service B, which verifies the signature in-process using Google's LOAS (Low Overhead Authentication System) cached public key — zero calls to a central auth service per request. The chain-of-custody property prevents privilege escalation: a compromised low-privilege service cannot forge a Passport claiming to be a high-privilege service, because the cryptographic chain would be invalid.

Auth0 refresh token rotation with reuse detection

Auth0 implements refresh token rotation, as described in RFC 6749 (Section 10.4), with automatic family revocation. Indicative configuration: rotateRefreshTokens: true, reuseInterval: 0 (rotation on, no grace period for reuse). When a refresh token is used, Auth0 issues a new access token (TTL=900s), invalidates the old refresh token, and issues a new refresh token. The old token is marked as consumed in Auth0's token store. If the consumed token is presented again, Auth0 immediately revokes every token in the family, meaning all refresh tokens descended from the same original token. This detects token theft: once both the legitimate client and the attacker hold a copy, whichever of them presents an already-consumed token triggers family revocation, forcing the legitimate user to re-authenticate (an acceptable security tradeoff).
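The rotation and reuse-detection behavior described above can be sketched as follows. This is a minimal in-memory model, not Auth0's implementation: the RefreshTokenStore class, its method names, and the use of PermissionError are assumptions made for illustration, and a real store would persist records and expire them.

```python
import secrets


class RefreshTokenStore:
    """In-memory stand-in for a server-side refresh token store (hypothetical)."""

    def __init__(self):
        self._tokens = {}  # token -> {"family": str, "consumed": bool}
        self._revoked_families = set()

    def issue(self, family_id=None) -> str:
        # A login starts a new family; a rotation stays in the existing one.
        family_id = family_id or secrets.token_hex(8)
        token = secrets.token_urlsafe(32)
        self._tokens[token] = {"family": family_id, "consumed": False}
        return token

    def rotate(self, token: str) -> str:
        record = self._tokens.get(token)
        if record is None or record["family"] in self._revoked_families:
            raise PermissionError("unknown or revoked refresh token")
        if record["consumed"]:
            # Replay of an already-used token: revoke the whole family.
            self._revoked_families.add(record["family"])
            raise PermissionError("refresh token reuse detected; family revoked")
        record["consumed"] = True
        return self.issue(record["family"])


store = RefreshTokenStore()
rt1 = store.issue()       # issued at login
rt2 = store.rotate(rt1)   # legitimate client refreshes; rt1 is now consumed
try:
    store.rotate(rt1)     # a stolen copy of rt1 is replayed
except PermissionError as err:
    print(err)
try:
    store.rotate(rt2)     # the newest token is dead too: the family is revoked
except PermissionError as err:
    print(err)
```

Keeping consumed tokens in the store, rather than deleting them, is what makes the replay distinguishable from a random invalid token.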

Shopify's 1-minute JWT for embedded apps

Shopify's Admin embedded apps (running in iframes within the Shopify Admin) use JWTs with a 60-second TTL via the @shopify/app-bridge SDK. The extremely short TTL bounds the revocation window: a compromised token grants at most 60 seconds of access. The tradeoff: the embedded app calls getSessionToken() on every API request to obtain a fresh token, adding one round-trip to each interaction. The app's backend verifies each JWT's signature along with its iss (issuer), dest (shop domain), aud (client ID), and exp (expiry) claims. This 5-step validation runs in-process using the app's shared HMAC secret, without calling any external service.
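A 5-step check of this shape can be sketched with the Python standard library alone. This is an illustration, not Shopify's code: the function names, the example shop domain, the client ID, and the rule that iss and dest must share a hostname are assumptions made for the sketch, and production code would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from urllib.parse import urlparse


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build an HS256 JWT; only needed here to produce a token to verify."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"


def verify_session_token(token: str, secret: bytes, client_id: str, now=None) -> dict:
    now = time.time() if now is None else now
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    # Step 1: signature, compared in constant time.
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    # Step 2: expiry.
    if claims["exp"] <= now:
        raise ValueError("token expired")
    # Step 3: audience must be this app's client ID.
    if claims["aud"] != client_id:
        raise ValueError("wrong audience")
    # Steps 4 and 5: issuer and destination must name the same shop.
    iss_host = urlparse(claims["iss"]).hostname
    dest_host = urlparse(claims["dest"]).hostname
    if not iss_host or iss_host != dest_host:
        raise ValueError("issuer and destination do not match")
    return claims


secret = b"hypothetical-app-secret"
token = sign_hs256(
    {
        "iss": "https://example-shop.myshopify.com/admin",
        "dest": "https://example-shop.myshopify.com",
        "aud": "hypothetical-client-id",
        "exp": int(time.time()) + 60,
    },
    secret,
)
print(verify_session_token(token, secret, "hypothetical-client-id")["dest"])
```

Every step works on data already held in process memory, which is why no network call is needed per request.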

Anti-Patterns

Long-lived stateless JWTs

Issuing JWTs with 24-hour or 7-day expiry without a revocation mechanism. A stolen token grants access for the full TTL — an attacker who extracts a JWT from a compromised device or XSS payload retains access for days after the user changes their password.

Calling the auth service on every request

Each microservice making a synchronous HTTP call to the auth service to validate every incoming JWT. At 10K RPS, with each request passing through 20 services, this generates 10K × 20 = 200K auth service calls/second, and the auth service becomes the bottleneck for the entire platform. Validate JWTs in-process using cached JWKS public keys.
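One way to cache JWKS keys in-process is sketched below. The JwksCache class, its parameters, and the injected fetch_keys callable are all hypothetical; keys are represented as opaque strings keyed by kid, where a real implementation would parse them into public key objects. The sketch refreshes when the TTL lapses or when an unknown kid appears (rate-limited), and keeps serving the last known keys if the identity provider is unreachable.

```python
import time
from typing import Callable, Dict, Optional


class JwksCache:
    """TTL cache for signing keys with stale-on-error fallback (a sketch)."""

    def __init__(
        self,
        fetch_keys: Callable[[], Dict[str, str]],  # hypothetical fetcher: kid -> key
        ttl_seconds: float = 3600.0,
        min_refresh_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch_keys = fetch_keys
        self._ttl = ttl_seconds
        self._min_refresh = min_refresh_seconds
        self._clock = clock
        self._keys: Dict[str, str] = {}
        self._fetched_at: Optional[float] = None

    def get_key(self, kid: str) -> Optional[str]:
        now = self._clock()
        age = None if self._fetched_at is None else now - self._fetched_at
        expired = age is None or age >= self._ttl
        # An unknown kid may mean the keys were rotated; refetch, but rate-limit
        # it so that tokens with bogus kids cannot force a fetch per request.
        unknown_kid = kid not in self._keys and (
            age is None or age >= self._min_refresh
        )
        if expired or unknown_kid:
            self._refresh(now)
        return self._keys.get(kid)

    def _refresh(self, now: float) -> None:
        try:
            self._keys = dict(self._fetch_keys())
            self._fetched_at = now
        except Exception:
            if self._fetched_at is None:
                raise  # nothing cached yet, so there is nothing safe to serve
            # Otherwise keep serving the last known-good keys.


# Hypothetical usage with a stubbed fetcher instead of a real HTTP call.
cache = JwksCache(lambda: {"key-2024": "-----BEGIN PUBLIC KEY-----..."})
print(cache.get_key("key-2024") is not None)
print(cache.get_key("unknown-kid"))
```

With a cache of this kind, the identity provider sees roughly one key fetch per instance per TTL rather than one call per request.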

Sensitive data in JWT payload

Storing user permissions, role assignments, PII, or pricing data in JWT claims. JWTs are base64url-encoded, not encrypted — the payload is trivially readable. Encrypt sensitive claims using JWE (JSON Web Encryption), or include only a user ID and look up permissions at the service layer.
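To make the point concrete, the sketch below reads the claims of a token without any key or signature check. The function name and the example claims are invented for illustration, and the token carries a placeholder signature precisely because the signature plays no part in reading the payload.

```python
import base64
import json


def read_claims_without_key(token: str) -> dict:
    """Decode the middle segment of a JWT; no secret or public key involved."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url_json(data: dict) -> str:
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical token with sensitive claims and a placeholder signature.
token = ".".join(
    [
        _b64url_json({"alg": "RS256", "typ": "JWT"}),
        _b64url_json({"sub": "user-42", "role": "admin", "email": "a@example.com"}),
        "signature-is-not-needed-to-read-the-payload",
    ]
)

print(read_claims_without_key(token))
```

Anything placed in the claims is therefore visible to every proxy, log pipeline, and browser extension that sees the token.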

Refresh token without reuse detection

Issuing new refresh tokens on exchange without invalidating the old one. If a refresh token is stolen, the attacker silently maintains access indefinitely: both the legitimate user and the attacker hold valid refresh tokens, and each can keep obtaining new access tokens.

Design Tradeoffs

| Dimension | Stateless JWT | Stateful Session |
| --- | --- | --- |
| Verification overhead | In-process signature check (RSA with a cached public key, or HMAC with a shared secret): <1ms, zero network calls | Redis or database lookup per request: 1–5ms added latency |
| Revocation | Complex: requires blacklist (Redis Bloom filter) or waiting for TTL expiry | Instant: delete session key from Redis; next request fails immediately |
| Token size | 200–500 bytes per request (header + claims + signature) | 32–64 bytes (session ID only); payload lives server-side |
| Horizontal scaling | Trivial: any instance verifies any token using the shared public key | Requires shared Redis cluster reachable from all application instances |

Best Practices

Keep access token TTL at 5–15 minutes. This is the revocation window for any compromised token without a blacklist. The refresh token handles session continuity — users do not need to re-authenticate every 15 minutes.
Cache the identity provider's JWKS public keys at the API gateway with a 1-hour TTL and background refresh. Never fetch JWKS on every request — a JWKS endpoint outage would take down token verification for every API call.
Implement refresh token rotation with reuse detection. Use a token family ID: when a replay is detected (consumed token presented again), revoke the entire family. This is the only mechanism that detects refresh token theft after the initial compromise.
Store web client tokens in HttpOnly, Secure, SameSite=Strict cookies, not in localStorage. XSS attacks can read localStorage; they cannot read HttpOnly cookies. Mobile clients should store tokens in OS-provided secure storage (iOS Keychain, Android Keystore).
Treat the auth service as a Tier 1 dependency: deploy it multi-zone with a load balancer, set connection timeouts of 50ms at the gateway, and define a fallback behavior (fail closed: deny access if auth is unreachable) with clear runbooks.
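As an illustration of the cookie recommendation above, the following sketch builds a Set-Cookie header value with Python's standard http.cookies module (the samesite attribute requires Python 3.8 or later). The cookie name, the /auth/refresh path, and the 7-day lifetime are assumptions chosen for the example.

```python
from http.cookies import SimpleCookie


def refresh_cookie_header(token: str, max_age_seconds: int) -> str:
    """Return a Set-Cookie header value for a refresh token (sketch)."""
    cookie = SimpleCookie()
    cookie["refresh_token"] = token  # hypothetical cookie name
    morsel = cookie["refresh_token"]
    morsel["httponly"] = True        # not readable from JavaScript
    morsel["secure"] = True          # only sent over HTTPS
    morsel["samesite"] = "Strict"    # not sent on cross-site requests
    morsel["path"] = "/auth/refresh" # hypothetical path: only sent to the refresh endpoint
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()


print(refresh_cookie_header("example-opaque-token", 7 * 24 * 3600))
```

Scoping the cookie path to the refresh endpoint keeps the refresh token out of ordinary API requests, which narrows where it can leak.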

When to Use / Avoid

| Use When | Avoid When |
| --- | --- |
| Running a distributed microservices architecture where every service independently verifies tokens without calling a central auth service | Building a simple monolithic application where a single Redis session store is operationally straightforward and 1–5ms lookup latency is acceptable |
| Supporting multiple client types (web, mobile, third-party integrations) with a unified OAuth 2.0 token standard | Application requires absolute instant revocation with zero latency tolerance: stateful sessions guarantee this, while JWTs with blacklists add 1–5ms Redis latency |
| Service teams need to verify user identity independently without sharing a database dependency | Team lacks the expertise to securely implement JWKS rotation, token family revocation, and secure cookie configuration |