GraphQL vs REST at Scale

TL;DR
  • REST APIs offer predictable performance and native HTTP caching but suffer from overfetching and underfetching.
  • GraphQL eliminates overfetching by letting clients request exact fields, but naive resolvers introduce the N+1 database query problem.
  • GraphQL queries are usually sent as POST requests with dynamic payloads, so CDNs do not cache them at the HTTP edge layer by default.
  • Scale GraphQL safely by implementing strict query depth limiting, cost analysis, and DataLoader batching.

The Problem

As client applications grow more complex, REST APIs force developers into a difficult trade-off. They must either build highly specific endpoints for each view to avoid overfetching (returning unused fields wastes bandwidth), or keep generic endpoints that force clients into multiple sequential round trips (underfetching, which slows page loads).

Teams that migrate to GraphQL to solve this often overload their databases instead. A naive resolver runs a separate database query for every item in a parent list: one query for the list plus N queries for its children (the N+1 problem). A single malicious or poorly written client query can then bring down the entire database.

Core System Idea

The choice between GraphQL and REST at scale is a trade-off between client flexibility and server-side predictability. REST exposes structured, resource-oriented endpoints (e.g., /users/:id) that map cleanly to database entities and leverage standard HTTP caching (via ETag or Cache-Control headers) at the CDN level.
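The caching behavior described above can be sketched as a conditional GET. This is a minimal sketch rather than a real HTTP server: `get_user`, `make_etag`, `USERS`, and the 60-second `max-age` are hypothetical names and values chosen for illustration, and the handler compares a single `If-None-Match` value instead of handling lists or weak validators.

```python
import hashlib
import json


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def get_user(user_id, request_headers, db):
    """Hypothetical handler for GET /users/:id. Returns (status, headers, body)."""
    body = json.dumps(db[user_id], sort_keys=True).encode("utf-8")
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=60"}
    # A CDN or client revalidates by echoing the ETag in If-None-Match.
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # unchanged: no body is sent
    return 200, headers, body


USERS = {"42": {"id": "42", "name": "Ada"}}

# The first request gets the full body; the revalidation gets 304 and no body.
status, headers, body = get_user("42", {}, USERS)
status2, _, body2 = get_user("42", {"If-None-Match": headers["ETag"]}, USERS)
```

Because the URL alone identifies the resource, any intermediary cache can store and revalidate the response without knowing anything about the API.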

GraphQL exposes a single endpoint (usually /graphql) and uses a schema definition language to let clients query arbitrary graphs of data. To make GraphQL safe at scale, the server must implement a batching and caching layer (like the DataLoader pattern) to coalesce individual database lookups into single batched queries. Additionally, the gateway must parse and analyze incoming query documents, rejecting queries that exceed safe depth or complexity thresholds before they are executed.
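The batching idea can be illustrated with a small asyncio loader. This is a simplified sketch in the spirit of the DataLoader pattern, not the DataLoader library itself: `BatchLoader`, `fetch_authors`, `resolve_author`, and `QUERIES` are hypothetical names, and the database is simulated by a list that records each round trip.

```python
import asyncio


class BatchLoader:
    """Coalesces load() calls made in the same event-loop tick into one
    batch_fn call, and caches results by key for the loader's lifetime."""

    def __init__(self, batch_fn):
        self.batch_fn = batch_fn  # async function: list of keys -> list of values
        self.cache = {}           # key -> Future holding the value
        self.queue = []           # keys waiting for the next dispatch
        self._task = None         # keeps a reference to the running dispatch

    def load(self, key):
        if key not in self.cache:
            loop = asyncio.get_running_loop()
            self.cache[key] = loop.create_future()
            if not self.queue:
                # First key of this tick: dispatch once the resolvers that
                # are already scheduled have had a chance to enqueue keys.
                loop.call_soon(self._start_dispatch)
            self.queue.append(key)
        return self.cache[key]

    def _start_dispatch(self):
        self._task = asyncio.ensure_future(self._dispatch())

    async def _dispatch(self):
        keys, self.queue = self.queue, []
        try:
            values = await self.batch_fn(keys)
        except Exception as exc:
            for key in keys:
                self.cache[key].set_exception(exc)
            return
        for key, value in zip(keys, values):
            self.cache[key].set_result(value)


QUERIES = []  # records each simulated database round trip


async def fetch_authors(ids):
    QUERIES.append(list(ids))  # stands in for SELECT ... WHERE id IN (...)
    return [{"id": i, "name": f"author-{i}"} for i in ids]


async def resolve_author(post, loader):
    return await loader.load(post["author_id"])


async def main():
    loader = BatchLoader(fetch_authors)  # one loader per incoming request
    posts = [{"id": n, "author_id": n % 3} for n in range(6)]
    return await asyncio.gather(*(resolve_author(p, loader) for p in posts))


authors = asyncio.run(main())
```

Six resolver calls produce a single batched lookup for three distinct keys. Creating the loader per request keeps the cache from leaking data between users.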

System Flow

flowchart TD
    Client[Client Query] --> Gateway[GraphQL Gateway]
    Gateway -- "1. Parse and Analyze" --> Cost[Query Cost Analyzer]
    Cost -- "2. If Safe: Execute" --> Resolvers[Field Resolvers]
    Resolvers -- "3. Batch Requests" --> DataLoader[DataLoader Layer]
    DataLoader -- "4. Single Batched Query" --> DB[(Database)]

The GraphQL Gateway analyzes query complexity and depth before execution, using a DataLoader layer to batch nested database requests and prevent N+1 query issues.

Real-World Examples (Indicative)

GitHub GraphQL API v4

GitHub's GraphQL API computes limits from the query document before execution. Every connection must carry a first or last argument, and GitHub multiplies those arguments down the query tree to bound how many nodes a call can return. A query like repositories(first:100) { issues(first:100) { nodes { id } } } can return up to 100 + 100 × 100 = 10,100 nodes. Calls that exceed the per-call node limit are rejected, and each accepted call is charged points against a rate limit of 5,000 points per hour. Clients can request the rateLimit object (cost, remaining, nodeCount, resetAt) in a query to tune it before hitting the limit.
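The 10,100 figure follows from multiplying connection sizes down the query tree. The sketch below reproduces that arithmetic under stated assumptions: the query is given as a hand-built nested structure rather than parsed GraphQL, and `count_nodes` and the `MAX_NODES` budget are hypothetical, not GitHub's implementation or its real limit.

```python
def count_nodes(connections, parents=1):
    """Upper bound on returned nodes: each connection can yield `first`
    nodes for every parent node above it."""
    total = 0
    for conn in connections:
        here = parents * conn["first"]
        total += here + count_nodes(conn.get("children", []), here)
    return total


# repositories(first:100) { issues(first:100) { nodes { id } } }
query = [
    {
        "name": "repositories",
        "first": 100,
        "children": [{"name": "issues", "first": 100}],
    }
]

MAX_NODES = 5_000  # hypothetical budget for this sketch

nodes = count_nodes(query)    # 100 + 100 * 100
allowed = nodes <= MAX_NODES  # checked before any resolver runs
```

The check needs only the query document and its pagination arguments, which is why it can run before execution touches the database.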

Shopify Storefront API

All production Shopify Storefront API calls use persisted queries: the client sends a SHA-256 hash of a query document that was registered at build time via POST /graphql/persist. At runtime, Shopify's Fastly CDN serves GET requests using the hash as the cache key, achieving a ~95% cache hit ratio on standard product page queries. An unregistered query hash returns {"errors": [{"message": "PersistedQueryNotFound"}]}, which blocks ad-hoc queries from reaching the origin.
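The register-then-lookup flow can be sketched in a few lines. This is an illustrative sketch, not Shopify's implementation: `REGISTRY`, `register`, `handle_get`, and the example query are hypothetical, and the server returns a placeholder instead of executing the query.

```python
import hashlib

REGISTRY = {}  # SHA-256 hex digest -> query text, filled at build time


def register(query: str) -> str:
    """Build-time step: store the query and return its hash."""
    digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
    REGISTRY[digest] = query
    return digest


def handle_get(query_hash: str) -> dict:
    """Runtime step: accept only hashes registered at build time."""
    query = REGISTRY.get(query_hash)
    if query is None:
        return {"errors": [{"message": "PersistedQueryNotFound"}]}
    # A real server would execute the query here; the sketch echoes it back.
    return {"data": {"executed": query}}


product_hash = register("query Product($id: ID!) { product(id: $id) { title } }")
hit = handle_get(product_hash)
miss = handle_get(hashlib.sha256(b"{ adHoc }").hexdigest())
```

Because the hash is short and stable, it fits in a GET URL and can serve as a CDN cache key, which a POST body with arbitrary query text cannot.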

Netflix Apollo Federation

Netflix's API gateway runs Apollo Router, federating 80+ domain microservices (titles, recommendations, viewing history, billing) into a single unified graph. Each domain service owns its schema fragment and marks its entity types with the @key directive so the router can join those entities across service boundaries. DataLoader batches all lookupTitle(ids: [...]) calls made during a single request into one upstream call per service, cutting N+1 resolution from hundreds of database queries to one batched query per domain.

Anti-Patterns

Exposing Raw DB Schemas

Mapping GraphQL types directly to database tables, which leaks internal implementation details and prevents database refactoring.

Unconstrained Query Depth

Allowing clients to execute arbitrarily deep queries (e.g., user { friends { friends { friends } } }), which multiplies resolver work at every level and can quickly exhaust server memory and crash the process.
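A depth check can run on the raw query before execution. The sketch below is deliberately crude and makes its assumptions explicit: it counts brace nesting instead of walking a parsed AST, so input object literals or braces inside string arguments would inflate the result. `query_depth`, `is_allowed`, and the `MAX_DEPTH` value of 5 are hypothetical choices for illustration.

```python
MAX_DEPTH = 5  # hypothetical limit for this sketch


def query_depth(query: str) -> int:
    """Approximate selection depth by tracking brace nesting."""
    depth = deepest = 0
    for ch in query:
        if ch == "{":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == "}":
            depth -= 1
    return deepest


def is_allowed(query: str, max_depth: int = MAX_DEPTH) -> bool:
    """Reject the query before any resolver runs if it nests too deeply."""
    return query_depth(query) <= max_depth


shallow = "{ user { name } }"
deep = "{ user { friends { friends { friends { friends { id } } } } } }"

shallow_ok = is_allowed(shallow)
deep_ok = is_allowed(deep)
```

A production gateway would perform the same check on the parsed document, where fragments and arguments can be handled correctly.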

Ignoring DataLoader

Writing nested GraphQL resolvers that perform individual database queries per item in a list, resulting in hundreds of database roundtrips for a single HTTP request.

Using GraphQL for Binary Data

Sending large file uploads or binary data through GraphQL mutations, which inflates payloads by roughly a third because of Base64 encoding. Use direct uploads via S3 pre-signed URLs instead.
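The Base64 overhead is easy to verify: every 3 bytes of input become 4 ASCII characters. The helper name `base64_size` and the 3 MB payload below are illustrative choices.

```python
import base64


def base64_size(n_bytes: int) -> int:
    """Length of the padded Base64 encoding of n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


payload = bytes(3_000_000)  # a 3 MB file of zero bytes
encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload)  # ratio of encoded size to raw size
```

The encoded payload is 4,000,000 characters, about 33% larger than the original, before any JSON escaping or request parsing cost is counted.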

Design Tradeoffs

| Dimension | GraphQL | REST |
| --- | --- | --- |
| Fetching efficiency | Clients specify exact fields; eliminates overfetching and multi-roundtrip underfetching in a single request | Fixed payloads force overfetching or multiple sequential round trips to assemble composite views |
| HTTP caching | POST-based queries bypass CDN cache by default; persisted queries with GET enable edge caching but require build-time registration | Native HTTP caching via Cache-Control and ETag; CDN caches full responses without any extra infrastructure |
| Server complexity | Query parsing, cost analysis, DataLoader batching, and schema federation required at the gateway layer | Predictable execution per endpoint; standard routing with no query planning or batching overhead |

Best Practices

Implement DataLoader

Always use DataLoader (or an equivalent batching library) in your resolvers to batch and cache database reads within a single request lifecycle.

Enforce Query Depth and Cost Limits

Set a maximum query depth (e.g., max 5 levels) and calculate query cost dynamically, rejecting expensive queries with an HTTP 400 before execution.

Use Persisted Queries

Have clients register their GraphQL queries at build time so they can send a SHA-256 hash at runtime. This enables CDN caching of GET requests and blocks ad-hoc queries, including introspection, in production.

Version via Schema Evolution

Avoid versioning URLs (e.g., /v1 vs /v2). Instead, evolve the GraphQL schema by adding new fields incrementally and deprecating old fields gradually.

Monitor Resolver Latency

Track the execution time of individual field resolvers to identify which database queries or downstream services cause tail latency.
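Per-resolver timing can be added with a small decorator. This is a minimal sketch that records durations in memory; `timed`, `RESOLVER_TIMINGS`, and `resolve_orders` are hypothetical names, and a real deployment would export the measurements to a metrics or tracing system instead.

```python
import time
from collections import defaultdict
from functools import wraps

RESOLVER_TIMINGS = defaultdict(list)  # "Type.field" -> durations in seconds


def timed(field_name):
    """Record how long each call to the wrapped resolver takes."""
    def decorator(resolver):
        @wraps(resolver)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return resolver(*args, **kwargs)
            finally:
                # Recorded even when the resolver raises.
                RESOLVER_TIMINGS[field_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("User.orders")
def resolve_orders(user):
    time.sleep(0.005)  # stands in for a database call
    return [f"order-{user['id']}-{n}" for n in range(2)]


orders = resolve_orders({"id": 7})
slowest = max(RESOLVER_TIMINGS["User.orders"])
```

Keying the measurements by type and field name makes it possible to see which field, rather than which HTTP request, is responsible for slow responses.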

When to Use / Avoid

| Use When | Avoid When |
| --- | --- |
| You have highly diverse clients (web, iOS, Android, IoT) requiring different data shapes from the same backend. | You are building simple CRUD APIs with predictable, uniform data access patterns. |
| You are building a developer platform or public API where clients need to self-serve complex data relationships. | Your system has ultra-low latency requirements and cannot tolerate the overhead of query parsing and validation. |
| You have a federated microservices architecture where a single gateway needs to stitch multiple domain graphs together. | Your team lacks the operational capacity to monitor resolver performance, implement batching, and secure GraphQL endpoints. |