Microservices vs Monolith

Monoliths minimize operational complexity and network latency by keeping the entire application in a single process. Microservices solve organizational scaling problems at the cost of distributed transactions, network overhead, and an operational burden that can be fatal to small teams.

TL;DR
  • Monoliths minimize operational complexity by keeping the entire application within a single process with ACID transactions and in-process function calls. Microservices solve organizational scaling problems — independent deployment, team autonomy — not technical ones.
  • Shopify ran a Rails monolith for 15+ years at 80K+ requests/second on Black Friday. Their secret: Component-Based Rails Applications (CBRA) — enforced module boundaries within a single codebase without microservices overhead.
  • Segment migrated 140+ microservices back to a single Go service, cutting hosting costs by 60% and reducing P99 pipeline latency from 3+ seconds to under 100ms. Microservices that only pass data sequentially with no independent scaling requirements are better as one service.
  • Never decompose before the domain boundaries are fully understood. Refactoring a wrong service boundary means changing an API contract across a network — far more expensive than changing a function signature within a monolith.
  • Use the Strangler Fig pattern for incremental migration: an edge router intercepts all traffic, progressively redirecting specific paths to new services while the monolith handles the rest. Never rewrite from scratch.

The Problem

A five-engineer startup builds a monolith. Two years later, 40 engineers are working in the same codebase, and a change to the inventory module breaks the payment module because the two share a database transaction boundary nobody documented. Deployments require coordinating all four teams simultaneously because a failed migration in one module rolls back everyone's changes. Leadership decides to adopt microservices. Eighteen months later, the engineering team has 30 services, a custom service discovery system, 15 Kafka topics nobody fully understands, and P99 latency that has risen from 50ms to 800ms because rendering a page requires 8 sequential service calls. Both paths failed: the monolith for organizational reasons, the microservices for architectural ones.

Core System Idea

The monolith-vs-microservices decision is primarily an organizational scaling decision, not a technical one. A monolith consolidates all execution within a single process: function calls take microseconds, database transactions are ACID, debugging means reading a single log stream, and CI/CD is one pipeline. The cost: every team deploys together, and resource-heavy components (ML inference, image processing) force the entire application to scale even when 95% of the codebase is idle.

Microservices decompose the system into autonomous services with independent databases and deployment pipelines. The benefits are organizational: teams deploy without coordination, services scale independently, and a failure in the recommendations service cannot crash checkout. The costs are severe: every cross-service call is a network hop (0.5–5ms each), transactions that span services require the Saga pattern with compensating actions instead of ROLLBACK, and debugging requires distributed tracing.

The Strangler Fig pattern bridges the two: place a routing layer in front of the monolith, then progressively redirect specific domain paths to new services. Route /orders/** to the new Order Service while everything else continues hitting the monolith. Each carved-out service owns its own database, because sharing a database across services recreates the monolith's coupling at the schema level.
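To make "compensating actions instead of ROLLBACK" concrete, here is a minimal in-process sketch of a saga coordinator. The `Saga` class, the step names, and the checkout demo are all hypothetical and invented for illustration; a real saga would call remote services and persist its progress so it can resume after a crash.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


class SagaFailed(Exception):
    """Raised after a step fails and the earlier steps have been compensated."""


@dataclass
class Saga:
    # Each step is (name, action, compensation). All names are hypothetical.
    steps: List[Tuple[str, Callable[[], None], Callable[[], None]]] = field(default_factory=list)
    log: List[str] = field(default_factory=list)

    def add(self, name: str, action: Callable[[], None], compensate: Callable[[], None]) -> "Saga":
        self.steps.append((name, action, compensate))
        return self

    def run(self) -> None:
        completed: List[Tuple[str, Callable[[], None]]] = []
        for name, action, compensate in self.steps:
            try:
                action()
            except Exception as exc:
                # There is no cross-service ROLLBACK: undo the steps that
                # already succeeded, newest first, then report the failure.
                for done_name, undo in reversed(completed):
                    undo()
                    self.log.append(f"undone:{done_name}")
                raise SagaFailed(f"step {name!r} failed: {exc}") from exc
            self.log.append(f"done:{name}")
            completed.append((name, compensate))


def run_checkout_demo():
    """Simulate a checkout where the third step fails."""
    state = {"stock": 5, "charged": 0}

    def reserve():
        state["stock"] -= 1

    def release():
        state["stock"] += 1

    def charge():
        state["charged"] += 100

    def refund():
        state["charged"] -= 100

    def ship():
        raise RuntimeError("carrier unavailable")

    saga = Saga()
    saga.add("reserve_inventory", reserve, release)
    saga.add("charge_payment", charge, refund)
    saga.add("create_shipment", ship, lambda: None)
    try:
        saga.run()
    except SagaFailed:
        pass
    return state, saga.log
```

In the demo, the shipment step fails, so the payment is refunded and the inventory released in reverse order, leaving `state` back at its starting values. Between the failure and the last compensation, other readers can observe the intermediate state, which is the eventual-consistency cost that a single ACID transaction avoids.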

System Flow

flowchart TD
    A["Client Request"] --> B["Strangler Router"]
    B -- "Legacy Path" --> C["Monolithic Core"]
    B -- "Carved Path" --> D["API Gateway"]
    D --> E["Order Service"]
    D --> F["Payment Service"]
    C --> G[("Monolith DB")]
    E --> H[("Order DB")]
    F --> I[("Payment DB")]

The Strangler Fig pattern uses an edge router to incrementally divert specific domain traffic to isolated microservices while the monolith handles the remaining paths.
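The routing decision at the edge can be sketched as a prefix table with a default. This is an illustrative sketch only: `StranglerRouter`, its methods, and the upstream names are hypothetical, and in practice this logic usually lives in a reverse proxy or API gateway configuration rather than in application code.

```python
from typing import Dict


class StranglerRouter:
    """Maps request paths to an upstream; unmatched paths go to the monolith."""

    def __init__(self, default_upstream: str) -> None:
        self.default_upstream = default_upstream
        self.carved: Dict[str, str] = {}

    def carve(self, prefix: str, upstream: str) -> None:
        """Redirect everything under `prefix` to a new service."""
        normalized = prefix.rstrip("/")
        if not normalized:
            raise ValueError("carve one domain path at a time, not the root")
        self.carved[normalized] = upstream

    def resolve(self, path: str) -> str:
        best = ""
        for prefix in self.carved:
            # Match on whole path segments so /orders does not capture /ordersummary.
            matches = path == prefix or path.startswith(prefix + "/")
            if matches and len(prefix) > len(best):
                best = prefix  # the longest matching prefix wins
        return self.carved[best] if best else self.default_upstream


# Hypothetical upstream names, for illustration only.
router = StranglerRouter(default_upstream="monolith")
router.carve("/orders", "order-service")
```

With this table, `/orders/42` resolves to `order-service`, while `/cart` and `/ordersummary` still resolve to `monolith`. Carving the next domain is one more `carve` call, which keeps each migration step small and reversible.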

Real-World Examples

Shopify's modular monolith at 80K RPS

Shopify operated a Ruby on Rails monolith for over 15 years, processing Black Friday at 80K+ requests/second without decomposing into microservices. Their technique: Component-Based Rails Applications (CBRA) — the monolith is partitioned into namespaced Ruby gems with enforced component boundaries. A CI tool rejects any pull request that introduces a direct Ruby require across component boundaries; components must interact through public API modules. This enforces the isolation benefits of microservices (independent domain ownership, restricted coupling) within a single deployable process. Shopify eventually migrated select high-volume subsystems (their product search, served by Elasticsearch) but kept the core commerce logic in the modular monolith through their peak growth years.
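The idea of a CI check that rejects cross-component references can be sketched in a few lines. The following is a Python analogue written for illustration, not Shopify's Ruby tooling: the `api` public-module convention, the function name, and the component names are all assumptions.

```python
import ast
from typing import List, Set

# Hypothetical convention: other components may import only <component>.api.
PUBLIC_MODULE = "api"


def boundary_violations(source: str, owner: str, components: Set[str]) -> List[str]:
    """Return imports in `source` that reach into another component's internals."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Treat "from payments.models import Invoice" as payments.models.Invoice.
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            parts = name.split(".")
            target = parts[0]
            if target not in components or target == owner:
                continue  # third-party, stdlib, or the component's own code
            if len(parts) < 2 or parts[1] != PUBLIC_MODULE:
                violations.append(name)
    return violations


example_source = (
    "import os\n"
    "import inventory.api\n"
    "from payments.api import charge\n"
    "from payments.models import Invoice\n"
)
found = boundary_violations(example_source, owner="inventory", components={"inventory", "payments"})
```

Here `found` contains only `payments.models.Invoice`: the import through `payments.api` is allowed, and the private model import is flagged. A CI job would run a check like this over each changed file and fail the build when the list is non-empty.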

Segment's migration from 140 microservices back to a monolith

Segment built a microservices pipeline after 2015: 140 services routing customer event data from sources to destinations (Salesforce, Mixpanel, etc.). By 2017: 750+ servers, dozens of Kafka topics, and tracing any data delivery bug required following events across 6+ services with no single dashboard. They consolidated the event-processing pipeline into a single Go service called Centrifuge — 140 microservices became 1 service. Results: hosting costs dropped 60%, P99 pipeline latency fell from 3+ seconds to under 100ms, and debugging reverted to reading a single service's logs. The lesson: microservices whose primary function is forwarding data sequentially with no independent scaling need are a distributed monolith with extra network hops.

Netflix's decomposition from DVD monolith

Netflix began decomposing their monolith in 2009 after a three-day database corruption incident caused by a botched schema migration on their shared Oracle database. By 2015: 500+ microservices, 1,000+ deploys/day, each team owning their service end-to-end. The decomposition criteria that drove each split: (1) independent scaling profile — streaming (read-heavy, bandwidth-bound) vs. recommendations (compute-heavy, CPU-bound) vs. billing (low-volume, consistency-critical) required fundamentally different hardware; (2) team autonomy — 2,000 engineers deploying to one codebase was the organizational bottleneck; (3) fault isolation — a single database failure had taken down all capabilities simultaneously. Netflix's approach validates the rule: microservices solve organizational problems at scale, not latency or complexity problems for small teams.

Anti-Patterns

The distributed monolith

Decomposing services but keeping them tightly coupled via synchronous HTTP/gRPC call chains, where Service A calls Service B calls Service C, and a failure at C propagates to all callers. This produces the worst of both worlds: high network latency, cascading failures, and lockstep deployments because every service must be deployed in the correct order.

Shared database across microservices

Allowing multiple services to read and write the same database tables directly. This leaks implementation details across service boundaries, makes schema migrations require coordination across all services, and recreates the coupling the decomposition was supposed to eliminate.

Premature decomposition

Splitting services before domain boundaries are fully understood. Refactoring a wrong service boundary requires changing an API contract over a network — far more expensive than moving a function between modules in a monolith.

Nano-services

Breaking down to the level of individual functions or domain entities. An Order service that makes 5 synchronous calls to Item service, Pricing service, Inventory service, Tax service, and Shipping service adds 2.5–25ms of network overhead per checkout, plus 5× the failure surface area.
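The overhead figure above is simple arithmetic on the 0.5–5ms per-hop range, and the same multiplication applies to availability. The sketch below reproduces both; the 99.9% per-service availability is an assumed number chosen for illustration, not a figure from any real system.

```python
from typing import Tuple


def sequential_overhead_ms(calls: int, per_hop_ms: Tuple[float, float] = (0.5, 5.0)) -> Tuple[float, float]:
    """Best-case and worst-case added latency for `calls` sequential network hops."""
    low, high = per_hop_ms
    return calls * low, calls * high


def chain_availability(per_service: float, services: int) -> float:
    """Probability that every one of `services` independent dependencies responds."""
    return per_service ** services


checkout_overhead = sequential_overhead_ms(5)      # (2.5, 25.0)
render_overhead = sequential_overhead_ms(8)        # (4.0, 40.0)
checkout_success = chain_availability(0.999, 5)    # about 0.995
```

Five synchronous dependencies at an assumed 99.9% each leave the checkout path at roughly 99.5%, about five times the failure rate of a single dependency, assuming failures are independent.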

Design Tradeoffs

| Dimension | Monolith | Microservices |
| --- | --- | --- |
| Transaction model | ACID: single database, cross-entity transactions with ROLLBACK | Distributed: Saga pattern with compensating actions; eventual consistency |
| Operational overhead | Low: one deploy pipeline, one log stream, one service to debug | High: service discovery, distributed tracing, per-service CI/CD and on-call |
| Call latency | In-process function calls: microseconds | Network calls: 0.5–5ms per hop; 8-hop render = 4–40ms overhead |
| Scaling granularity | Must scale the entire application when one component is the bottleneck | Can scale compute-heavy services (ML inference) independently of low-volume ones (billing) |

Best Practices

  • Define service boundaries by transactional consistency requirements, not by current team structure or code volume. If two domain entities must be updated with ACID consistency, they belong in the same service and database. Crossing that boundary with a distributed transaction is almost always more expensive.
  • Use the Strangler Fig pattern for migration: route specific paths to new services while the monolith handles the rest. Never do a full rewrite; each carved service should reach production before the next is started.
  • Deploy distributed tracing (OpenTelemetry + Jaeger) before splitting the first service. Debugging a request that spans 5 services without trace IDs is nearly impossible under incident pressure.
  • Apply Conway's Law deliberately: the service architecture will eventually mirror your team structure. Design service boundaries to match the team boundaries you want, not the ones you have today.
  • Size services by the "two-pizza team" rule (Amazon's heuristic: 8–10 engineers per service) and by independent deployment frequency. If two services are always deployed together, they should be one service.
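The "always deployed together" test in the last practice can be checked against deploy history. The sketch below is one possible heuristic, assuming each service's history is available as a set of release IDs; the function names, the 0.9 threshold, and the sample data are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def co_deployment_ratio(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of the releases in which two services were deployed."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_candidates(deploys: Dict[str, Set[str]], threshold: float = 0.9) -> List[Tuple[str, str, float]]:
    """Pairs of services whose deploy histories overlap at or above `threshold`."""
    candidates = []
    for first, second in combinations(sorted(deploys), 2):
        ratio = co_deployment_ratio(deploys[first], deploys[second])
        if ratio >= threshold:
            candidates.append((first, second, ratio))
    return candidates


# Hypothetical deploy history: service name -> release IDs it shipped in.
history = {
    "orders": {"r1", "r2", "r3"},
    "pricing": {"r1", "r2", "r3"},
    "search": {"r4"},
}
pairs = merge_candidates(history)
```

With this sample history, `pairs` contains the single entry `("orders", "pricing", 1.0)`. A high ratio is a prompt to investigate rather than proof: the two services may share a contract that forces lockstep releases, which is the coupling worth removing or formalizing by merging them.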

When to Use / Avoid

| Use When | Avoid When |
| --- | --- |
| Multiple teams need to deploy independently without coordination (the primary organizational forcing function) | Engineering team is under 15–20 engineers where communication overhead is low and coordination is cheap |
| Components have radically different resource profiles that require independent scaling (CPU-heavy ML vs. I/O-heavy API) | Domain model is highly fluid; service boundaries drawn before the domain is understood will be wrong and expensive to fix |
| Fault isolation is critical: a failure in recommendations must not be able to crash checkout | Strict in-process performance is required; 6+ network hops add latency that in-process calls don't |