Zero Trust Architecture
Zero Trust replaces perimeter-based security with a "never trust, always verify" model for every request.
- Mutual TLS (mTLS) secures inter-service communication, verifying both client and server identities.
- Policy Enforcement Points (PEPs) must be decoupled from application code to ensure consistent authorization checks.
- Automated certificate rotation and dynamic identity management are critical to managing the operational overhead of Zero Trust.
The Problem
Traditional security relies on a "castle-and-moat" model: once a request passes the external firewall, everything inside the internal network is trusted. If an attacker breaches the perimeter—via a compromised dependency, a phishing attack, or an SSRF vulnerability—they gain unrestricted access to the entire internal network. They can move laterally, query database instances, and call internal APIs completely unchecked, resulting in massive data breaches.
Core System Idea
Zero Trust Architecture operates on the assumption that the internal network is hostile. No user, device, or service is trusted by default, regardless of its physical or logical location. Every single request must be authenticated, authorized, and encrypted before access is granted.
At the service level, this is achieved using Mutual TLS (mTLS). Unlike standard TLS where only the client verifies the server, mTLS requires both the client and server to present cryptographically signed certificates to verify each other's identity.
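As a minimal sketch of that difference, the following Python uses the standard-library `ssl` module to build a server context that rejects any client without a valid certificate. The function names and the file-path parameters are hypothetical, and in practice a sidecar proxy would usually own this configuration rather than the application.

```python
import ssl


def mtls_server_context() -> ssl.SSLContext:
    """Server-side TLS context that also demands a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # A plain TLS server leaves this at CERT_NONE; CERT_REQUIRED makes it mutual.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def mtls_client_context() -> ssl.SSLContext:
    """Client-side context: verifies the server certificate and hostname."""
    # PROTOCOL_TLS_CLIENT enables CERT_REQUIRED and hostname checks by default
    # and trusts no CA until one is loaded explicitly.
    return ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


def load_identity(ctx: ssl.SSLContext, cert_file: str, key_file: str, ca_file: str) -> None:
    """Attach this workload's certificate and the internal CA used to verify peers."""
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    ctx.load_verify_locations(cafile=ca_file)
```

Both sides call `load_identity` with their own certificate and the shared internal CA; the only setting that separates mTLS from ordinary TLS here is `verify_mode` on the server.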
Authorization is handled by separating the Policy Decision Point (PDP) from the Policy Enforcement Point (PEP). The application or its sidecar proxy (PEP) intercepts the request, queries a centralized engine (PDP) to check if Service A is allowed to call Service B with specific parameters, and enforces the decision.
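The PDP/PEP split can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the class names, the rule format (caller, target, method, route), and the status codes. A production PDP would typically be a separate service or policy engine rather than a Python set.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

Rule = Tuple[str, str, str, str]  # (caller, target, method, route)


@dataclass(frozen=True)
class Request:
    caller: str  # identity taken from the verified client certificate
    target: str
    method: str
    route: str   # matched route template, e.g. "/users/:id"


class PolicyDecisionPoint:
    """Central engine: holds the rules and answers allow or deny."""

    def __init__(self, rules: Set[Rule]) -> None:
        self._rules = rules

    def is_allowed(self, req: Request) -> bool:
        # Default deny: anything not explicitly listed is rejected.
        return (req.caller, req.target, req.method, req.route) in self._rules


class PolicyEnforcementPoint:
    """Sidecar-style interceptor: asks the PDP, then forwards or blocks."""

    def __init__(self, pdp: PolicyDecisionPoint, upstream: Callable[[Request], str]) -> None:
        self._pdp = pdp
        self._upstream = upstream

    def handle(self, req: Request) -> Tuple[int, str]:
        if not self._pdp.is_allowed(req):
            return 403, "denied by policy"
        return 200, self._upstream(req)
```

Because the application handler (`upstream`) never sees a denied request, the authorization rules can change in the PDP without touching application code.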
System Flow
Sidecar proxies intercept inter-service traffic, establish an encrypted mTLS tunnel, and check each request against a Policy Decision Point before forwarding it.
Real-World Examples (Indicative)
Every Google inter-service RPC carries a LOAS (Low Overhead Authentication System) token, a cryptographically signed credential with a 10-minute lifetime that represents the calling service's workload identity. The rpc-security sidecar validates the token against a policy table before forwarding the call. Even traffic on Google's internal fiber backbone requires mTLS, and chain-of-custody headers in each RPC capture privilege-escalation paths for audit. Google's BeyondProd model eliminated the concept of a "trusted internal zone" entirely: no service is granted access by network location alone.
In 2019, Cloudflare replaced its corporate VPN for 1,500+ employees using its own Access product. Every request to an internal application is checked against Okta SSO identity, Cloudflare Gateway device posture (OS version, disk encryption status), and geolocation. Authenticated requests carry a signed CF-Access-JWT-Assertion header that internal apps verify locally—no VPN tunnel is required. Remote access is ~25% faster than the previous VPN due to Anycast routing to the nearest Cloudflare PoP rather than backhauling traffic to a central VPN server.
HashiCorp Vault's database secrets engine issues PostgreSQL credentials with a 1-hour TTL. When the lease expires, Vault revokes the credentials at the database level by running revocation statements such as REVOKE. The PKI secrets engine issues TLS certificates with 24-hour TTLs, rotated automatically by the Vault Agent sidecar. Applications never store long-lived credentials: the sidecar injects short-lived secrets into environment variables at startup and handles renewal transparently, with revocation propagating to all consumers within seconds.
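The lease lifecycle described above can be modeled in a few lines. This is an in-memory sketch only, not Vault's API: `Lease`, `CredentialIssuer`, and the injected `clock` parameter are hypothetical names, and a real secrets engine would also create and drop the credentials in the database itself.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Lease:
    username: str
    password: str
    expires_at: float  # seconds since the epoch

    def is_valid(self, now: float) -> bool:
        # Once the TTL has elapsed the credential must no longer be used.
        return now < self.expires_at


class CredentialIssuer:
    """Hands out unique, short-lived credentials instead of one shared secret."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def issue(self, role: str) -> Lease:
        return Lease(
            username=f"{role}-{secrets.token_hex(4)}",
            password=secrets.token_urlsafe(24),
            expires_at=self._clock() + self._ttl,
        )
```

Each call to `issue` yields a fresh username and password, so a leaked credential identifies a single consumer and stops working when its lease runs out.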
Anti-Patterns
Restricting access to internal services based on IP addresses or CIDR blocks, which are easily spoofed and highly fragile in dynamic, auto-scaling cloud environments.
Attempting to manage mTLS certificates manually or setting long expiration times (e.g., 1 year), which makes timely revocation impractical and widens the window in which a leaked credential remains usable.
Storing API keys, database passwords, or private keys in source code or configuration files instead of using dynamic, short-lived credentials from a secrets engine.
Securing the external API Gateway but leaving internal databases and message queues completely unauthenticated and unencrypted.
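For the certificate anti-pattern above, the usual alternative is a rotation check that runs automatically. The sketch below renews a certificate once a fixed fraction of its lifetime has elapsed; the function name and the two-thirds threshold are assumptions for illustration, not a standard.

```python
from datetime import datetime, timedelta, timezone


def should_rotate(not_before: datetime, not_after: datetime, now: datetime,
                  renew_fraction: float = 2 / 3) -> bool:
    """Return True once `renew_fraction` of the certificate lifetime has passed."""
    lifetime = not_after - not_before
    renew_at = not_before + lifetime * renew_fraction
    return now >= renew_at


# Example: a 24-hour certificate becomes due for renewal after about 16 hours.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(hours=24)
due = should_rotate(issued, expires, issued + timedelta(hours=17))
```

Renewing well before expiry leaves time to retry if the CA is briefly unavailable, which is what makes short TTLs operationally tolerable.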
Design Tradeoffs
| Dimension | Zero Trust (mTLS + Policy Engine) | Perimeter Security (VPN + Firewall) |
|---|---|---|
| Blast radius | Compromised service is isolated; workload-scoped mTLS certificates prevent lateral movement to other services | Single perimeter breach exposes the entire internal network; attackers move freely between services |
| Latency overhead | 1-5 ms added when a connection is established, for the mTLS handshake and initial policy evaluation; ~0.5 ms per RPC afterwards, since the tunnel is reused and policy results are cached | Near-zero internal latency; trusted packets are routed directly without per-request authentication checks |
| Operational cost | Requires automated CA (SPIRE), policy engine (OPA), and certificate rotation pipelines across every service | Simple firewall rules and a VPN gateway; standard networking tooling suffices |
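
The latency row can be turned into a rough estimate. The sketch assumes the 1-5 ms cost is paid once per connection and the ~0.5 ms cost on every RPC over that connection; the function name and this cost model are illustrative assumptions, not measurements.

```python
def amortized_overhead_ms(handshake_ms: float, per_rpc_ms: float,
                          rpcs_per_connection: int) -> float:
    """Average added latency per RPC when one handshake is shared by many RPCs."""
    if rpcs_per_connection < 1:
        raise ValueError("rpcs_per_connection must be at least 1")
    return per_rpc_ms + handshake_ms / rpcs_per_connection


# Worst-case handshake from the table (5 ms) with ~0.5 ms of ongoing cost:
single_call = amortized_overhead_ms(5.0, 0.5, 1)    # every RPC opens a new connection
pooled_call = amortized_overhead_ms(5.0, 0.5, 100)  # connection reused for 100 RPCs
```

Under these assumptions, connection reuse brings the average overhead close to the ongoing per-RPC cost, which is why connection pooling matters more than raw handshake speed.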
Best Practices
Grant least privilege: for example, allow Service A only GET on /users/:id of Service B, and nothing else.
When to Use / Avoid
| Use When | Avoid When |
|---|---|
| You operate in highly regulated industries (finance, healthcare) with strict compliance requirements. | You are a seed-stage startup focused on rapid prototyping and finding product-market fit. |
| You run a highly distributed microservices architecture across multiple cloud providers or hybrid environments. | You run a simple monolithic application inside a single, secure VPC with minimal external integrations. |
| You have a large, remote workforce accessing internal systems from various devices and networks. | Your application has ultra-low latency requirements (e.g., high-frequency trading) where cryptographic overhead is unacceptable. |