Blob and Object Storage Architecture
- Decouple application servers from heavy file transfers by utilizing presigned URLs for direct client uploads.
- Achieve high-throughput parallel transfers for large files using multipart upload protocols.
- Optimize storage costs automatically by configuring lifecycle rules to transition data to colder tiers.
- Mitigate read-after-write consistency issues by designing applications to handle eventual consistency in secondary indexes.
The Problem
Storing large binary files (images, videos, backups) directly inside relational databases degrades transaction performance, bloats backups, and exhausts expensive SSD storage.
Conversely, storing files on local application server disks prevents horizontal scaling, as files are trapped on a single machine.
Furthermore, proxying large file uploads through application servers chokes network bandwidth, increases memory usage, and blocks execution threads, leading to slow response times and frequent timeouts.
Core System Idea
Object storage architectures decouple metadata (stored in high-performance databases) from the physical binary data (stored in a distributed flat namespace). Files are treated as immutable "objects" identified by unique keys within a "bucket."
To scale uploads and downloads without overloading application servers, the system uses "Presigned URLs." The application server authenticates the user, validates the request, and generates a time-limited, cryptographically signed URL. The client then uploads or downloads the binary file directly to/from the object storage service using this URL.
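The signing step can be illustrated with a minimal sketch built only on the standard library. It is not any provider's real signing algorithm (S3-compatible services use the more involved Signature Version 4). The secret, the host name, and the `presign` and `verify` helpers are all hypothetical names introduced for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical values: a secret shared by the signer and the storage service,
# and a made-up storage endpoint.
SECRET_KEY = b"demo-secret"
STORAGE_HOST = "https://storage.example.com"


def _signature(method: str, path: str, expires: int) -> str:
    # Bind the signature to the HTTP method, the object path, and the expiry time.
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def presign(method: str, bucket: str, key: str, ttl_seconds: int,
            now: Optional[float] = None) -> str:
    """Application server side: return a time-limited signed URL."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + ttl_seconds)
    path = f"/{bucket}/{key}"
    query = urlencode({"expires": expires,
                       "signature": _signature(method, path, expires)})
    return f"{STORAGE_HOST}{path}?{query}"


def verify(method: str, url: str, now: Optional[float] = None) -> bool:
    """Storage service side: accept only unexpired, untampered URLs."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        supplied = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(method, parsed.path, expires)
    return hmac.compare_digest(expected, supplied)
```

Because the method and path are part of the signed message, a URL issued for a `PUT` to one key cannot be reused for a `GET` or for a different key, and the storage side needs no database lookup to check it.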
For large files, the architecture utilizes "Multipart Uploads." The file is split into independent chunks, uploaded in parallel, and reassembled by the object storage engine.
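The split, parallel upload, and reassembly steps can be sketched as follows. `InMemoryObjectStore` is a hypothetical stand-in for a real object storage service; its method names loosely mirror the create, upload-part, and complete calls of S3-style APIs but are assumptions for illustration, not a real client library.

```python
from concurrent.futures import ThreadPoolExecutor


class InMemoryObjectStore:
    """Hypothetical in-memory stand-in for an object storage service."""

    def __init__(self):
        self._pending = {}   # upload_id -> {part_number: bytes}
        self.objects = {}    # key -> assembled bytes
        self._next_id = 0

    def create_multipart_upload(self, key):
        self._next_id += 1
        upload_id = f"upload-{self._next_id}"
        self._pending[upload_id] = {}
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        # Parts are independent, so they may arrive in any order.
        self._pending[upload_id][part_number] = data
        return part_number

    def complete_multipart_upload(self, key, upload_id, part_numbers):
        # The storage side reassembles the object in part-number order.
        parts = self._pending.pop(upload_id)
        self.objects[key] = b"".join(parts[n] for n in sorted(part_numbers))


def multipart_upload(store, key, payload, part_size, max_workers=4):
    """Split payload into parts, upload them in parallel, then complete."""
    upload_id = store.create_multipart_upload(key)
    chunks = [payload[i:i + part_size]
              for i in range(0, len(payload), part_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(store.upload_part, upload_id, number, chunk)
                   for number, chunk in enumerate(chunks, start=1)]
        part_numbers = [future.result() for future in futures]
    store.complete_multipart_upload(key, upload_id, part_numbers)
    return len(chunks)
```

A failed part can be retried on its own without resending the parts that already succeeded, which is the main practical benefit beyond raw throughput.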
To manage costs, the system relies on "Storage Class Tiering," automatically moving objects from high-performance SSDs (Hot) to cheaper HDDs (Warm/Cool) and tape-like archival storage (Cold/Glacier) based on access frequency and age.
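As a rough illustration of an age-based tiering decision, the sketch below maps the number of days since an object was last accessed to a storage class. The 30-day and 90-day thresholds and the `TieringPolicy` and `pick_storage_class` names are assumptions chosen for the example; real services express this through lifecycle configuration rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TieringPolicy:
    # Hypothetical thresholds; tune them to actual access patterns and pricing.
    warm_after_days: int = 30
    cold_after_days: int = 90


def pick_storage_class(days_since_last_access: int,
                       policy: TieringPolicy = TieringPolicy()) -> str:
    """Return the tier an object should live in under the given policy."""
    if days_since_last_access >= policy.cold_after_days:
        return "COLD"
    if days_since_last_access >= policy.warm_after_days:
        return "WARM"
    return "HOT"
```

Colder tiers usually trade lower storage price for higher retrieval cost and latency, so thresholds that are too aggressive can cost more than they save.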
System Flow
Direct-to-client upload flow using presigned URLs to bypass application server network bottlenecks.
Real-World Examples (Indicative)
Dropbox built Magic Pocket, rolled out by 2016, to move the bulk of its data off S3; the company later reported roughly $75M in infrastructure cost savings over the following two years. Magic Pocket splits files into 4MB content-addressed blocks keyed by their SHA-256 hash. Identical blocks are common across Dropbox's 500B+ file corpus (duplicate documents, shared spreadsheets, identical profile photos), so they are deduplicated at the block level: the same physical block is stored once no matter how many users hold that file. The block-level deduplication rate is ~30%, meaning only ~70% of blocks need unique physical storage. Client uploads use short-lived internal upload tokens that play the role of presigned URLs: the API server validates metadata and returns a token, and the client streams blocks directly to Magic Pocket storage nodes without passing the data through application servers.
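Block-level deduplication with content addressing can be sketched in a few lines. This is a generic illustration of the technique, not Dropbox's implementation; the `BlockStore` class and its methods are hypothetical names.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4MB blocks, as in the example above


class BlockStore:
    """Hypothetical content-addressed store: each unique block is kept once."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def put_file(self, data: bytes) -> list:
        """Store a file and return its manifest (ordered list of block hashes)."""
        manifest = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # setdefault writes the block only if this hash is not stored yet.
            self.blocks.setdefault(digest, block)
            manifest.append(digest)
        return manifest

    def get_file(self, manifest) -> bytes:
        """Rebuild a file from its manifest."""
        return b"".join(self.blocks[digest] for digest in manifest)
```

Two files that share blocks share storage: only the per-file manifest is duplicated, and a file's identity becomes its ordered list of block hashes.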
Cloudflare R2 stores 10B+ objects for customers migrating from S3 to avoid $0.09/GB egress charges. R2 supports multipart uploads with a 5MB minimum part size and up to 10,000 parts per object. R2's event notifications publish to a Cloudflare Queue within 250ms of object creation, and consumer Workers reading from that queue trigger transcoding pipelines. Presigned URL TTL is configurable from 1 second to 7 days: Figma uses 60-second TTLs for sensitive exported design assets, while public CDN-served assets use 24-hour TTLs to maximize edge cache hit rates.
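The two multipart limits quoted above (minimum part size and maximum part count) together constrain how a client picks its part size. A minimal sketch, assuming the 5MB minimum means 5 MiB and using hypothetical helper names:

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # assumed to be 5 MiB
MAX_PARTS = 10_000


def _ceil_div(numerator: int, denominator: int) -> int:
    # Integer ceiling division, avoiding floating-point rounding on large sizes.
    return -(-numerator // denominator)


def choose_part_size(object_size: int, preferred: int = MIN_PART_SIZE) -> int:
    """Smallest part size that respects both the minimum size and the part limit."""
    if object_size < 0:
        raise ValueError("object_size must be non-negative")
    needed_for_part_limit = _ceil_div(object_size, MAX_PARTS)
    return max(MIN_PART_SIZE, preferred, needed_for_part_limit)


def part_count(object_size: int, part_size: int) -> int:
    """Number of parts the upload will use (at least one)."""
    return max(1, _ceil_div(object_size, part_size))
```

With the minimum part size, 10,000 parts cover only about 50GB, so larger objects force a larger part size. The final part is typically allowed to be smaller than the minimum.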
Figma stores design files as delta-compressed S3 objects. Large enterprise exports (up to 2GB) use S3 multipart upload: the backend issues CreateMultipartUpload, splits the file into 50MB parts (about 40 parts for a 2GB file), and uploads them over 5 concurrent HTTP connections. This cuts a 2GB upload from ~4 minutes sequentially to ~50 seconds, roughly the 5x speedup expected from 5 parallel connections. A bucket lifecycle rule automatically aborts incomplete multipart uploads after 7 days, preventing storage charges for orphaned parts left behind by clients that disconnect mid-upload.
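The timing figures above can be checked with a back-of-the-envelope model. It assumes decimal units (2GB = 2000MB), that every connection sustains the same throughput as the single sequential connection, and that parts go out in full waves of 5. These are simplifying assumptions, and the function name is hypothetical.

```python
import math


def estimate_upload_seconds(file_mb, part_mb, connections, per_connection_mb_s):
    """Return (parts, waves, seconds) for a wave-by-wave parallel upload."""
    parts = math.ceil(file_mb / part_mb)
    waves = math.ceil(parts / connections)
    seconds = waves * (part_mb / per_connection_mb_s)
    return parts, waves, seconds


file_mb = 2000                       # 2GB in decimal megabytes
sequential_seconds = 4 * 60          # ~4 minutes over one connection
per_connection_mb_s = file_mb / sequential_seconds

parts, waves, parallel_seconds = estimate_upload_seconds(
    file_mb, part_mb=50, connections=5, per_connection_mb_s=per_connection_mb_s)
```

Under these assumptions the model gives 40 parts in 8 waves and about 48 seconds, in line with the ~50 seconds quoted. If the bottleneck is the client's total link capacity rather than per-connection throughput, the speedup will be smaller.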
Anti-Patterns
Reading file streams into application memory before writing them to object storage wastes CPU, RAM, and network bandwidth.
Attempting to perform frequent append operations or file locks on object storage is highly inefficient, as objects are immutable and must be completely rewritten on every update.
Failing to restrict bucket permissions or relying on security-by-obscurity for object URLs leads to catastrophic data leaks.
Failing to configure lifecycle rules to clean up incomplete multipart uploads results in hidden storage charges for orphaned file fragments.
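A cleanup rule for incomplete multipart uploads might look like the following, written as a Python dictionary in the shape of an S3 lifecycle configuration. The rule ID, the 7-day window, and the bucket name in the comment are assumptions for illustration.

```python
# Lifecycle configuration in the shape accepted by S3-compatible APIs.
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "abort-stale-multipart-uploads",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},                # empty prefix: whole bucket
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

# With boto3 installed and credentials configured, this could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket",
#       LifecycleConfiguration=LIFECYCLE_CONFIGURATION,
#   )
```

Parts of an upload that was never completed or aborted do not appear in normal object listings, which is why these charges tend to go unnoticed until a rule like this is in place.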
Design Tradeoffs
| Dimension | Direct Upload (Presigned URLs) | Server-Proxied Upload |
|---|---|---|
| Application server load | Minimal load; the server only authenticates the request and signs a URL, while clients stream binary data directly to object storage, bypassing application server memory and bandwidth | High load; servers must receive the full file stream and forward it (buffering it in memory in naive implementations), consuming RAM and network I/O per upload |
| Pre-storage validation | Hard; virus scanning, image resizing, or format validation must happen asynchronously after upload via event triggers | Easy; application servers can inspect, transform, or reject files synchronously before writing them to storage |
| Client implementation complexity | Requires CORS configuration, presigned URL generation logic, and client-side multipart retry handling | Simple; standard multipart form submission with no client-side URL management or CORS concerns |
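
The asynchronous validation mentioned in the table for direct uploads can be sketched as an event handler that runs after the object lands in storage. The event shape, the `pending`/`ready`/`rejected` statuses, and the dictionaries standing in for storage and the metadata database are all hypothetical. Checking magic bytes is only a basic format check, not a substitute for virus scanning.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_image(head: bytes) -> bool:
    """Cheap format check on the first bytes of the object."""
    return head.startswith(PNG_MAGIC) or head.startswith(JPEG_MAGIC)


def handle_object_created(event: dict, storage: dict, metadata: dict) -> str:
    """Validate a freshly uploaded object and record the outcome.

    storage maps object key -> bytes; metadata maps object key -> status.
    Objects stay "pending" (hidden from users) until this handler runs.
    """
    key = event["key"]
    head = storage[key][:16]          # read only the first few bytes
    if looks_like_image(head):
        metadata[key] = "ready"
    else:
        del storage[key]              # remove the invalid upload
        metadata[key] = "rejected"
    return metadata[key]
```

The application serves only objects marked `ready`, so an invalid file that reached storage directly is never exposed even though it could not be rejected at upload time.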
Best Practices
When to Use / Avoid
| Use When | Avoid When |
|---|---|
| Storing immutable files larger than a few kilobytes (e.g., media assets, documents, database backups). | Storing highly dynamic data that requires frequent in-place updates or appends. |
| Building globally accessible, highly durable data lakes for analytical processing. | Low-latency file system operations (e.g., running database storage engines or code execution). |
| Distributing static web assets (HTML, JS, CSS, images) at scale via CDNs. | Storing highly sensitive, transient data that must never leave local memory. |