Blob Storage
TL;DR
Blob storage: Store unstructured data (files, images, videos). S3: Amazon's object storage (99.999999999% durability). CDN: Cache blobs close to users.
Core Concepts
1. Object Storage vs Block Storage vs File Storage
| Type | Use Case | Example |
|---|---|---|
| Object | Files, images, backups | S3, Azure Blob, GCS |
| Block | Databases, VMs (low-level) | EBS, SAN |
| File | Shared file systems | EFS, NFS |
Object storage is usually the best fit for web apps: it scales horizontally, and objects are read and written over HTTP.
2. S3 Architecture
Key features:
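- Durability: 99.999999999% (11 nines), so losing an object is extremely unlikely (on average about one object per 10,000 years for every 10 million stored)
- Availability: 99.99% (design target for S3 Standard), the fraction of time the data can actually be accessed
- Scalability: No limit on total storage
- Versioning: Keep old versions of files
- Lifecycle policies: Automatically move old files to cheaper storage classes or delete them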
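As a rough illustration of the versioning and lifecycle features above, the sketch below builds a lifecycle rule as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The `uploads/` prefix, the day counts, and the helper names `build_lifecycle_config` and `apply_lifecycle` are assumptions chosen for the example, not values from these notes.

```python
def build_lifecycle_config(prefix="uploads/", ia_after_days=30,
                           expire_after_days=365, noncurrent_days=90):
    """Return a lifecycle configuration dict for one prefix (illustrative values)."""
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Move objects to a cheaper storage class once they go cold.
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"}
                ],
                # Delete current versions after the retention period.
                "Expiration": {"Days": expire_after_days},
                # With versioning on, also clean up old (noncurrent) versions.
                "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
            }
        ]
    }


def apply_lifecycle(bucket, config):
    """Send the configuration to S3. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = build_lifecycle_config()
print(config["Rules"][0]["ID"])  # expire-uploads
```

Keeping the rule as data makes it easy to review or test before anything is sent to a real bucket.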
3. Signed URLs
Problem: Making an S3 bucket public exposes every object in it (security risk)
Solution: Keep the bucket private and generate a temporary signed (presigned) URL for each object a client needs
import boto3

s3 = boto3.client('s3')

# Server generates a signed URL (valid for 1 hour)
signed_url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-bucket', 'Key': 'photo.jpg'},
    ExpiresIn=3600,  # lifetime in seconds
)

# Client downloads directly from S3 (no server bandwidth)
# https://my-bucket.s3.amazonaws.com/photo.jpg?signature=...
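
To show why a signed URL can be handed to a client safely, here is a deliberately simplified sketch of the idea: the server signs the object path and an expiry time with a secret key, and the storage side recomputes the signature before serving the object. This is not AWS's actual Signature Version 4 algorithm. The `SECRET` value, the host `storage.example.com`, the query parameter names, and the function names are all made up for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical key shared by signer and verifier


def _sign(path, expires):
    """HMAC-SHA256 over the path and the expiry timestamp."""
    msg = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def make_signed_url(path, ttl_seconds=3600, now=None):
    """Build a URL that carries its expiry time and a signature over path + expiry."""
    now = int(time.time()) if now is None else now
    expires = now + ttl_seconds
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return f"https://storage.example.com{path}?{query}"


def verify_signed_url(url, now=None):
    """Return True only if the signature matches and the URL has not expired."""
    now = int(time.time()) if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature, _sign(parsed.path, expires))


url = make_signed_url("/my-bucket/photo.jpg", ttl_seconds=3600, now=1_000)
print(verify_signed_url(url, now=2_000))   # True: still within the hour
print(verify_signed_url(url, now=10_000))  # False: the URL has expired
```

Because the path and the expiry are both covered by the signature, a client cannot extend the lifetime or swap in a different object key without invalidating the URL.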
4. Upload Strategies
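Small files (up to about 100MB): Direct upload to S3 in a single PUT request
Large files (greater than 100MB): Multipart upload, splitting the file into parts of at least 5MB each (the last part may be smaller)
Benefits: Resume failed uploads by resending only the missing parts, parallel uploads (faster)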
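The splitting behind a multipart upload can be sketched without touching S3 at all. The example below plans the byte ranges for each part and shows which parts would need resending after a failure. The 8 MiB default part size and the function names `plan_parts` and `parts_to_retry` are assumptions for illustration; the 5 MiB minimum part size is the S3 limit.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # S3 minimum for every part except the last


def plan_parts(file_size, part_size=8 * MIB):
    """Return (part_number, offset, length) tuples covering the whole file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    offset = 0
    part_number = 1  # S3 part numbers start at 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


def parts_to_retry(parts, uploaded_part_numbers):
    """After a failure, only the parts not yet confirmed need resending."""
    done = set(uploaded_part_numbers)
    return [p for p in parts if p[0] not in done]


parts = plan_parts(20 * MIB, part_size=8 * MIB)
print(len(parts))                      # 3 (8 MiB, 8 MiB, 4 MiB)
print(parts_to_retry(parts, [1, 2]))   # [(3, 16777216, 4194304)]
```

Each planned part is independent, which is what allows parts to be uploaded in parallel. In practice boto3's `upload_file` can do this splitting for you when given a `TransferConfig`, so a hand-rolled planner like this is mainly useful for understanding the mechanism.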
Common Interview Questions
Q1: "How would you design a photo sharing service (Instagram)?"
Answer:
- Upload: Client → Server → S3 (original photo)
- Process: Resize to multiple resolutions (thumbnail, medium, large)
- Store: Upload resized photos to S3
- Serve: CDN caches photos, serve from edge locations
- Database: Store metadata (user_id, photo_id, S3 keys)
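One possible shape for the metadata in that design is sketched below: the database row holds only identifiers and S3 keys, while the image bytes stay in S3. The key layout `photos/<user_id>/<photo_id>/...`, the variant names and widths, and the `PhotoRecord` class are assumptions for the example rather than a prescribed schema.

```python
from dataclasses import dataclass, field

# Hypothetical variant names and maximum widths in pixels.
VARIANTS = {"thumbnail": 150, "medium": 640, "large": 1080}


@dataclass
class PhotoRecord:
    """Metadata row kept in the database; the image bytes live in S3."""
    user_id: int
    photo_id: str
    original_key: str
    variant_keys: dict = field(default_factory=dict)


def build_photo_record(user_id, photo_id, extension="jpg"):
    """Derive the S3 keys for the original and every resized variant."""
    base = f"photos/{user_id}/{photo_id}"
    record = PhotoRecord(
        user_id=user_id,
        photo_id=photo_id,
        original_key=f"{base}/original.{extension}",
    )
    for name in VARIANTS:
        record.variant_keys[name] = f"{base}/{name}.{extension}"
    return record


record = build_photo_record(user_id=42, photo_id="a1b2c3")
print(record.original_key)                # photos/42/a1b2c3/original.jpg
print(record.variant_keys["thumbnail"])   # photos/42/a1b2c3/thumbnail.jpg
```

Deriving every key from `user_id` and `photo_id` means the resize step and the serving path can compute where a variant lives without an extra lookup.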
Q2: "How do you secure S3 buckets?"
Answer:
- Private by default: Block public access
- Signed URLs: Temporary access (for example, expiring after 1 hour)
- IAM policies: Control who can access
- Encryption: At rest (AES-256) and in transit (HTTPS)
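Two of these controls can be expressed as plain configuration, sketched below: the "block public access" settings and a bucket policy that rejects any request not made over HTTPS. The bucket name `my-bucket` and the helper names are assumptions. The `lock_down` function is only an outline of how the settings might be applied with boto3 and is not called here.

```python
import json


def public_access_block():
    """Settings that turn on all four 'block public access' switches."""
    return {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }


def https_only_policy(bucket):
    """Bucket policy that denies any request not made over HTTPS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


def lock_down(bucket):
    """Apply both settings. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above run without it

    s3 = boto3.client("s3")
    s3.put_public_access_block(
        Bucket=bucket, PublicAccessBlockConfiguration=public_access_block()
    )
    s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(https_only_policy(bucket)))


policy = https_only_policy("my-bucket")
print(json.dumps(policy, indent=2))
```

An explicit `Deny` takes precedence over any `Allow`, so the HTTPS requirement holds even if a broader IAM policy grants access.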
Q3: "How do you handle millions of files in S3?"
Answer:
- Key design: Use prefixes for parallelism
  - Bad: photo1.jpg, photo2.jpg (sequential)
  - Good: ab/cd/ef/photo1.jpg (random hash prefix)
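- Reason: S3 partitions by key prefix and request-rate limits apply per prefix, so spreading keys across many prefixes lets reads and writes scale in parallel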
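A hash-prefixed key like the one in the answer above could be generated roughly as follows. The choice of SHA-256, three two-character levels, and the function name `hashed_key` are assumptions for the sketch; any stable hash would serve the same purpose.

```python
import hashlib


def hashed_key(filename, levels=3):
    """Prefix the filename with `levels` two-character pieces of its hash."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    prefix = "/".join(digest[i * 2:(i + 1) * 2] for i in range(levels))
    return f"{prefix}/{filename}"


for name in ["photo1.jpg", "photo2.jpg"]:
    # Sequential names land under unrelated prefixes.
    print(hashed_key(name))
```

The key is derived only from the filename, so the same file always maps to the same key and can be located again without storing the prefix separately.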
Quick Reference
S3 features:
- 99.999999999% durability (data loss is extremely unlikely)
- Unlimited storage
- Signed URLs (temporary access)
- Multipart upload (large files)
Use cases: File storage, backups, static websites, data lakes
Next: Notification Systems - Push notifications, email, SMS.