Rate Limiting & Throttling

TL;DR

Rate limiting: Limit requests per time window. Algorithms: Token bucket (flexible), leaky bucket (smooth), fixed/sliding window. Use cases: API protection, DDoS prevention, fair usage.

Algorithms

1. Token Bucket

Bucket capacity: 100 tokens
Refill rate: 10 tokens/second

Request arrives → consume 1 token
If tokens available → allow
If empty → reject (429 Too Many Requests)

Pros: Absorbs bursts (up to 100 requests at once if the bucket is full)
Cons: Those same bursts can still overload downstream services
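The refill-and-spend logic above can be sketched in a few lines of Python (the class name and defaults are illustrative, not from any specific library):

```python
import time

class TokenBucket:
    """Token bucket: refill at a fixed rate, spend one token per request."""

    def __init__(self, capacity=100, refill_rate=10):
        self.capacity = capacity        # max tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full, so bursts are allowed
        self.last_refill = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because the bucket starts full, a quiet client can immediately burst up to `capacity` requests; the refill rate then enforces the average rate.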

2. Leaky Bucket

Bucket capacity: 100 requests
Process rate: 10 requests/second

Requests queue in bucket
Process at constant rate (smooth traffic)
If full → reject

Pros: Smooths traffic to a constant output rate (protects downstream services)
Cons: No bursts allowed; legitimate spikes queue up (adding latency) or get dropped
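A minimal in-memory sketch of the queue-and-drain behavior, modeling the bucket as a deque drained at a constant rate (names and defaults are illustrative):

```python
import time
from collections import deque

class LeakyBucket:
    """Leaky bucket: requests queue up and leak out at a constant rate."""

    def __init__(self, capacity=100, rate=10):
        self.capacity = capacity   # max queued requests
        self.rate = rate           # requests drained per second
        self.queue = deque()
        self.last_leak = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Drain however many requests would have been processed since last check
        leaked = int((now - self.last_leak) * self.rate)
        if leaked:
            for _ in range(min(leaked, len(self.queue))):
                self.queue.popleft()
            self.last_leak = now
        if len(self.queue) >= self.capacity:
            return False  # bucket full: reject
        self.queue.append(now)
        return True
```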

3. Fixed Window

Window: 1 minute (00:00-01:00)
Limit: 100 requests

Count requests in current minute
Reset counter at 01:00

Pros: Simple
Cons: Bursts at window boundaries (100 requests at 00:59 plus 100 at 01:00 = 200 within ~1 second)
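A sketch of the counter logic, assuming an in-memory store (a real deployment would keep the counters in something shared like Redis). Note how two windows straddling a boundary each admit a full quota:

```python
import time
from collections import defaultdict

class FixedWindowCounter:
    """Fixed window: one counter per window, reset when the window rolls over."""

    def __init__(self, limit=100, window=60):
        self.limit = limit
        self.window = window
        self.counts = defaultdict(int)  # window index -> request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        window_start = int(now // self.window)  # which window are we in?
        if self.counts[window_start] >= self.limit:
            return False
        self.counts[window_start] += 1
        return True
```

The boundary-burst con is visible here: requests at 00:59 and 01:00 fall into different window indices, so both are counted against fresh quotas.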

4. Sliding Window

Track requests with timestamps
Count requests in last 60 seconds (rolling window)

Pros: Accurate, no bursts
Cons: Memory overhead (store timestamps)
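A single-process sketch that makes the memory overhead concrete: one stored timestamp per request in the window (the Redis implementation below is the distributed equivalent; names here are illustrative):

```python
import time
from collections import deque

class SlidingWindowLog:
    """Sliding window log: keep a timestamp per request, count the recent ones."""

    def __init__(self, limit=100, window=60):
        self.limit = limit
        self.window = window
        self.log = deque()  # request timestamps, oldest first

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Evict timestamps that have slid out of the window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) >= self.limit:
            return False
        self.log.append(now)
        return True
```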

Implementation (Redis)

import time
import uuid

# Assumes `redis` is a connected redis-py client and RateLimitExceeded
# is an application-defined exception.

def rate_limit_sliding_window(user_id, limit=100, window=60):
    now = time.time()
    key = f"rate_limit:{user_id}"

    # Remove entries that have fallen outside the window
    redis.zremrangebyscore(key, 0, now - window)

    # Count requests still in the window
    count = redis.zcard(key)

    if count >= limit:
        raise RateLimitExceeded()

    # Record the current request (unique member, timestamp as score)
    redis.zadd(key, {str(uuid.uuid4()): now})
    redis.expire(key, window)

    # Note: the check and the add are separate commands, so concurrent
    # requests can race; for strict limits, run them atomically in a
    # Lua script or a MULTI/EXEC pipeline.

Common Interview Questions

Q1: "How would you implement rate limiting?"

Answer:

  1. Algorithm: Token bucket or sliding window
  2. Storage: Redis (distributed, fast)
  3. Key: user_id:api_endpoint or ip_address
  4. Response: 429 Too Many Requests with Retry-After header
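The response shape from step 4 can be sketched as a small helper (the function name `respond` and the 30-second default are illustrative):

```python
def respond(allowed, retry_after=30):
    """Map a rate-limit decision to an (HTTP status, headers) pair."""
    if allowed:
        return 200, {}
    # 429 Too Many Requests, with Retry-After so well-behaved clients back off
    return 429, {"Retry-After": str(retry_after)}
```

In a real service this would feed into the framework's response object; the key point is pairing the 429 status with a Retry-After header.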

Q2: "Token bucket vs leaky bucket?"

Answer:

  • Token bucket: Allows bursts (good for UX)
  • Leaky bucket: Smooth traffic (good for backend protection)
  • Most common: Token bucket (better user experience)

Q3: "How do you rate limit across multiple servers?"

Answer:

  • Centralized: Redis stores counters (all servers check Redis)
  • Distributed: Each server tracks quota, sync periodically (eventual consistency)
  • Prefer: Centralized with Redis (accurate)

Quick Reference

Algorithms:

  • Token bucket: Flexible, allows bursts (most common)
  • Leaky bucket: Smooth, no bursts
  • Sliding window: Accurate, memory overhead

Implementation: Redis with TTL
Response: 429 + Retry-After header


Next: Geo-Distribution.