Data Replication
TL;DR
Replication: Copy data to multiple nodes. Master-slave: One write node, many read nodes. Multi-master: Multiple write nodes (conflict resolution needed). Benefits: High availability, read scaling.
Replication Strategies
1. Master-Slave (Single-Leader)
Pros: Simple, consistent writes (one leader)
Cons: Write bottleneck (all writes to master)
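The single-leader flow can be sketched in a few lines. This is a minimal in-memory illustration, not a real database: the `Leader` and `Follower` classes, the `catch_up` method, and the key names are all hypothetical, and asynchronous replication is simulated by calling `catch_up` by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Read-only replica that replays the leader's log in order."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries applied so far

    def catch_up(self, log: list) -> None:
        # Apply only the entries this follower has not seen yet.
        for key, value in log[self.applied:]:
            self.data[key] = value
        self.applied = len(log)

    def read(self, key):
        return self.data.get(key)


@dataclass
class Leader:
    """The only node that accepts writes."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, key, value) -> None:
        self.data[key] = value
        self.log.append((key, value))  # followers replay this log later


leader = Leader()
followers = [Follower(), Follower()]

leader.write("user:1", "alice")
stale = followers[0].read("user:1")  # follower has not replicated yet

for f in followers:  # replication step, simulated synchronously here
    f.catch_up(leader.log)
fresh = followers[0].read("user:1")
```

Between the write and the catch-up step the follower returns nothing for the key, which is the replication lag that read scaling trades against.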
2. Multi-Master (Multi-Leader)
Pros: Low latency writes (write to nearest master)
Cons: Conflict resolution (two users update same record)
Conflict resolution:
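- Last-write-wins (keep the write with the latest timestamp; the losing concurrent write is silently discarded, and clock skew can pick the wrong winner)
- Application logic (the application merges conflicting versions, or asks the user to)
- CRDTs (conflict-free replicated data types, which merge automatically and deterministically)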
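Two of the strategies above can be illustrated with a short sketch. The names `lww_merge` and `GCounter` are chosen for this example only; the grow-only counter is one of the simplest CRDTs and is shown as an assumption about what a CRDT merge might look like, not as any particular library's API.

```python
def lww_merge(a: tuple, b: tuple) -> tuple:
    """Last-write-wins: each value is (timestamp, payload); keep the newer one."""
    return a if a[0] >= b[0] else b


class GCounter:
    """Grow-only counter CRDT: one slot per node, merged with element-wise max."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}  # node id -> increments observed from that node

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # max() makes the merge commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


# LWW: the write stamped 100 is lost even if it was made concurrently.
winner = lww_merge((100, "name=Alice"), (101, "name=Alicia"))

# CRDT: both masters accept increments, then exchange state in any order.
a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
total = a.value()
```

Last-write-wins keeps exactly one of the two writes, while the counter keeps both masters' increments because its merge never discards information.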
3. Quorum Consensus
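If R + W > N (R = nodes contacted per read, W = nodes that must acknowledge each write, N = total replicas), every read set overlaps every write set in at least one node.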
Example (3 nodes):
- Write to 2 nodes (W=2)
- Read from 2 nodes (R=2)
- R + W > N (2 + 2 > 3) ✅
Guarantees: At least one node in every read set holds the latest acknowledged write (overlap); the reader picks the response with the highest version.
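The 3-node example can be simulated with versioned values. This is a sketch under simplifying assumptions: replicas are plain dictionaries, the caller chooses which replicas respond, and versions are integers supplied by the writer. The function names `quorum_write` and `quorum_read` are hypothetical.

```python
N, W, R = 3, 2, 2
replicas = [{} for _ in range(N)]  # each replica maps key -> (version, value)


def quorum_write(key, value, version, targets):
    """Store the value on the responding replicas; require at least W of them."""
    if len(targets) < W:
        raise RuntimeError("write quorum not reached")
    for i in targets:
        replicas[i][key] = (version, value)


def quorum_read(key, targets):
    """Ask at least R replicas and return the value with the highest version."""
    if len(targets) < R:
        raise RuntimeError("read quorum not reached")
    answers = [replicas[i][key] for i in targets if key in replicas[i]]
    if not answers:
        return None
    return max(answers, key=lambda answer: answer[0])[1]


quorum_write("x", "old", version=1, targets=[0, 1, 2])
quorum_write("x", "new", version=2, targets=[0, 1])  # replica 2 missed this write

# Replica 2 is stale, but any 2 replicas include one that saw version 2.
result = quorum_read("x", targets=[1, 2])
```

Here the read contacts one stale and one current replica, and the version comparison is what lets it return the current value.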
Common Interview Questions
Q1: "Master-slave vs multi-master replication?"
Answer:
- Master-slave: One write node (simple, no conflicts)
- Multi-master: Multiple write nodes (low latency, but conflicts)
- Use master-slave for: Most apps (simpler)
- Use multi-master for: Multi-region writes (accept complexity)
Q2: "How do you handle replication lag?"
Answer:
- Read from master: For critical reads (sacrifice scalability)
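- Session consistency: Pin each user to one replica so their reads never go backward in time; after the user's own write, read from the master or a replica that has caught up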
- Eventual consistency: Accept stale reads (most common)
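The three options above can be combined in a small read router. This is a sketch under assumptions: each node exposes how far it has applied the replication log (`applied`), and each session remembers the log position of its own last write (`session_min_pos`). The names `Node` and `choose_read_node` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    applied: int  # replication log position this node has applied


def choose_read_node(master, replicas, session_min_pos, critical=False):
    """Pick where to send a read.

    critical=True    -> always the master (freshest data, no read scaling)
    session_min_pos  -> log position of the session's last write; only
                        replicas at or past it can show that write
    """
    if critical:
        return master
    for replica in replicas:
        if replica.applied >= session_min_pos:
            return replica
    return master  # every replica is lagging behind this session


master = Node("master", applied=10)
replicas = [Node("r1", applied=7), Node("r2", applied=10)]

after_own_write = choose_read_node(master, replicas, session_min_pos=9)
no_recent_write = choose_read_node(master, replicas, session_min_pos=0)
critical_read = choose_read_node(master, replicas, session_min_pos=0, critical=True)
```

A session with no recent writes is served by any replica and may see stale data, which is the eventual-consistency case.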
Q3: "What is a quorum?"
Answer:
- Quorum: Minimum number of nodes that must acknowledge a read or write for it to succeed
- Formula: R + W > N (read + write quorum > total nodes)
- Example: 5 nodes, W=3, R=3 (3+3>5): every read overlaps the latest write in at least one node, and reads and writes still succeed with 2 nodes down
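The formula can be checked by brute force: enumerate every possible write set and read set and confirm that each pair shares a node. The helper name `quorums_always_overlap` is made up for this sketch, and the approach is only practical for small N.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """True if every w-node write set shares a node with every r-node read set."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)  # empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


strict = quorums_always_overlap(5, 3, 3)  # 3 + 3 > 5
loose = quorums_always_overlap(5, 2, 3)   # 2 + 3 = 5, not greater than N
```

With W=2 and R=3 on 5 nodes, a write to nodes {0, 1} and a read from nodes {2, 3, 4} never meet, so a read can miss the latest write.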
Quick Reference
Master-slave: One leader, many followers (simple)
Multi-master: Multiple leaders (low latency, conflicts)
Quorum: R + W > N (consistency guarantee)
Next: Distributed Transactions.