Data Replication

TL;DR

Replication: Copy data to multiple nodes. Master-slave: One write node, many read nodes. Multi-master: Multiple write nodes (conflict resolution needed). Benefits: High availability, read scaling.

Replication Strategies

1. Master-Slave (Single-Leader)

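Pros: Simple; a single leader orders all writes, so there are no write conflicts
Cons: Write bottleneck (all writes go to the master); with asynchronous replication, followers can lag behind and serve stale reads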
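
As a rough illustration of the single-leader flow, the sketch below models a leader that appends every write to a log and followers that apply that log asynchronously. The names `Replica`, `SingleLeaderCluster`, and `max_entries` are hypothetical and exist only for this example; a real database replicates over the network and handles failover, which is left out here.

```python
class Replica:
    """An in-memory key-value store standing in for one database node."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied = 0  # number of log entries this node has applied


class SingleLeaderCluster:
    def __init__(self, follower_count):
        self.leader = Replica("leader")
        self.followers = [Replica(f"follower-{i}") for i in range(follower_count)]
        self.log = []  # ordered (key, value) writes accepted by the leader

    def write(self, key, value):
        # All writes go through the leader, which fixes one global order.
        self.log.append((key, value))
        self.leader.data[key] = value
        self.leader.applied = len(self.log)

    def replicate(self, follower, max_entries=None):
        # Asynchronous replication: apply log entries the follower has not seen.
        # max_entries (hypothetical knob) lets the example simulate lag.
        end = len(self.log)
        if max_entries is not None:
            end = min(end, follower.applied + max_entries)
        for key, value in self.log[follower.applied:end]:
            follower.data[key] = value
        follower.applied = end

    def read(self, key, follower_index):
        # Reads are served by followers, which is where read scaling comes from.
        return self.followers[follower_index].data.get(key)


cluster = SingleLeaderCluster(follower_count=2)
cluster.write("user:1", "alice")
cluster.write("user:1", "alice v2")
cluster.replicate(cluster.followers[0])                 # fully caught up
cluster.replicate(cluster.followers[1], max_entries=1)  # lagging by one write

print(cluster.read("user:1", 0))  # alice v2
print(cluster.read("user:1", 1))  # alice (stale until it catches up)
```

The second follower returns the older value until it applies the rest of the log, which is the replication lag that the interview questions further down deal with.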

2. Multi-Master (Multi-Leader)

Pros: Low latency writes (write to nearest master)
Cons: Needs conflict resolution (e.g., two users update the same record on different masters at the same time)

Conflict resolution:

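  • Last-write-wins (based on timestamp): simple, but the losing write is silently discarded and the result depends on synchronized clocks
  • Application logic (merge conflicts manually)
  • CRDTs (conflict-free replicated data types): data structures whose merge is automatic and order-independent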
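
To make two of these options concrete, here is a minimal sketch of last-write-wins next to a grow-only counter, one of the simplest CRDTs. The function and class names (`last_write_wins`, `GCounter`) and the sample values are illustrative assumptions, not any particular library's API.

```python
def last_write_wins(a, b):
    """Each version is a (timestamp, value) pair; keep the later one."""
    return a if a[0] >= b[0] else b


class GCounter:
    """Grow-only counter CRDT: one slot per node, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount=1):
        # A node only ever increases its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other):
        # Taking the max per slot is commutative, associative, and idempotent,
        # so replicas converge regardless of merge order.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())


# LWW: the later timestamp wins and the other update is dropped.
winner = last_write_wins((100, "email=a@example.com"), (105, "email=b@example.com"))
print(winner[1])  # email=b@example.com

# G-Counter: increments made on both masters survive the merge.
us, eu = GCounter("us"), GCounter("eu")
us.increment(3)
eu.increment(2)
us.merge(eu)
eu.merge(us)
print(us.value(), eu.value())  # 5 5
```

The contrast is the point: last-write-wins resolves the conflict by throwing one update away, while the counter keeps both because its merge never loses information.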

3. Quorum Consensus

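If the read quorum (R) plus the write quorum (W) is larger than the number of replicas (N), every read set overlaps every write set: R + W > N.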

Example (3 nodes):

  • Write to 2 nodes (W=2)
  • Read from 2 nodes (R=2)
  • R + W > N (2 + 2 > 3) ✅

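Guarantee: Every read quorum shares at least one node with every write quorum, so a read reaches at least one replica holding the latest acknowledged write and returns it by picking the highest version. Concurrent writes to the same key still need a conflict-resolution rule.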
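
The 3-node example can be simulated in a few lines. This is a sketch under simplifying assumptions: replicas are entries in a list, each write carries an explicit version number, and writes are sequential. The helper names `quorum_write` and `quorum_read` are hypothetical.

```python
import random

N, W, R = 3, 2, 2
# Each replica stores (version, value); version 0 means "never written".
replicas = [(0, None) for _ in range(N)]


def quorum_write(version, value, rng):
    # The write counts as successful once W replicas have stored it.
    for i in rng.sample(range(N), W):
        replicas[i] = (version, value)


def quorum_read(rng):
    # Ask R replicas and return the response with the highest version.
    responses = [replicas[i] for i in rng.sample(range(N), R)]
    return max(responses, key=lambda response: response[0])


rng = random.Random(0)
quorum_write(1, "v1", rng)
quorum_write(2, "v2", rng)

# Because R + W > N, every read set contains a replica from the last write set.
results = {quorum_read(rng) for _ in range(100)}
print(results)  # {(2, 'v2')}
```

Whichever two replicas a read happens to pick, at least one of them took part in the latest write, so the newest version always shows up in the responses.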

Common Interview Questions

Q1: "Master-slave vs multi-master replication?"

Answer:

  • Master-slave: One write node (simple, no conflicts)
  • Multi-master: Multiple write nodes (low latency, but conflicts)
  • Use master-slave for: Most apps (simpler)
  • Use multi-master for: Multi-region writes (accept complexity)

Q2: "How do you handle replication lag?"

Answer:

  1. Read from master: For critical reads (sacrifice scalability)
  2. Session consistency: Route each user's reads to the same replica, so that user never sees data move backward in time
  3. Eventual consistency: Accept stale reads (most common)
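
The three strategies above can be sketched as a single read-routing function. The node names, the `route_read` function, and its `critical` and `sticky` flags are hypothetical; hashing the user ID is only one possible way to pin a user to a replica.

```python
import itertools
import zlib

MASTER = "master"
REPLICAS = ["replica-0", "replica-1", "replica-2"]
_round_robin = itertools.cycle(REPLICAS)


def route_read(user_id, critical=False, sticky=False):
    """Pick the node that should serve a read for this user."""
    if critical:
        # Strategy 1: critical reads go to the master and are never stale.
        return MASTER
    if sticky:
        # Strategy 2: a stable hash pins each user to one replica.
        # crc32 is used because Python's built-in hash() of a string
        # changes between processes.
        index = zlib.crc32(user_id.encode("utf-8")) % len(REPLICAS)
        return REPLICAS[index]
    # Strategy 3: any replica will do; the read may be stale.
    return next(_round_robin)


print(route_read("alice", critical=True))  # master
print(route_read("alice", sticky=True) == route_read("alice", sticky=True))  # True
print(route_read("bob"))  # one of the replicas, chosen round-robin
```

Pinning a user to one replica prevents reads from jumping between replicas at different positions, but it does not by itself guarantee the user sees their own latest write if that replica is lagging; reads that must reflect a just-completed write can use the `critical` path.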

Q3: "What is a quorum?"

Answer:

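  • Quorum: The minimum number of nodes that must acknowledge a read or write for it to succeed
  • Formula: R + W > N (read quorum + write quorum > total nodes)
  • Example: 5 nodes, W=3, R=3 (3 + 3 > 5): every read overlaps the latest write, and reads and writes still succeed with 2 nodes down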
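
One way to convince yourself (or an interviewer) that the formula works is to brute-force it: enumerate every possible write set and read set and check that they intersect. The function name `quorums_always_overlap` is made up for this sketch.

```python
from itertools import combinations


def quorums_always_overlap(n, w, r):
    """Return True if every w-node write set shares a node with every r-node read set."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


print(quorums_always_overlap(5, 3, 3))  # True:  3 + 3 > 5
print(quorums_always_overlap(5, 2, 3))  # False: 2 + 3 = 5, disjoint sets exist
print(quorums_always_overlap(3, 2, 2))  # True:  2 + 2 > 3
```

The failing case shows why the inequality is strict: with W=2 and R=3 on 5 nodes, a write to nodes {0, 1} and a read from nodes {2, 3, 4} never meet.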

Quick Reference

Master-slave: One leader, many followers (simple)
Multi-master: Multiple leaders (low latency, conflicts)
Quorum: R + W > N (consistency guarantee)


Next: Distributed Transactions.