CAP Theorem with Example

CAP Theorem with Example hero pic

Introduction

In this article, I will try to explain the CAP theorem with example in simple terms.

CAP theorem states that in distributed systems, you can only achieve two out of three guarantees:

  • Consistency: All nodes have the same data at any given time.
  • Availability: Every node remains accessible for requests.
  • Partition tolerance: The system functions even when communication between nodes is disrupted.

So if we want to make a distributed system, we will end up compromising on either Consistency or Availability.

Let us understand it with a CAP theorem with example of a real-world analogy.

Real-world Analogy

Chapter 1: You start a travel business

You run “Treasure Trove Tourism,” a travel agency renowned for its personalized adventure recommendations. Your clients rave about your uncanny ability to match their unique interests with hidden gems around the world. But as your business booms, handling the influx of requests becomes overwhelming.

Chapter 2: Scaling Up – Multiple Guides, Growing Pains

To manage the surge, you hire a team of travel experts, each specializing in different regions. You envision a seamless experience where any guide can access and update client preferences, ensuring personalized recommendations no matter who takes the call.

Chapter 3: The “Lost in Translation” Incident

Disaster strikes! John, a client seeking a secluded beach getaway in Thailand, speaks with Sarah. She recommends a hidden cove based on his preferences. Later, John calls David, seeking adventure in Nepal. David, unaware of John’s Thailand plans, suggests a thrilling Himalayan trek. John is confused! Your once-seamless system has delivered conflicting recommendations. What went wrong?

Chapter 4: The Consistency Conundrum

You realize the issue: consistency. Without all guides having the latest client data, recommendations can contradict each other. You brainstorm solutions:

  • Real-time updates: Every update triggers immediate communication to all guides, ensuring everyone has the latest information. This guarantees consistency but can overwhelm guides with constant updates and potentially delay responses (availability hit).
  • Offline updates: Guides update their local copies periodically, minimizing disruptions. However, clients might receive outdated information until the next sync (consistency hit).

Chapter 5: Embracing Partition Tolerance – The Backup Backpack

You consider another challenge: what if a network outage severs communication between guides? You introduce “backup backpacks,” physical copies of client data each guide carries. Even offline, recommendations can be made, ensuring availability. However, any updates made during the outage won’t be reflected in other guides’ backpacks until reconnected (consistency hit).

Chapter 6: The CAP Theorem Unveiled

This scenario exemplifies the CAP theorem: in distributed systems, you can only achieve two out of three guarantees:

  • Consistency: All nodes have the same data at any given time.
  • Availability: Every node remains accessible for requests.
  • Partition tolerance: The system functions even when communication between nodes is disrupted.

Treasure Trove Tourism must choose:

  • Consistency and availability: Real-time updates ensure consistency but might impact availability.
  • Consistency and partition tolerance: Offline updates ensure consistency and offline access but risk outdated information.
  • Availability and partition tolerance: Backup backpacks maintain access but sacrifice consistency during outages.

Chapter 7: Finding Your Balance

The ideal choice depends on your specific needs. For Treasure Trove Tourism, prioritizing consistency and availability might be best, as outdated recommendations could be disastrous. However, if brief outages are acceptable, prioritizing availability and partition tolerance might be more practical.

Databases priorities

We can classify databases in either one of the 3 categories but each of them can be fine-tuned to solve the user’s preferences.

CA (Consistency and Availability): MySQL, Postgres, Redis, Hazelcase, etc

AP (Availability and Partition Tolerance): Cassandra

CP (Consistency and Partition Tolerance): MongoDB, etc

Conclusion

The CAP theorem is not a binary choice. You can implement mechanisms to improve consistency even with partitions or prioritize availability while mitigating eventual inconsistencies. The optimal approach depends on your specific needs and priorities.

PS: Took inspiration from this article

You can find more blogs here.

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *