Database Transaction Models and the CAP Theorem

By | November 28, 2024

According to Brewer’s theorem (also called the CAP theorem), any distributed data store (such as a database) can provide at most two of the following three guarantees:

  1. Consistency
  2. Availability
  3. Partition tolerance

Now when it comes to partition tolerance, that is, when a network partition occurs, one of the following decisions needs to be made:

  1. Cancel the operation => decrease the availability but ensure consistency.
  2. Proceed with the operation => provide availability but risk inconsistency.
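To make the two choices concrete, here is a minimal sketch. The `ToyStore` class, its `mode` flag, and the `PartitionedError` exception are hypothetical names made up for illustration and are not modelled on any real database. It assumes just two replicas and a single flag that simulates the partition.

```python
class PartitionedError(Exception):
    """Raised when the store refuses a write during a partition."""


class ToyStore:
    """Hypothetical two-replica store, for illustration only."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.local = {}           # replica that receives client requests
        self.remote = {}          # replica on the other side of the network
        self.partitioned = False  # simulated network partition

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Option 1: cancel the operation so the replicas stay identical.
                raise PartitionedError("cannot reach the remote replica")
            # Option 2: accept the write locally; the replicas now disagree.
            self.local[key] = value
            return
        # No partition: the write reaches both replicas.
        self.local[key] = value
        self.remote[key] = value


cp = ToyStore("CP")
cp.partitioned = True
try:
    cp.write("x", 1)
    cp_result = "accepted"
except PartitionedError:
    cp_result = "rejected"

ap = ToyStore("AP")
ap.partitioned = True
ap.write("x", 1)
ap_diverged = ap.local != ap.remote
```

In this sketch the CP store rejects the write and both of its replicas stay empty, while the AP store accepts it and ends up with replicas that no longer match.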

This is where the database transaction models, ACID and BASE, come in. SQL (relational) databases are typically designed around the ACID properties and choose consistency over availability (option 1). NoSQL (non-relational, often distributed) databases, on the other hand, are typically designed around the BASE properties and choose availability over consistency (option 2).

First, let’s look at ACID. The acronym ACID stands for:

  1. Atomic – Either all operations in a transaction take effect or none of them do (the transaction is rolled back).
  2. Consistent – Upon transaction completion, the structural integrity of the data should not be compromised. In other words, any given database transaction must change the affected data only in allowed ways.
  3. Isolated – Transactions running simultaneously cannot compromise each other’s integrity by interacting with one another. This is the mechanism that prevents uncommitted dependencies (dirty reads).
  4. Durable – Once a transaction has been committed, its changes persist even if the system fails afterwards.
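Atomicity and consistency are easy to see in a small sketch using Python’s built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the sample balances are assumptions made up for this example. The credit is deliberately applied before the debit so that a failed debit shows the earlier credit being undone as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst. Returns True on commit, False on rollback."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))


failed = transfer(conn, "alice", "bob", 500)  # alice cannot cover this
after_failed = balances(conn)

ok = transfer(conn, "alice", "bob", 30)
after_ok = balances(conn)
```

After the failed transfer both balances are unchanged, even though the credit to bob had already been executed inside the transaction. The second transfer commits and moves 30 from alice to bob.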

With the above points, we can say that ACID prioritizes database integrity, which refers to the overall accuracy, completeness, and consistency of data, even at the cost of not being available at all times. The ACID consistency model is used in cases where a database system is required to handle many small simultaneous transactions. Examples include financial institutions and data warehousing.

Now coming to BASE. The acronym BASE stands for:

  • Basically Available – Guarantees the availability of data by spreading and replicating it across the nodes of the database cluster. There will always be a response to a request (the response can be a failure too).
  • Soft state – The state of the system isn’t rigid. It can change over time. So all the data being stored doesn’t have to be write-consistent, nor do the different replicas have to be mutually consistent at all times.
  • Eventually consistent – Both the above points show that the BASE model doesn’t enforce immediate consistency. That, however, doesn’t mean that it never achieves it. Once writes stop arriving, the replicas converge and the system eventually becomes consistent.
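Soft state and eventual consistency can be illustrated with a toy simulation. The `Replica` class and the `sync` function below are hypothetical, and the sketch assumes a simple last-write-wins rule based on a version number. Real systems use a variety of conflict-resolution strategies, so treat this as one possible approach rather than how any particular database works.

```python
class Replica:
    """Hypothetical replica that stores key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, version):
        current = self.data.get(key)
        # Last-write-wins: keep whichever entry has the higher version.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


def sync(a, b):
    """Background sync pass: exchange entries so both replicas converge."""
    for key, (version, value) in list(a.data.items()):
        b.write(key, value, version)
    for key, (version, value) in list(b.data.items()):
        a.write(key, value, version)


r1, r2 = Replica(), Replica()
r1.write("status", "online", version=1)
r2.write("status", "away", version=2)  # a later write lands on the other replica

before = (r1.read("status"), r2.read("status"))  # replicas disagree (soft state)
sync(r1, r2)
after = (r1.read("status"), r2.read("status"))   # replicas agree again
```

Both replicas answer reads the whole time (basically available), they briefly return different values (soft state), and after the sync pass they agree on the newest write (eventually consistent).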

With the BASE model, one gives up some consistency, completeness, and accuracy to gain benefits such as availability, scale, and resilience. The BASE consistency model is used in cases where a database system needs to have huge amounts of data available at all times. Examples include social media applications (Facebook), streaming services (Netflix, Spotify), and cab services (Uber).