PACELC Theorem & Distributed Databases

4 min readDec 23, 2021

In this article we are going to cover PACELC Theorem which is an extension for CAP Theorem and see how how distributed databases have limitations and tradeoffs regarding Network Partition, Consistency, Availability and Latency.

What is P.A.C.E.L.C. ?

Consistency

Consistency means that all clients see the same data at the same time, regardless of node they are connected to.

Availability

Availability means that clients get successful response, even if nodes are down in cluster. Though the response maybe inconsistent with previous write.

Partition Tolerance

Partition Tolerance means that in case there is communication break within cluster (Broken Network, Delayed message transmission) The cluster is able to operate regardless of where partition happens.

Latency

Latency is the time taken when client initiated the request till it receives response which includes both processing time and network transmission delay.

PACELC Theorem States That :

CAP Theorem Trade-Offs

Lets apply CAP Theorem in Master Slave Replication cluster setup for simplicity. Also, let’s imagine that network partition happens at master node in a way that it cannot communicate with rest of the nodes in cluster and with clients or master node goes down. Let’s visualize this in diagram below.

We can also see that CAP theorem is applicable for other scenarios including but not limited to:
* Multi Master Replication cluster setup
* Cluster with more than three nodes in Master Slave Replication
* Sharded databases like Cassandra/Dynamodb where Key Range is owned by more than one node.
* Network Partitioning happening across clients and cluster nodes where they are isolated from others.

Consistency and Availability Trade-Off

As we can see that when master node is down, Client 1 has few options to retrieve information

Consistency: To Achieve consistency clients can:

Connect to master and the request fails.
It waits/retires till master node is up and running, which again violates availability.

Availability: Client can connect to other replica and access stale or inconsistent information.

This is the trade-off which client needs to make and as described through CAP Theorem, client can either choose Consistency or Availability in case of Network Partitions.

PACELC as Extension to CAP Theorem

Again, Let’s take previous example with Master Slave Cluster Setup. This time lets assume that the Network Partition doesn’t happen and master node is up and running. But after certain point in time its is overwhelmed by clients request and its performance degrades.

The Latency Consistency Trade-Off

Client 1 has few following options to retrieve information and this is the trade-off which client needs to make and as described through PACELC Theorem.

Consistency: To achieve consistent information client still requests Master node but receives slow response or high latency.

Latency: Client can connect to other replica and accesses stale or inconsistent information and receives response with low latency.

Off-course, PACELC Theorem is applicable to different scenarios where clients or nodes within cluster need to make a tradeoff between latency and consistency when Network Partition doesn’t happens.

Applying PACELC Theorem to Databases

The default versions of DynamoDB, Cassandra, and Cosmos DB are PA/EL systems: if a partition occurs, they give up consistency for availability, and under normal operation they give up consistency for lower latency.

MongoDB can be classified as a PA/EC system. In the baseline case, the system guarantees reads and writes to be consistent.

Fully ACID systems such as VoltDB/H-Store, Megastore and MySQL Cluster are PA/EC: they refuse to give up consistency, and will pay the availability and latency costs to achieve it. BigTable and related systems such as HBase are also PC/EC.

Databases these days offer a way to configure these tradeoffs between Availability, Consistency and Lower Latency.

Conclusion

“Ignoring the consistency/latency trade-off of replicated systems is a major oversight [in CAP]” is one important aspect addressed by PACELC Theorem in distributed database systems.

Daniel J. Abadi first described PACELC theorem around 2010 in his blog and then in paper in 2012. Theorems help us carefully design platform and analyse tradeoff which everyone needs to make while architecting distributed solutions to solve real world problems.

Coming Up Next Part-2

Hope you enjoyed the article. Stay tuned for the Part-2 and any suggestions are most welcome.