How does newsql keep ACID

Business Informatics Wiki - Kewee

The CAP theorem was introduced by Eric Brewer in 2000 (hence it is sometimes called Brewer's theorem) and proved by Seth Gilbert and Nancy Lynch in 2002 (Gilbert / Lynch 2002).

CAP stands for the properties:


Consistency

After a write, all replicas on all nodes hold the same, current data state, as if the write were an atomic operation on a single server instance (Gilbert / Lynch 2002: p. 3). If the database is queried immediately after such a write, it returns either the updated data or a newer state if another update has taken place in the meantime. It never returns an older state.
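The consistency guarantee described above can be illustrated with a minimal Python sketch. It assumes a hypothetical in-memory `ReplicatedRegister` whose `write` updates every replica before it returns; the class and its method names are illustrative only and do not come from any real database.

```python
class ReplicatedRegister:
    """Toy register replicated on several nodes (hypothetical, in-memory only)."""

    def __init__(self, replica_count):
        # Each replica stores a (version, value) pair.
        self.replicas = [(0, None) for _ in range(replica_count)]

    def write(self, value):
        # Synchronous replication: the write returns only after
        # every replica holds the new version.
        version = self.replicas[0][0] + 1
        self.replicas = [(version, value) for _ in self.replicas]
        return version

    def read(self, replica_index):
        # Any replica may serve the read; all of them hold the same state.
        return self.replicas[replica_index]


register = ReplicatedRegister(replica_count=3)
register.write("balance=100")
register.write("balance=80")

# Reading from each replica yields the same, latest state.
states = [register.read(i) for i in range(3)]
print(states)
```

Because the write touches all replicas in one step, no reader can observe an older state afterwards. A real cluster has to achieve the same effect over an unreliable network, which is where the trade-offs discussed below come from.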


Availability

The system must be available at all times so that requests can be served as quickly and efficiently as possible. Every request to an intact node must be answered (Gilbert / Lynch 2002: p. 3). Response time plays an important role, especially for Internet applications. Waiting times of one second or more can be a reason for (potential) customers to leave a web shop or switch to a competitor (Borland 2013). This becomes particularly critical if the system does not respond to a payment confirmation.


Partition tolerance (Failure tolerance)

Even if nodes or communication links fail (planned or unplanned) and the cluster is split into partitions, the system as a whole must keep running, remain usable, and be able to redirect tasks in an emergency. To make this possible, the data is replicated in the cluster and kept redundantly on different nodes. "Except for a total network failure, no errors should lead to a system reacting improperly" (translated from Gilbert / Lynch 2002: p. 4).

(Graphic source: Lee 2011)

The theorem states that a distributed system with replication can guarantee at most two of the three CAP properties at the same time (cf. Brewer 2012). A system is therefore classified as CA, AP or CP.

A classic RDBMS with its ACID transactions prioritizes consistency; it is a CA system. As a single-server system, it is available and consistent, but not failure-tolerant: a single server cannot be split into partitions and is either available or fails altogether. If such a system is scaled out with replication, availability suffers because it takes longer for all replicas to be updated. This becomes more critical the larger the cluster grows.
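Why larger clusters make this worse can be shown with a small worked example. The sketch assumes that a write is acknowledged only once every replica has confirmed it; the function name and the latency values are hypothetical and chosen purely for illustration.

```python
def synchronous_write_latency(replica_latencies_ms):
    # The write is acknowledged only after the slowest replica has confirmed.
    return max(replica_latencies_ms)


# Hypothetical round-trip times in milliseconds, one per replica.
latencies_ms = [4, 6, 5, 9, 7, 30, 8, 6]

# The more replicas take part, the more likely a slow one delays the write.
for cluster_size in (1, 2, 4, 8):
    participating = latencies_ms[:cluster_size]
    print(cluster_size, synchronous_write_latency(participating))
```

With these assumed values, a single slow replica dominates the write time of the whole cluster as soon as it takes part in the replication.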

NoSQL systems, on the other hand, tend to prioritize constant availability, for which failure tolerance in the cluster is important. They are therefore closer to AP systems, which dispense with transactions and build on the less strict BASE consistency model.

CP systems are of particular interest in finance, where consistency is essential for money transfers. An amount that is withdrawn or deposited at an ATM, for example, must also be debited from or credited to the customer's account, as otherwise the operating bank and / or the customer loses money. The system must also be able to compensate for network errors and node failures. In case of doubt, however, the service becomes unavailable rather than returning inconsistent data.
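The difference between CP and AP behaviour during a network partition can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular database works: the `Cluster` class, its `mode` switch, the majority rule for CP and the `Unavailable` exception are all hypothetical names introduced for illustration.

```python
class Unavailable(Exception):
    """Raised when a CP cluster refuses a request during a partition."""


class Cluster:
    """Toy cluster in which every node stores one value (hypothetical)."""

    def __init__(self, names, mode):
        assert mode in ("CP", "AP")
        self.mode = mode
        self.values = {name: None for name in names}
        # Groups of nodes that can currently reach each other.
        self.partitions = [set(names)]

    def split(self, *groups):
        # Simulate a network partition into the given groups of nodes.
        self.partitions = [set(group) for group in groups]

    def _reachable_group(self, name):
        group = next(p for p in self.partitions if name in p)
        # Assumption: a CP cluster only serves requests on the majority side.
        if self.mode == "CP" and len(group) * 2 <= len(self.values):
            raise Unavailable(f"node {name} cannot reach a majority")
        return group

    def write(self, name, value):
        # Replicate to every node that is still reachable from `name`.
        for peer in self._reachable_group(name):
            self.values[peer] = value

    def read(self, name):
        self._reachable_group(name)
        return self.values[name]


# CP: the minority side refuses requests, so replicas never diverge.
cp = Cluster(["a", "b", "c"], mode="CP")
cp.write("a", 100)
cp.split({"a", "b"}, {"c"})
try:
    cp.write("c", 50)
    cp_refused = False
except Unavailable:
    cp_refused = True
cp.write("a", 80)

# AP: every node keeps answering, but the two sides drift apart.
ap = Cluster(["a", "b", "c"], mode="AP")
ap.write("a", 100)
ap.split({"a", "b"}, {"c"})
ap.write("c", 50)
ap.write("a", 80)
print(cp_refused, cp.read("a"), ap.read("a"), ap.read("c"))
```

In the CP run, the isolated node gives up availability and the majority side stays consistent. In the AP run, both sides accept writes and hold different values until the partition heals and the conflict is reconciled, which matches the weaker BASE model mentioned above.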