JEPSEN

Causal Consistency

Causal consistency captures the notion that causally-related operations should appear in the same order on all processes—though processes may disagree about the order of causally independent operations.

For example, consider a single object representing a chat between three people, where Attiya asks “shall we have lunch?”, and Barbarella & Cyrus respond with “yes”, and “no”, respectively. Causal consistency allows Attiya to observe “lunch?”, “yes”, “no”; and Barbarella to observe “lunch?”, “no”, “yes”. However, no participant ever observes “yes” or “no” prior to the question “lunch?”.

Convergent causal systems require that the values of objects in the system converge to identical values, once the same operations are visible. In such a system, users could transiently observe “lunch”, “yes”; and “lunch”, “no”—but everyone would eventually agree on (to pick an arbitrary order) “lunch”, “yes”, “no”.

Causal consistency is sticky available: even in the presence of network partitions, every client connected to a non-faulty node can make progress. However, clients must stick to the same server.

A slightly stronger version of causal consistency, Real-Time Causal, is proven to be the strongest consistency model in an always-available, one-way convergent system. Most “causally consistent” systems actually provide these stronger properties, such as RTC or causal+.

When a total order is required, and you’re willing to sacrifice availability (and latency), consider sequential consistency. If you need total availability, you’ll have to give up causal (and read-your-writes), but can still obtain writes follow reads, monotonic reads, and monotonic writes.

Formally

Causal memory stems from Lamport’s definition of the happens-before relation, which captures the notion of potential causality by relating an operation to previous operations by the same process, and to operations on other processes whose effects could have been visible thanks to messages exchanged between those processes.

This was formalized into a multiprocessor memory model in a 1993 paper by Ahamad, Neiger, Burns, et al, but a more useful (and non-paywalled) reference might be Mahajan, Alvisi, and Dahlin’s Consistency, Availability, and Convergence, which takes a history (called “an execution”) composed of read and write operations, asks for a directed acyclic happens-before relation over that execution, and requires that happens-before…

  1. Reflects the serial ordering on each process (“node”), and
  2. Each read returns the latest preceding concurrent writes

Mahajan et al leave conflict resolution (what to do with concurrent writes) up to the implementation, requiring only that all concurrent writes are returned for a read.

A more concise definition, from Viotti and Vukolić, decomposes causal consistency into three properties:

  1. CausalVisibility (the happens-before relation is a subset of the visibility order)
  2. CausalArbitration (the happens-before relation is a subset of the arbitration order: a total order which defines how conflicts are resolved)
  3. RVal (return values should be consistent with the definition of the datatype; e.g. reads should reflect recent writes)