JEPSEN

Long Fork

Long Fork happens when two transactions T₁ and T₂ make concurrent, disjoint updates that cause the state of the system to logically fork in twain, and other transactions are able to interact with those divergent states before the forks are logically merged. The forks are “long” because each involves multiple transactions. If each fork involves only a single transaction, we call that Short Fork, also known as A5B or Write Skew.

In terms of dependencies, a minimal example of Long Fork involves four transactions: the two disjoint writers T₁ and T₂, and a pair of read transactions: T₃ (which observes T₁ but not T₂) and T₄ (which observes T₂ but not T₁). If we assume T₃ and T₄ observe the versions immediately prior to T₂’s and T₁’s writes, respectively, we have a compact cycle: T₁ wr T₃ rw T₂ wr T₄ rw T₁. If there are additional versions between the readers and writers, the cycle involves additional write-write edges. However, it always includes at least two read-write dependencies separated by write-read (or perhaps write-write) dependencies. Long Fork is therefore a subtype of G-nonadjacent.
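
The shape of that compact cycle can be checked mechanically. The following is a minimal sketch in Python, not Jepsen's own checker: the edge-list encoding and the helper names `is_cycle` and `rw_edges_nonadjacent` are assumptions made for illustration. It confirms that the four edges form a cycle, and that the cycle has at least two read-write edges with no two of them adjacent.

```python
# Edges of the compact cycle from the text, as (from, kind, to) triples.
# This encoding is a hypothetical one chosen for the example.
LONG_FORK_CYCLE = [
    ("T1", "wr", "T3"),
    ("T3", "rw", "T2"),
    ("T2", "wr", "T4"),
    ("T4", "rw", "T1"),
]

# A Short Fork (write skew) cycle for comparison: two adjacent rw edges.
SHORT_FORK_CYCLE = [
    ("T1", "rw", "T2"),
    ("T2", "rw", "T1"),
]


def is_cycle(edges):
    """True if each edge ends where the next one begins, wrapping around."""
    n = len(edges)
    return n > 0 and all(edges[i][2] == edges[(i + 1) % n][0] for i in range(n))


def rw_edges_nonadjacent(edges):
    """True if the cycle has at least two rw edges and no two are adjacent."""
    kinds = [kind for _, kind, _ in edges]
    n = len(kinds)
    no_adjacent = all(
        not (kinds[i] == "rw" and kinds[(i + 1) % n] == "rw") for i in range(n)
    )
    return kinds.count("rw") >= 2 and no_adjacent
```

With these definitions, `rw_edges_nonadjacent(LONG_FORK_CYCLE)` holds, while the Short Fork cycle fails the check because its two read-write edges sit next to each other.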

For example, imagine two scientists, Alexis and Latrice, are collaborating on a paper. Latrice goes on a vacation with her girlfriend Li, at a cabin with no internet access. Alexis writes the background section of the paper, while Latrice independently writes up their results. Alexis shows her version of the paper to her colleague Abby. Meanwhile, Latrice asks Li to proofread her work. Because they are viewing independent forks of the paper, Abby and Li see only the changes made by Alexis and Latrice, respectively. When Latrice returns, her laptop syncs her changes with Alexis’s, producing a single document incorporating both changes. However, Alexis’s changes to the paper’s background altered the section numbering. Latrice’s results now refer to incorrect section numbers. None of the four women could have seen this problem until the final sync.

Long Fork is prohibited by Serializability and Snapshot Isolation, but allowed by Parallel Snapshot Isolation.

Literature

Long Fork comes from Sovran, Power, Aguilera, and Li’s Transactional Storage for Geo-Replicated Systems, which introduced Parallel Snapshot Isolation. Their definition is:

Long fork. Transactions make concurrent disjoint updates causing the state to fork. After they commit, the state may remain forked but it is later merged back. Example. Initially A=B=0. T₁ reads A=B=0, writes A←1, and commits; then T₂ reads A=1, B=0. T₃ and T₄ execute concurrently with T₁ and T₂, as follows. T₃ reads A=B=0, writes B←1, and commits; then T₄ reads A=0, B=1. Finally, after T₁,…,T₄ finish, T₅ reads A=B=1.
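
To make the example concrete, here is a small Python sketch that replays it on two in-memory forks and then asks whether any single order of the two writes could explain what the readers saw. The dictionaries, the `WRITES` table, and the function names are assumptions for illustration only. The test "every read must equal the state after some prefix of one write order" is a deliberate simplification of snapshot reads, not a full isolation checker.

```python
from itertools import permutations

INITIAL = {"A": 0, "B": 0}

# The two disjoint writers from the example: T1 writes A, T3 writes B.
WRITES = {"T1": ("A", 1), "T3": ("B", 1)}


def apply_write(state, txn):
    """Return a copy of state with txn's single write applied."""
    key, value = WRITES[txn]
    return {**state, key: value}


# Each fork starts from the initial state and sees only one writer.
fork_1 = apply_write(INITIAL, "T1")
t2_read = dict(fork_1)            # T2 reads A=1, B=0
fork_2 = apply_write(INITIAL, "T3")
t4_read = dict(fork_2)            # T4 reads A=0, B=1

# Merging the forks applies both disjoint writes.
merged = apply_write(fork_1, "T3")
t5_read = dict(merged)            # T5 reads A=1, B=1


def prefix_states(order):
    """States visible after each prefix of a total order of writers."""
    states = [dict(INITIAL)]
    for txn in order:
        states.append(apply_write(states[-1], txn))
    return states


def explainable_by_one_order(reads):
    """True if some single write order yields every read as a prefix state."""
    return any(
        all(read in prefix_states(order) for read in reads)
        for order in permutations(WRITES)
    )
```

Under this simplified model, T₂'s and T₄'s reads cannot both be explained by one order of T₁ and T₃, which is the fork. Either read taken together with T₅'s merged read can be.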

Cerone, Bernardi, and Gotsman’s A Framework for Transactional Consistency Models with Atomic Visibility gives an example of Long Fork in terms of their abstract execution formalism.

The axiom TRANSVIS in causal consistency and parallel snapshot isolation guarantees that VIS-ordered transactions are observed by others in this order…. However, the axiom allows two concurrent transactions to be observed in different orders, as illustrated by the long fork anomaly in Figure 3(d), allowed by both models. Concurrent transactions T₁ and T₂ write to x and y, respectively. A transaction T₃ observes the write to x, but not y, and a transaction T₄ observes the write to y, but not x. Thus, from the perspective of T₃ and T₄, the writes of T₁ and T₂ happen in different orders.
