Long Fork happens when two transactions T1 and T2 make concurrent, disjoint updates which causes the state of the system to logically fork in twain, and other transactions are able to interact with those divergent states before the forks are logically merged. The forks are “long” because each involves multiple transactions—if each fork involves only a single transaction, we call that Short Fork, also known as A5B or Write Skew.
In terms of dependencies, a minimal example of Long Fork involves at least four transactions: the two disjoint writers T1 and T2, and a pair of read transactions T3 (which observes T1 but not T2), and T4 (which observes T2 but not T1). If we assume T3 and T4 observe the versions immediately prior to T2 and T1’s writes, respectively, we have a compact cycle: T1 wr T3 rw T2 wr T4 rw T1. If there are additional versions between the readers and writers, the cycle involves additional write-write edges. However, it always includes at least two read-write dependencies separated by write-read (or perhaps write-write) dependencies. Long Fork is therefore a subtype of G-nonadjacent.
For example, imagine two scientists, Alexis and Latrice, are collaborating on a paper. Latrice goes on a vacation with her girlfriend Li, at a cabin with no internet access. Alexis writes the background section of the paper, while Latrice independently writes up their results. Alexis shows her version of the paper to her colleague Abby. Meanwhile, Latrice asks Li to proofread her work. Because they are viewing independent forks of the paper, Abby and Li see only the changes made by Alexis and Latrice, respectively. When Latrice returns, her laptop syncs her changes with Alexis’s, producing a single document incorporating both changes. However, Alexis’s changes to the paper’s background altered the section numbering. Latrice’s results now refer to incorrect section numbers. None of the four women could have seen this problem until the final sync.
Long Fork is prohibited by Serializability and Snapshot Isolation, but allowed by Parallel Snapshot Isolation.
Literature
Long Fork comes from Sovran, Power, Aguilera, and Li’s Transactional Storage for Geo-Replicated Systems, which introduced Parallel Snapshot Isolation. Their definition is:
Long fork. Transactions make concurrent disjoint updates causing the state to fork. After they commit, the state may remain forked but it is later merged back. Example. Initially A=B=0. T1 reads A=B=0, writes A←1, and commits; then T2 reads A=1, B=0. T3 and T4 execute concurrently with T1 and T2, as follows. T3 reads A=B=0, writes B←1, and commits; then T4 reads A=0, B=1. Finally, after T1,…,T4 finish, T5 reads A=B=1.
Cerone, Bernardi, and Gotsman’s A Framework for Transactional Consistency Models with Atomic Visibility gives an example of Long Fork in terms of their abstract execution formalism.
The axiom TRANSVIS in causal consistency and parallel snapshot isolation guarantees that VIS-ordered transactions are observed by others in this order…. However, the axiom allows two concurrent transactions to be observed in different orders, as illustrated by the long fork anomaly in Figure 3(d), allowed by both models. Concurrent transactions T1 and T2 write to x and y, respectively. A transaction T3 observes the write to x, but not y, and a transaction T4 observes the write to y, but not x. Thus, from the perspective of T3 and T4, the writes of T1 and T2 happen in different orders.
See Also
Short Fork occurs when the two forks have only a single transaction.
Long Fork is a specific kind of G-nonadjacent.