JEPSEN

Lost Write

A Lost Write (or Write Loss) occurs when a system acknowledges a write as complete, but the effects of that write are no longer present after some time. Writes may be lost immediately—for instance, when a server acknowledges a write to a client, then crashes before writing to disk. They may also be lost later—for instance, when an unsafe leader-election system selects a new leader who has not received some acknowledged writes.

In distributed systems, Write Loss is tricky to define precisely. Nodes may, in normal operation, disagree about whether a write took place. A write may also be present on one or more servers but not returned by reads—this is functionally a Lost Write.

We must also be careful to distinguish successful updates from Write Loss. Registers, for instance, discard previous states on every write; setting x = 1, then x = 2, is not (necessarily) Write Loss. In, say, a CRDT where writes add unique elements to a set, Write Loss is easier to detect; each version uniquely reflects every write which causally preceded it.

In systems without real-time guarantees, reads may fail to reflect writes for some arbitrary duration: a Stale Read may be technically indistinguishable from a Lost Write. Stale Reads should eventually end, whereas Lost Write is forever. We often distinguish the two by allowing the system time to quiesce after all writes have completed, and performing a series of final reads in the hope that “enough” time has passed; this is technically improper, but often useful. In other circumstances, Write Loss can be inferred by showing that (e.g.) the effects of the write are missing from every node, making later recovery impossible.

For these reasons, Jepsen’s working definition of a Lost Write is as follows. A write w is considered lost if, after some time t, every read reflects a state of the system which is not a causal descendant of w.

Literature

The term Lost Write is more common in the storage literature (for instance Fryer et al’s Checking the Integrity of Transactional Mechanisms) than in databases, but appears as early as 1995 in Farnham & Foxon’s An Efficient Recovery Protocol for Distributed Network Planning Information with Network Partitioning and Equipment Failure.