JEPSEN

Repeatable Read

Repeatable read is closely related to serializability, but unlike serializable, it allows phantoms: if a transaction T1 reads a predicate, like "the set of all people with the name “Dikembe”, then another transaction T2 may create or modify a person with the name “Dikembe” before T1 commits. Individual objects are stable once read, but the predicate itself may not be.

Repeatable read is a transactional model: operations (usually termed “transactions”) can involve several primitive sub-operations performed in order. It is also a multi-object property: operations can act on multiple objects in the system.

Repeatable read cannot be totally available; in the presence of network partitions, some or all nodes may be unable to make progress. For total availability, at the cost of allowing fuzzy reads, consider read committed.

Repeatable read implies cursor stability, read committed, etc.

Note that repeatable read does not impose any real-time constraints. If process A completes write w, then process B begins a read r, r is not necessarily guaranteed to observe w. For a transactional model that provides real-time constraints, consider strict serializability.

Moreover, repeatable read does not require a per-process order between transactions. A process can observe a write, then fail to observe that same write in a subsequent transaction. In fact, a process can fail to observe its own prior writes, if those writes occurred in different transactions.

Like serializability, repeatable read allows pathological orderings. For instance, a repeatable-read database can always return the empty state for any reads, by appearing to execute those reads at time 0. It can also discard write-only transactions by reordering them to execute at the very end of the history, after any reads. Operations like increments can also be discarded, assuming the result of the increment is never observed. Luckily, most implementations don’t seem to take advantage of these optimization opportunities.

Formally

The ANSI SQL 1999 spec specifies repeatable read as read committed but disallowing phenomenon P2:

P2 (“Non-repeatable read”): SQL-transaction T1 reads a row. SQL-transaction T2 then modifies or deletes that row and performs a COMMIT. If T1 then attempts to reread the row, it may receive the modified value or discover that the row has been deleted.

However, as Berenson, Bernstein, et al observed, the ANSI specification allows multiple intepretations, and one of those interpretations (the "anomaly interpretation) still admits nonserializable histories for “serializable” systems. Instead, we prefer Adya’s formalization of transactional isolation levels, which provides a concise definition of the preventative interpretation. In this model, repeatable read prohibits:

  • P0 (Dirty Write): w1(x) … w2(x)
  • P1 (Dirty Read): w1(x) … r2(x)
  • P2 (Fuzzy Read): r1(x) … w2(x)

but allows:

  • P3 (Phantom): r1(P) … w2(y in P)

Here w denotes a write, r denotes a read, and subscripts indicate the transaction which executed that operation. The notation “…” indicates a series of micro-operations except for a commit or abort. P indicates a predicate. Note that unlike the ANSI SQL spec, T1 does not necessarily execute a second read; modification of the predicate is enough.