JEPSEN

Distributed Systems Safety Research

About Jepsen

Jepsen is an effort to improve the safety of distributed databases, queues, consensus systems, etc. We maintain an open source software library for systems testing, as well as blog posts and conference talks exploring particular systems’ failure modes. In each analysis we explore whether the system lives up to its documentation’s claims, file new bugs, and offer recommendations for operators.

Jepsen pushes vendors to make accurate claims and test their software rigorously, helps users choose databases and queues that fit their needs, and teaches engineers how to evaluate distributed systems correctness for themselves.

In addition to public analyses, Jepsen offers technical talks, training classes, and a variety of consulting services.


News

Recent research, analyses, and announcements.

Kyle Kingsbury will speak on performance techniques in Jepsen at GOTO Chicago, October 21 & 22, 2024. The talk will cover high-level and low-level performance optimizations that make checking large histories tractable. High-level techniques include parallelism, pure functions, immutable data structures, and deforestation. Low-level techniques include bitsets, avoiding sharing between threads, packing structures into mutable arrays, dynamic compilation of primitive boxes, and macro iteration magic.

Early bird tickets are on sale now.

We’ve made some small changes to the Jepsen ethics policy.

The policy used to promise that Jepsen could veto publication if Jepsen and a client could not agree on the content of an analysis. However, this veto has never been used. In fact, Jepsen’s contracts have given Jepsen final approval over the content of analyses since 2016. We have replaced the promise of a veto with a stronger promise of editorial control.

In light of Jepsen’s multiple authors, we have also shifted to an organizational third person voice. Finally, we’ve streamlined some language.

In collaboration with Nubank, we analyzed Datomic Pro 1.0.7075 and found that its inter-transaction safety properties appeared stronger than claimed. Datomic Pro appeared to offer Strong Session Serializable isolation, and Strong Serializable for histories restricted to update transactions. However, Datomic defines unusual intra-transaction semantics in which operations are applied logically concurrently with one another, rather than sequentially. While consistent with Datomic’s documentation, these semantics could cause invariants preserved by individual transaction functions to be broken when those same functions are applied within a single transaction.
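
The intra-transaction behavior described above can be illustrated with a minimal sketch in Python. This is not Datomic code and does not use Datomic's API: the dict-based database, the `withdraw_checking` and `withdraw_savings` functions, and the `apply_sequential` and `apply_concurrent` helpers are all hypothetical. They only show how two functions that each preserve an invariant (here, a non-negative combined balance) can jointly break it when both read the state from before the transaction.

```python
# Hypothetical model, not Datomic's API: the database is a dict, and a
# "transaction function" reads a snapshot and returns the writes it wants.

def _check_total(snapshot, amount):
    """Invariant: checking + savings must stay non-negative."""
    if snapshot["checking"] + snapshot["savings"] - amount < 0:
        raise ValueError("combined balance would go negative")


def withdraw_checking(snapshot, amount):
    _check_total(snapshot, amount)
    return {"checking": snapshot["checking"] - amount}


def withdraw_savings(snapshot, amount):
    _check_total(snapshot, amount)
    return {"savings": snapshot["savings"] - amount}


def apply_sequential(db, calls):
    """Each call sees the effects of the calls before it."""
    state = dict(db)
    for fn, arg in calls:
        state.update(fn(state, arg))
    return state


def apply_concurrent(db, calls):
    """Every call sees the same pre-transaction snapshot; writes merge afterwards."""
    snapshot = dict(db)
    writes = [fn(snapshot, arg) for fn, arg in calls]
    state = dict(db)
    for w in writes:
        state.update(w)
    return state


db = {"checking": 50, "savings": 50}
calls = [(withdraw_checking, 70), (withdraw_savings, 60)]

# Both functions see a combined balance of 100, so both checks pass.
concurrent_result = apply_concurrent(db, calls)

# Applied one after another, the second withdrawal sees a combined
# balance of 30 and is rejected.
try:
    apply_sequential(db, calls)
    sequential_rejected = False
except ValueError:
    sequential_rejected = True
```

In this toy model the concurrent application leaves both accounts negative, while the sequential application rejects the second withdrawal. The two functions write different keys on purpose, so the example does not depend on how a real system would handle conflicting writes to the same attribute.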

RavenDB 6.0.2

2024-01-30

In a brief survey of RavenDB 6.0.2, we found “ACID” transactions allowed both lost updates and fractured reads, even in healthy single-node clusters. Depending on how you interpret RavenDB’s documentation and response to this work, RavenDB may not have interactive transactions at all.
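
For readers unfamiliar with the term, a lost update occurs when two transactions read the same value, both write a result based on it, and one write silently overwrites the other. The following is a minimal sketch in Python, not RavenDB code: the `Register` class and the `interleaved_increments` helper are hypothetical, and the version check is just one common way to detect the conflict.

```python
class Register:
    """A hypothetical single-value store with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def blind_write(self, value):
        # Writes unconditionally, regardless of what happened since the read.
        self.value = value
        self.version += 1

    def checked_write(self, value, expected_version):
        # Refuses the write if another writer got in since the read.
        if self.version != expected_version:
            return False
        self.blind_write(value)
        return True


def interleaved_increments(reg, use_check):
    """Two clients both read, then both try to write value + 1."""
    reads = [reg.read(), reg.read()]  # both clients read before either writes
    results = []
    for value, version in reads:
        if use_check:
            results.append(reg.checked_write(value + 1, version))
        else:
            reg.blind_write(value + 1)
            results.append(True)
    return results


# Without a check, both increments "succeed" but one is lost.
lost = Register(0)
lost_results = interleaved_increments(lost, use_check=False)

# With a version check, the second writer is told its write failed.
safe = Register(0)
safe_results = interleaved_increments(safe, use_check=True)
```

In the unchecked run both clients are told their increment succeeded, yet the register ends at 1 rather than 2. In the checked run the second client is told its write failed and can retry against the new value.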

MySQL 8.0.34

2023-12-18

We revisited Kleppmann’s work on MySQL isolation levels and found surprising behavior in 8.0.34. MySQL’s REPEATABLE READ not only exhibits G2-item, G-single, and lost update, but also violates internal consistency and Monotonic Atomic View. It satisfies neither Adya’s Repeatable Read nor the ambiguous ANSI SQL definition. We also discovered that AWS RDS MySQL clusters routinely violate Serializability at the SERIALIZABLE isolation level.
