Distributed Systems Safety Analysis
Jepsen is an effort to improve the safety of distributed databases, queues, consensus systems, etc. It encompasses a software library for systems testing, as well as blog posts and conference talks exploring particular systems’ failure modes. In each post we explore whether the system lives up to its documentation’s claims, file new bugs, and make recommendations for operators.
Jepsen pushes vendors to make accurate claims and test their software rigorously, helps users choose databases and queues that fit their needs, and teaches everyone how to evaluate distributed systems correctness.
Jepsen started as a project in my nights and weekends. From February to November of 2015 I continued analyses full time at Stripe. I’m now continuing Jepsen research as an independent contractor.
|System|Version|
|Percona XtraDB Cluster|5.6.25|
|Redis|2.6.13, experimental WAIT|
Jepsen occupies a particular niche of the correctness testing landscape. We emphasize:
Black-box systems testing: we evaluate real binaries running on real clusters. This allows us to test systems without access to their source, and without requiring deep packet inspection, formal annotations, etc. Bugs reproduced in Jepsen are observable in production, not theoretical. However, we sacrifice some of the strengths of formal methods: tests are nondeterministic, and we cannot prove correctness, only find errors.
Testing under distributed systems failure modes: faulty networks, unsynchronized clocks, and partial failure. Many test suites only evaluate the behavior of healthy clusters, but production systems experience pathological failure modes. Jepsen shows behavior under strain.
Generative testing: we construct random operations, apply them to the system, and record a concurrent history of their results. That history is checked against a model to establish its correctness. Generative (or property-based) tests often reveal edge cases with subtle combinations of inputs.
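To make the generative approach concrete, here is a minimal sketch in Python. It is not Jepsen's own code or API: the names `Op`, `Register`, `run_random_test`, and `linearizable` are hypothetical, the "system" is an in-memory register rather than a real cluster, and the checker is a brute-force search that only suits very short histories. It generates random reads and writes with overlapping time windows, records a history, and checks that history against a single-register model.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Op:
    """One completed operation in a history (hypothetical shape)."""
    start: float           # time the client invoked the operation
    end: float             # time the client received the result
    kind: str              # "write" or "read"
    value: Optional[int]   # value written, or value the read returned


class Register:
    """Stand-in for the system under test: a correct in-memory register."""
    def __init__(self) -> None:
        self.value: Optional[int] = None

    def write(self, value: int) -> None:
        self.value = value

    def read(self) -> Optional[int]:
        return self.value


def run_random_test(system, n_ops: int = 8, seed: int = 0) -> list:
    """Apply random operations to `system` and record a concurrent history.

    Operation i takes effect at time 10*i + 5, but its invocation and
    completion are spread around that point, so neighbors overlap in time.
    """
    rng = random.Random(seed)
    history = []
    for i in range(n_ops):
        effect = 10.0 * i + 5.0
        start = effect - rng.uniform(1, 12)
        end = effect + rng.uniform(1, 12)
        if rng.random() < 0.5:
            value = rng.randint(0, 4)
            system.write(value)
            history.append(Op(start, end, "write", value))
        else:
            history.append(Op(start, end, "read", system.read()))
    return history


def linearizable(history: Sequence[Op], initial: Optional[int] = None) -> bool:
    """Check a history against a single-register model by brute force.

    Returns True if some total order of the operations respects real time
    (an op that finished before another began must come first) and every
    read returns the most recently written value. Exponential in the worst
    case, so this is only for illustration.
    """
    def search(remaining: tuple, state: Optional[int]) -> bool:
        if not remaining:
            return True
        earliest_end = min(op.end for op in remaining)
        for i, op in enumerate(remaining):
            if op.start > earliest_end:
                continue  # another pending op finished before this one began
            if op.kind == "write":
                next_state = op.value
            elif op.value == state:
                next_state = state
            else:
                continue  # this read is inconsistent with the model here
            if search(remaining[:i] + remaining[i + 1:], next_state):
                return True
        return False

    return search(tuple(history), initial)
```

With these definitions, `linearizable(run_random_test(Register(), seed=1))` holds by construction, since every operation takes effect inside its own time window. A hand-written history containing a stale read, such as `[Op(0, 1, "write", 1), Op(2, 3, "write", 2), Op(4, 5, "read", 1)]`, is rejected, because no ordering that respects real time lets the read observe 1 after the write of 2 has completed.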
Request a test
Would you like to see a system analyzed? Send an email to firstname.lastname@example.org and we can work out a timeline and contract for the system of your choice. I’m available for hourly consulting or for a full analysis and writeup, which takes a couple of months.