Jepsen

Distributed Systems Safety Analysis

Jepsen is an effort to improve the safety of distributed databases, queues, consensus systems, etc. It encompasses a software library for systems testing, as well as blog posts and conference talks exploring particular systems’ failure modes. In each post we explore whether the system lives up to its documentation’s claims, file new bugs, and offer recommendations for operators.

Jepsen pushes vendors to make accurate claims and test their software rigorously, helps users choose databases and queues that fit their needs, and teaches everyone how to evaluate distributed systems correctness.

Jepsen started as a nights-and-weekends project. From February to November of 2015 I continued the analyses full time at Stripe. I’m now continuing Jepsen research as an independent contractor.

Analyses

Aerospike 3.5.4
Cassandra 2.0.0
Chronos 2.4.0
Crate 0.54.9
Elasticsearch 1.1.0, 1.5.0
etcd 0.4.1
Kafka 0.8 beta
MariaDB Galera 10.0
MongoDB 2.4.3, 2.6.7
NuoDB 1.2
Percona XtraDB Cluster 5.6.25
RabbitMQ 3.3.0
Redis 2.6.13, experimental WAIT
RethinkDB 2.1.5, 2.2.3
Riak 1.2.1
VoltDB 6.3
Zookeeper 3.4.5

Techniques

Jepsen occupies a particular niche of the correctness testing landscape. We emphasize:

Request a test

Would you like to see a system analyzed? Send an email to aphyr@jepsen.io and we can work out a timeline and contract for the system of your choice. I’m available for hourly consulting or for a full analysis and writeup, which takes a couple of months.