JEPSEN

Distributed Systems Safety Research

About Jepsen

Jepsen aims to improve the safety of distributed databases, queues, consensus systems, etc. We maintain an open source library for safety testing, and publish free, in-depth analyses of specific systems. In each analysis we explore whether the system lives up to its documentation’s claims, file new bugs, and make recommendations for operators. In addition to paid analysis, Jepsen offers technical talks, training classes, and consulting services.

Jepsen pushes vendors to make accurate claims and test their software rigorously, helps users choose databases and queues that fit their needs, and teaches engineers how to evaluate distributed systems correctness for themselves.

News

Recent research, analyses, and announcements.

NATS 2.12.1

2025-12-07

NATS is a popular distributed streaming system. Jepsen tested NATS 2.12.1, focusing on its durable JetStream subsystem, and found that it could lose data or get stuck in persistent split-brain in response to file corruption or simulated node failures. This data loss was caused in part by a default fsync policy which flushed data to disk once every two minutes, rather than before acknowledgement. Even a single kernel crash or power failure, combined with process pauses or network partitions, could cause NATS replicas to lose acknowledged messages. NATS has documented its default lazy fsync setting, and is considering the other issues we found.
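To illustrate the difference between flushing on a timer and flushing before acknowledgement, here is a minimal Python sketch of the latter. It is not NATS code: `append_durably` is a hypothetical helper, and the sketch assumes a single local append-only file. A write is only reported as successful after `os.fsync` returns, so an acknowledged record should survive a kernel crash or power failure.

```python
import os


def append_durably(path: str, record: bytes) -> int:
    """Append a record and force it to stable storage before returning.

    Hypothetical helper for illustration only. The caller should send its
    acknowledgement only after this function returns.
    """
    with open(path, "ab") as f:
        f.write(record)
        f.flush()             # move data from Python's buffer into the OS page cache
        os.fsync(f.fileno())  # ask the OS to persist the page cache to disk
    return len(record)
```

A periodic policy would skip the `os.fsync` call here and run it from a timer instead, which leaves every record written since the last flush exposed to loss. The sketch also omits the directory fsync that a newly created file needs on some filesystems.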

Jepsen 0.3.10

2025-12-01

A new Jepsen release, 0.3.10, is now available on GitHub and Clojars. This release focuses on controllable entropy and on support for running Jepsen inside Antithesis, a deterministic simulation testing environment. A new supporting library, jepsen.generator, provides the current generator system along with jepsen.random: a new namespace for pluggable random value generation. Jepsen uses these RNGs throughout, which makes it possible to run a test with a deterministic seed, or to source entropy from an external system, like Antithesis. The jepsen.antithesis library provides additional support for assertions, randomness, and lifecycle operations, plus wrappers for clients and checkers.
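The idea of a pluggable, seedable random source can be sketched in a few lines of Python. This is not the jepsen.random API (Jepsen is written in Clojure). The function `random_ops` and the shape of its operations are assumptions made for illustration. The point is that all randomness flows through one injected generator, so the same seed reproduces the same operation sequence.

```python
import random
from typing import List, Optional, TypedDict


class Op(TypedDict):
    f: str
    value: Optional[int]


def random_ops(rng: random.Random, n: int) -> List[Op]:
    """Generate n random read/write operations using only the injected RNG.

    Hypothetical generator for illustration; no global random state is used.
    """
    ops: List[Op] = []
    for _ in range(n):
        f = rng.choice(["read", "write"])
        value = rng.randrange(100) if f == "write" else None
        ops.append({"f": f, "value": value})
    return ops


# Two generators built from the same seed yield identical histories.
first = random_ops(random.Random(42), 10)
second = random_ops(random.Random(42), 10)
```

Because `random_ops` takes the generator as an argument, a caller could pass in any object with compatible `choice` and `randrange` methods, for example one backed by an external entropy source.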

Also, this release introduces a new kind of visualization: op color plots, which show operations over time with different user-defined colors. This is particularly helpful for getting a feeling for “when did we lose data?” or “did only read-only queries succeed during a partition?”

Jepsen and Antithesis wrote A Distributed Systems Reliability Glossary: a free reference for engineers who build, test, and operate distributed systems. It covers basic concurrency theory, consistency models, various faults, approaches to testing, and offers some links to further reading.

The latest Jepsen talk, “Jepsen 18: Serializable Mom”, is now available on YouTube. This talk was presented on June 20, 2025, at Systems Distributed in Amsterdam. It covers Bufstream 0.1.0, Amazon RDS for PostgreSQL 17.4, and TigerBeetle 0.16.1.

Capela dda5892

2025-08-06

Jepsen and Capela, Inc worked together to test early builds of Capela, an unreleased distributed programming environment. Our analysis found twenty-two issues, including four problems in Capela’s programming language semantics, fourteen crashes or non-fatal panics, severe performance degradation after roughly a minute of operation, and three safety issues: partitions ignoring their initial values, sporadically vanishing, and losing committed writes. Capela has fixed two of the language issues—the others remain under investigation.
