NATS is a popular distributed streaming system. Jepsen tested NATS 2.12.1, focusing on its durable JetStream subsystem, and found that it could lose data or get stuck in persistent split-brain in response to file corruption or simulated node failures. This data loss was caused in part by a default fsync policy which flushed data to disk once every two minutes, rather than before acknowledgement. Even a single kernel crash or power failure, combined with process pauses or network partitions, could cause NATS replicas to lose acknowledged messages. NATS has documented its default lazy fsync setting, and is considering the other issues we found.
NATS 2.12.1
2025-12-07
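
To make the fsync issue above concrete, here is a minimal Python sketch of a toy append-only log. It is not NATS code. The `AppendLog` class, its `sync_interval` parameter, and the `at_risk` counter are hypothetical names invented for illustration. The sketch contrasts flushing to disk before every acknowledgement with flushing at most once per interval, and counts how many acknowledged records a kernel crash or power failure could still erase.

```python
import os
import tempfile
import time


class AppendLog:
    """Toy append-only log for illustration only; not how NATS is implemented."""

    def __init__(self, path, sync_interval=None):
        # sync_interval=None: fsync before every acknowledgement.
        # sync_interval=N:    fsync at most once every N seconds (lazy policy).
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.sync_interval = sync_interval
        self.last_sync = time.monotonic()
        self.acked = 0    # records acknowledged to the client
        self.durable = 0  # records covered by a completed fsync

    def append(self, payload: bytes) -> bool:
        # The write lands in the OS page cache; it is not yet on stable storage.
        os.write(self.fd, payload + b"\n")
        now = time.monotonic()
        due = (
            self.sync_interval is None
            or now - self.last_sync >= self.sync_interval
        )
        if due:
            os.fsync(self.fd)  # force buffered data to the storage device
            self.last_sync = now
            self.durable = self.acked + 1
        self.acked += 1
        return True  # acknowledgement returned to the client

    def at_risk(self) -> int:
        # Acknowledged records that a power failure could still lose.
        return self.acked - self.durable

    def close(self) -> None:
        os.close(self.fd)


def demo():
    with tempfile.TemporaryDirectory() as d:
        strict = AppendLog(os.path.join(d, "strict.log"))
        lazy = AppendLog(os.path.join(d, "lazy.log"), sync_interval=120.0)
        for _ in range(5):
            strict.append(b"msg")
            lazy.append(b"msg")
        result = (strict.at_risk(), lazy.at_risk())
        strict.close()
        lazy.close()
        return result


if __name__ == "__main__":
    print(demo())
```

Under the strict policy no acknowledged record is ever at risk. Under the lazy policy every record acknowledged since the last flush is exposed, which with a two-minute interval can be a large window. The trade-off is throughput: an fsync per write is much slower than batching flushes.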