Recovering a JetStream Cluster After Quorum Loss
Losing quorum in a JetStream cluster is one of those situations that feels worse than it is. Meta operations stop, health checks fail, and logs fill with warnings. But with a clear understanding of what’s happening, recovery is straightforward.
This expands on a post I wrote for the NATS blog in 2024.
Important: behavior changed in nats-server 2.12. The scale-up recovery procedure below was the sanctioned path on 2.10 and 2.11. As of 2.12.0 (PR #7038), adding a fresh empty node no longer restores quorum: the new node can no longer force itself into the peer set, so the meta election never passes. The previous behavior was Raft-unsafe (an empty node could vote for a leader without the full data set), and 2.12 closed the gap. If you are running 2.12 or later, see Recovery on 2.12 and Later before following the rest of this guide.
Recognizing Quorum Loss
A JetStream cluster requires a majority of nodes to agree before making changes.[1] In a 3-node cluster, you need 2. In a 5-node cluster, you need 3. When that majority can’t be reached, the cluster can’t elect a meta leader – meaning no stream or consumer creation, deletion, or modification. Existing streams may continue operating if they have independent quorum.[2]
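The majority rule is plain integer arithmetic. As a minimal sketch (the function names quorum_size and has_quorum are hypothetical helpers, not part of the NATS CLI), it can be written in POSIX shell:

```shell
# quorum_size N: votes needed for a majority in a Raft group of N peers.
# Shell integer division rounds down, so this is floor(N/2) + 1.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# has_quorum EXPECTED RESPONDING: succeeds (exit 0) when the number of
# responding peers reaches the majority of the expected peer set.
has_quorum() {
  [ "$2" -ge "$(quorum_size "$1")" ]
}

# Usage:
#   quorum_size 3      # prints 2
#   quorum_size 5      # prints 3
#   has_quorum 6 3 || echo "no quorum"
```

The last usage line mirrors the case where a cluster expects 6 peers but only 3 respond: the majority is 4, so the check fails.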
You’ll see messages like these in the logs:
[WRN] Healthcheck failed: "JetStream has not established contact with a meta leader"
[INF] JetStream cluster no metadata leader
In Kubernetes, you may see pods failing readiness probes:
Readiness probe failed: HTTP probe failed with statuscode: 503
The NATS CLI will confirm the problem:
$ nats server report jetstream
WARNING: No cluster meta leader found. The cluster expects 6 nodes but only 3 responded.
JetStream operations require at least 4 up nodes.
Common Causes
The most common cause I’ve seen: renaming cluster nodes in bulk rather than one at a time.
When you rename nodes simultaneously, the cluster ends up expecting both the old and new names. If you had 3 nodes and renamed all of them, the cluster now expects 6 nodes but only 3 are responding. You’ve lost quorum not because nodes are down, but because the cluster is waiting for nodes that no longer exist.
Other causes include:
- Network partitions isolating a minority of nodes
- Storage failures preventing nodes from participating
- Misconfigured cluster membership
Recovery on 2.12 and Later
On 2.12+, the safe recovery path is to bring downed nodes back under their original names. The cluster recognizes peers by name; restoring the original names is what re-establishes quorum. The behavior depends on whether the returning peers have their data:
- If the missing peers come back with their data intact (for example, after reverting a serverNamePrefix change while the original PVCs are still attached), normal Raft quorum applies. The meta election passes once a majority of originally-named peers are online with their data (4 of 6 in a 6-node cluster).
- If the missing peers come back with empty disks, PR #7038 requires that all originally-named peers be available at the same time. An empty-disk peer votes with an “empty vote” that only counts when every server in the original set also votes. If even one original peer cannot come back, this path is blocked.
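The two cases above can be condensed into a small decision helper. This is only a simplified model of the prose, not the server's actual election code; the function name can_elect_meta_leader and its three arguments are hypothetical.

```shell
# can_elect_meta_leader TOTAL WITH_DATA EMPTY
#   TOTAL      peers in the original peer set
#   WITH_DATA  originally-named peers online with their data intact
#   EMPTY      originally-named peers online with empty disks
# Succeeds when, under the rules described above, a meta election could pass.
can_elect_meta_leader() {
  total=$1
  with_data=$2
  empty=$3
  quorum=$(( total / 2 + 1 ))

  # Case 1: normal Raft quorum from peers that still hold their data.
  if [ "$with_data" -ge "$quorum" ]; then
    return 0
  fi

  # Case 2: empty votes count only when every original peer is voting.
  if [ $(( with_data + empty )) -eq "$total" ]; then
    return 0
  fi

  return 1
}

# Usage:
#   can_elect_meta_leader 6 4 0 && echo "majority with data"
#   can_elect_meta_leader 6 3 3 && echo "all originals present"
#   can_elect_meta_leader 6 3 2 || echo "blocked: one original missing"
```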
For the example above (6-node cluster, 3 nodes down), restart the original pods or rebuild them with the same names:
# Restart the originally-named pods. Existing PVCs reattach.
kubectl rollout restart statefulset/nats
# Or if the StatefulSet has been scaled down, scale it back to the original size
# so the original ordinals (nats-3, nats-4, nats-5) are recreated.
kubectl scale --replicas=6 statefulset/nats
# Verify quorum returns
nats server report jetstream
Once a leader is elected, clean up any stale peer entries left over from the failure (see Clean Up Stale Peers below).
Clean Up Stale Peers
After quorum is restored, the meta group may still list peer entries left over from the failure. Remove them with peer-remove, using the peer ID from nats server report jetstream:
nats server cluster peer-remove -f <name_or_peer_id>
The same-name requirement is why a bulk rename followed by a revert can leave a 2.12+ cluster bricked: the new names are no longer in the original peer set, and reverting does not put them back. Bringing the originals online is the only in-product recovery today.
If original names cannot be reclaimed (machines truly destroyed, hostnames unrecoverable) and the cluster has lost quorum on 2.12+, there is no in-product unbricking path at the time of writing. Engage Synadia support. A break-glass SYS API endpoint is in design that will let an operator explicitly relax the empty-node safety check for a one-shot recovery; it is not yet shipped.
Recovery on 2.10 and 2.11
The procedure below is the path documented in the 2024 NATS blog post. It relies on the pre-2.12 behavior where a fresh empty node could force itself into the peer set; that path is closed on 2.12 and later, as noted in the callout at the top of this page.
1. Regain Quorum First
You can’t remove stale peers without a leader, and you can’t elect a leader without quorum.[3] The first step is getting enough nodes online to reach majority.
If the cluster expects N nodes but only M are responding, and M is below the quorum of floor(N/2) + 1, you need to add nodes.
Examples below use Kubernetes; adapt the scaling commands for your deployment:
# Example: cluster expects 6 nodes, 3 responding - need 4 for quorum
kubectl scale --replicas=4 statefulset/nats
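The replica count in that command follows from the gap between the quorum and the number of responding nodes. A minimal sketch of the arithmetic, using a hypothetical helper named nodes_to_add:

```shell
# nodes_to_add EXPECTED RESPONDING: how many extra nodes are needed so that
# the responding count reaches floor(EXPECTED/2) + 1. Prints 0 when the
# cluster already has quorum.
nodes_to_add() {
  expected=$1
  responding=$2
  needed=$(( expected / 2 + 1 - responding ))
  if [ "$needed" -lt 0 ]; then
    needed=0
  fi
  echo "$needed"
}

# Usage:
#   nodes_to_add 6 3    # prints 1
```

With 6 expected and 3 responding, one extra node is needed, which is why the StatefulSet goes from 3 to 4 replicas.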
The new pod joins under its own name (e.g., nats-3). On 2.10/2.11 it forces itself into the peer set on first contact, which raises the number of voting peers enough for an election to pass even though the new node is empty. After the leader is elected, you’ll remove the stale entries in the next step. The new node does not need to match any stale name – it just provides the vote count needed to unblock the meta group.
Wait for the new node to join and a leader to be elected.
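Rather than watching by hand, the wait can be scripted as a retry loop. The sketch below is a generic helper (wait_for is a hypothetical name); the usage line assumes the server's HTTP monitoring port is reachable at localhost:8222, for example through a port-forward, and relies on the /healthz endpoint that returns the 503 shown earlier while no meta leader exists.

```shell
# wait_for TRIES DELAY CMD...: rerun CMD until it succeeds or TRIES attempts
# have been used. DELAY is the pause in seconds between attempts.
wait_for() {
  tries=$1
  delay=$2
  shift 2
  while [ "$tries" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    tries=$(( tries - 1 ))
    sleep "$delay"
  done
  return 1
}

# Usage (the address is an assumption; adjust to your deployment):
#   wait_for 60 5 curl -fsS http://localhost:8222/healthz \
#     && echo "meta leader elected" \
#     || echo "still no leader after 5 minutes"
```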
2. Remove Stale Peers
Once a leader exists, you can remove the old peer entries.[4] Peer IDs are shown in nats server report jetstream.
Peer removal signals to JetStream that a node will never return.
The peer-remove command accepts either server name or peer ID:
# Using the CLI (preferred)
nats server cluster peer-remove -f <name_or_peer_id>
# Or using the JetStream API directly (internal, subject to change).
# The "peer" field is the server name; "peer_id" wins when both are set.
# Use peer_id alone for unambiguous removal:
nats request '$JS.API.SERVER.REMOVE' '{"peer_id":"<peer_id>"}'
You’ll get confirmation:
{
  "type": "io.nats.jetstream.api.v1.meta_server_remove_response",
  "success": true
}
3. Clean Up
After removing stale peers, scale back to your desired replica count and remove the temporary node:
kubectl scale --replicas=3 statefulset/nats
nats server cluster peer-remove -f <temporary_peer_id>
Kubernetes scales down a StatefulSet by terminating the highest-ordinal pod. Verify with kubectl get pods that the temporary node is the one being terminated before running peer-remove on its ID – if pods were created or deleted out of order earlier, the highest ordinal may not be your temporary node.
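To double-check which pod a scale-down will terminate, the highest ordinal can be picked out of the pod list. This is a small sketch; highest_ordinal_pod is a hypothetical helper that reads pod names on standard input and assumes they end in -<ordinal>, as StatefulSet pods do.

```shell
# highest_ordinal_pod: read pod names (one per line) and print the one whose
# trailing ordinal is largest. Lines without a numeric suffix are ignored.
highest_ordinal_pod() {
  awk -F- '
    $NF ~ /^[0-9]+$/ && ($NF + 0 >= max) { max = $NF + 0; name = $0 }
    END { if (name != "") print name }
  '
}

# Usage:
#   kubectl get pods -o name | highest_ordinal_pod
```

The comparison is numeric, so nats-10 ranks above nats-3, which a plain text sort would get wrong.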
4. Check Stream and Consumer Groups
Meta group recovery gets the cluster operational, but individual stream and consumer Raft groups may also have stale peers. Check each stream:
nats stream info <stream_name> --json | jq '.cluster'
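To run that check across many streams, the JSON can be filtered for replicas the server reports as offline. The sketch below assumes the stream info output carries a cluster.replicas array whose entries have name and offline fields; offline_replicas is a hypothetical helper, and the --names flag in the usage comment is an assumption to verify against your CLI version.

```shell
# offline_replicas: read `nats stream info <stream> --json` output on stdin
# and print the name of every replica flagged offline. Prints nothing when
# the stream has no replicas listed or none are offline.
offline_replicas() {
  jq -r '.cluster.replicas[]? | select(.offline == true) | .name'
}

# Usage:
#   for s in $(nats stream ls --names); do
#     nats stream info "$s" --json | offline_replicas | sed "s/^/$s: /"
#   done
```

An offline replica is only a candidate for removal; confirm it is a stale name rather than a node that is temporarily down before removing it.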
If stale peers appear in stream groups, remove them:
nats stream cluster peer-remove -f <stream_name> <name_or_peer_id>
Durable consumers on the removed peer are reassigned to the surviving stream peers. Ephemeral consumers whose only replica lived on the removed peer are deleted, not migrated – if ephemeral consumers matter for your workload, recreate them after peer removal.
Note: removing a peer from a stream group triggers automatic replica rebalancing. For R1 streams where the only replica lived on the removed peer, the stream assignment is dropped and the data is not recoverable from the cluster – there is no replica behind it. Confirm R-factor with nats stream info before running peer-remove against a stream group.
Prevention
The key insight: Raft-based systems track membership by node identity. Changing identities in bulk looks like a mass failure followed by new nodes joining, which confuses the membership tracking.
When renaming nodes or making identity changes, drive them through lame duck mode one node at a time:
- Take one node into lame duck before restarting under a new name
- Wait for the cluster to stabilize between changes
- Verify nats server report jetstream shows healthy state before proceeding
This patience prevents the quorum loss scenario entirely – and on 2.12+, where reverting a bulk rename no longer unbricks the cluster on its own, it is the only reliable way to avoid the failure mode.
[1] JetStream Clustering - A quorum is half the cluster size plus one, the minimum number of nodes needed to ensure data consistency after failure.
[2] Each stream creates its own Raft group independent of the meta group. A stream with R3 can continue serving reads and writes even while the meta group has no leader, as long as its own three nodes have quorum. See JetStream Clustering.
[3] Disaster Recovery - NATS will create replacement stream replicas automatically once quorum is restored and stale nodes are removed.
[4] Recovering NATS JetStream Quorum - Official NATS blog post with detailed recovery steps. The peer-remove command removes a node from the cluster’s Raft meta group and triggers automatic replica rebalancing.