Auto-Heal Behavior in RabbitMQ 4.x (3-Node Cluster) #14932
RabbitMQ version used: 4.0.x
How is RabbitMQ deployed? Kubernetes Operator(s) from Team RabbitMQ
Steps to reproduce the behavior in question:
Hi, we're trying to better understand the auto-heal functionality in RabbitMQ. Here's the scenario:
This is the part we don't understand:
Any insight into the auto-heal partition resolution logic would be greatly appreciated. Thanks in advance!
Hi, a few things to bring up:
Because that's how the autoheal partition healing strategy was designed to work: it restarts all nodes except for the "winning" one. It's a very straightforward way to address Mnesia's highly opinionated approach to recovery from partitions.
Like I said, with classic mirrored queues removed in 4.0 and Khepri becoming the only metadata store in 4.3, partition handling strategies will fairly soon be gone from RabbitMQ: Khepri, quorum queues, and the stream coordinator all recover the way the Raft consensus algorithm requires the recovery process to work.
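To make the "winning" partition idea more concrete, here is a minimal Python sketch of how autoheal could pick a winner and decide which nodes to restart. It is a toy model, not RabbitMQ's implementation: the `Partition` type and both function names are hypothetical, and the ordering rule (most client connections first, then most nodes) is my reading of the RabbitMQ partitions guide. Any remaining tie is resolved in an unspecified way by RabbitMQ; the sketch simply keeps the first candidate.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Partition:
    """Hypothetical model of one side of a network partition."""
    nodes: Tuple[str, ...]
    client_connections: int


def pick_winner(partitions: List[Partition]) -> Partition:
    # Assumed rule: most client connections wins, then most nodes.
    # Further ties are unspecified in RabbitMQ; max() keeps the first one here.
    return max(partitions, key=lambda p: (p.client_connections, len(p.nodes)))


def nodes_to_restart(partitions: List[Partition]) -> List[str]:
    # Every node outside the winning partition gets restarted.
    winner = pick_winner(partitions)
    return sorted(n for p in partitions if p is not winner for n in p.nodes)


# Example: a 3-node cluster split 2/1, with most clients on the single node.
split = [
    Partition(nodes=("rabbit@node-0", "rabbit@node-1"), client_connections=3),
    Partition(nodes=("rabbit@node-2",), client_connections=10),
]
print(nodes_to_restart(split))
```

Under these assumptions the single node with more client connections wins, so the two-node side is the one that restarts. That may explain why more nodes restart than one might expect in a 3-node cluster.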
@uk1988
pause_minority works differently. Team RabbitMQ recommends it over autoheal for most users.
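For comparison, a rough Python sketch of the decision a node makes under pause_minority (selected with `cluster_partition_handling = pause_minority` in rabbitmq.conf). The function name and parameters are hypothetical, and the rule encoded here, that a node pauses when it can reach half or fewer of the cluster's nodes including itself, is my understanding of the documented behavior rather than RabbitMQ's code.

```python
def should_pause(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if a node should pause itself under pause_minority.

    reachable_nodes counts the node itself plus every peer it can still see.
    Assumed rule: pause when the node's side holds half or fewer of the nodes.
    """
    return 2 * reachable_nodes <= cluster_size


# 3-node cluster split 2/1: the isolated node pauses, the pair keeps running.
print(should_pause(reachable_nodes=1, cluster_size=3))
print(should_pause(reachable_nodes=2, cluster_size=3))
```

Unlike autoheal, nothing is restarted after the fact in this model: the minority side stops serving clients while the partition lasts, regardless of how many clients are connected to it.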