- Support for PreVote optimisation. Nodes understand and are able to respond to PreVote messages, but will not become pre-vote candidates themselves. (#7419, #7445)
### Fixed
- CheckQuorum now requires a quorum in every configuration (#7375)
doc/architecture/consensus/index.rst: 73 additions & 0 deletions
@@ -25,6 +25,7 @@ Supported extensions include:
- "CheckQuorum": the primary node automatically steps down, in the same view, if it does not hear back (via ``AppendEntriesResponse`` messages) from a majority of backups within a ``consensus.election_timeout`` period. This prevents an isolated primary node from still processing client write requests without being able to commit them.
- "NoTimeoutRetirement": a primary node that completes its retirement sends a ``ProposeRequestVote`` message to the most up-to-date node in the new configuration, causing that node to run for election without waiting for its election timeout.
- "PreVote": followers must first request a pre-vote before starting a new election. This prevents followers from starting elections (and increasing the term) when they are isolated from the rest of the network.
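
As a rough illustration of the CheckQuorum rule, including the requirement that a quorum exists in every configuration, the following Python sketch decides whether a primary should step down. The function names, the representation of a configuration as a set of node IDs, and the choice to count the primary itself toward the majority are assumptions made for illustration; none of them are taken from the CCF implementation.

```python
def has_quorum(config, responders, self_id):
    """Return True if the primary plus recent responders form a strict majority of config."""
    # Only members of this configuration count toward its quorum (assumption).
    heard = (set(responders) | {self_id}) & set(config)
    return len(heard) > len(config) // 2


def check_quorum(active_configs, responders, self_id):
    """Return True only if a quorum is reached in every active configuration."""
    return all(has_quorum(c, responders, self_id) for c in active_configs)


# Hypothetical reconfiguration from three nodes to five; n0 is the primary.
old = {"n0", "n1", "n2"}
new = {"n0", "n1", "n2", "n3", "n4"}

# Only n1 responded within the election timeout: quorum in old, not in new.
print(check_quorum([old, new], {"n1"}, "n0"))        # False
# n1 and n2 responded: quorum in both configurations.
print(check_quorum([old, new], {"n1", "n2"}, "n0"))  # True
```

In this sketch, a primary for which ``check_quorum`` returns ``False`` would step down, since it cannot commit in at least one active configuration.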

Replica State Machine
---------------------
@@ -206,3 +207,75 @@ Until the very last phase (``RetiredCommitted``) is reached, a retiring leader w
Note that because the rollback triggered when a node becomes aware of a new term never preserves unsigned transactions,
and because RCI is always the first signature after RI, RI and RCI are always both rolled back if RCI itself is rolled back.

PreVote Extensions
~~~~~~~~~~~~~~~~~~

If a node's ``RequestVote`` requests are able to reach the cluster, but it is unable to hear the ``AppendEntries`` messages from the current leader (for example, due to network partitioning), it may start new elections, incrementing its term, which deposes the leader and disrupts the cluster.

To mitigate this, the PreVote extension requires that a follower first become a ``PreVoteCandidate`` and receive a quorum of speculative pre-votes, proving that it could be elected under the standard Raft election conditions, before becoming a ``Candidate`` and potentially disrupting the cluster.

More specifically, when a follower's election timeout elapses, it becomes a ``PreVoteCandidate`` for the current term and sends out ``RequestVote`` messages with ``election_type`` set to ``ElectionType::PreVote``.
If the ``PreVoteCandidate`` hears from the current leader, or a new leader, it reverts to being a ``Follower``.
Nodes that receive this pre-vote request respond positively if they would have voted for the ``PreVoteCandidate`` in a regular election (i.e. if the ``PreVoteCandidate``'s ledger is at least as up to date as the receiver's ledger).
If the ``PreVoteCandidate`` receives a quorum of positive pre-vote responses, it becomes a ``Candidate``, increments its term, and sends ``RequestVote`` messages with ``election_type`` set to ``ElectionType::RegularVote``; the election then proceeds as normal.
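
The transitions described above can be sketched as a small state machine. This is a minimal Python model written for illustration, not the CCF implementation: the ``Node`` class, its method names, the tuple returned to stand in for a ``RequestVote`` message, and the assumption that a node counts its own pre-vote toward the quorum are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Role(Enum):
    FOLLOWER = auto()
    PRE_VOTE_CANDIDATE = auto()
    CANDIDATE = auto()


class ElectionType(Enum):
    PRE_VOTE = auto()
    REGULAR_VOTE = auto()


def at_least_as_up_to_date(candidate_last, receiver_last):
    """Compare (term, index) of the last ledger entry, term first."""
    return candidate_last >= receiver_last


@dataclass
class Node:
    node_id: str
    peers: frozenset  # IDs of the other nodes in the configuration
    term: int = 0
    role: Role = Role.FOLLOWER
    votes: set = field(default_factory=set)

    def quorum(self):
        # Strict majority of the whole cluster (peers plus this node).
        return (len(self.peers) + 1) // 2 + 1

    def on_election_timeout(self):
        """Start a pre-vote round; the term is NOT incremented here."""
        self.role = Role.PRE_VOTE_CANDIDATE
        self.votes = {self.node_id}  # assumption: the node pre-votes for itself
        return (ElectionType.PRE_VOTE, self.term)

    def on_leader_message(self):
        """Hearing from a current or new leader ends the pre-vote round."""
        if self.role is Role.PRE_VOTE_CANDIDATE:
            self.role = Role.FOLLOWER

    def on_pre_vote_granted(self, voter):
        """Record a positive pre-vote; start a real election once a quorum is reached."""
        if self.role is not Role.PRE_VOTE_CANDIDATE:
            return None
        self.votes.add(voter)
        if len(self.votes) < self.quorum():
            return None
        self.role = Role.CANDIDATE
        self.term += 1  # only now does the term change
        self.votes = {self.node_id}
        return (ElectionType.REGULAR_VOTE, self.term)


node = Node("a", frozenset({"b", "c"}))
print(node.on_election_timeout())      # pre-vote request for term 0
print(node.on_pre_vote_granted("b"))   # quorum of 2 reached: regular vote for term 1
```

The point of the sketch is that the term is only incremented after a quorum of pre-votes, so a node that cannot reach a quorum never raises its term and never deposes a healthy leader.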
Suppose we have a cluster of nodes that do not currently support PreVote. Such nodes will respond incorrectly to PreVote requests, so PreVote cannot be enabled on the cluster directly.

To enable PreVote safely, we must first migrate the cluster to support PreVote messages, and only then enable PreVote.
During the migration to enable PreVote, pre-vote candidates will be less likely to be elected leader, as other followers may preempt a pre-vote candidate by becoming candidates themselves.
doc/architecture/raft_tla.rst: 3 additions & 0 deletions
@@ -44,4 +44,7 @@ It is possible to produce fresh traces quickly from the driver by running the ``
Calling the trace validation on, for example, the ``append`` scenario can then be done with ``./tlc.py --driver-trace ../build/append.ndjson consensus/Traceccfraft.tla``.

Generating a trace of a scenario and validating it in one go can be done with ``./tlc.py --workers 1 tv --scenario ../tests/raft_scenarios/append consensus/Traceccfraft.tla``.
This runs the raft_driver on the scenario, cleans the trace and then validates it against the TLA+ specification.

CCF also provides a command line trace visualizer to aid debugging. For example, the ``append`` scenario can be visualized with ``python ../tests/trace_viz.py ../build/append.ndjson``.