Commit 5120ce9

[release/6.x] Backport pre-vote capability (#7374, #7375, #7404, #7409, #7438, #7445, #7458) (#7436)
1 parent 4fda3f5 commit 5120ce9

29 files changed: 1289 additions, 246 deletions

.github/workflows/ci-verification.yml

Lines changed: 9 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -39,7 +39,7 @@ jobs:
3939
- name: Install TLC dependencies
4040
run: |
4141
tdnf install -y jre wget
42-
python3 tla/install_deps.py --skip-apt-packages
42+
python3 tla/install_deps.py
4343
4444
- run: cd tla && ./tlc.py mc consistency/MCSingleNode.tla
4545
- run: cd tla && ./tlc.py mc consistency/MCSingleNodeReads.tla
@@ -68,7 +68,7 @@ jobs:
6868
- name: Install TLC dependencies
6969
run: |
7070
sudo apt update
71-
sudo apt install -y default-jre
71+
sudo apt install -y default-jre wget
7272
python3 install_deps.py
7373
7474
- run: ./tlc_debug.sh --config consistency/MCSingleNodeCommitReachability.cfg mc consistency/MCSingleNodeReads.tla
@@ -88,7 +88,7 @@ jobs:
8888
- name: Install TLC dependencies
8989
run: |
9090
sudo apt update
91-
sudo apt install -y default-jre
91+
sudo apt install -y default-jre wget
9292
python3 install_deps.py
9393
9494
- run: ./tlc.py sim --num 500 --depth 50 consistency/MultiNodeReads.tla
@@ -121,7 +121,7 @@ jobs:
121121
- name: Install TLC dependencies
122122
run: |
123123
tdnf install -y jre wget
124-
python3 tla/install_deps.py --skip-apt-packages
124+
python3 tla/install_deps.py
125125
126126
- run: cd tla && ./tlc.py mc consensus/MCabs.tla
127127
- run: cd tla && ./tlc.py --trace-name 1C2N mc --term-count 2 --request-count 2 --raft-configs 1C2N consensus/MCccfraft.tla
@@ -148,7 +148,7 @@ jobs:
148148
- name: Install TLC dependencies
149149
run: |
150150
sudo apt update
151-
sudo apt install -y default-jre
151+
sudo apt install -y default-jre wget
152152
python3 install_deps.py
153153
154154
- run: ./tlc.py sim consensus/SIMccfraft.tla
@@ -181,22 +181,16 @@ jobs:
181181
with:
182182
fetch-depth: 0
183183

184-
- name: Install TLC dependencies
185-
run: |
186-
tdnf install -y jre wget
187-
python3 tla/install_deps.py --skip-apt-packages
188-
189184
- name: "Install dependencies"
190185
shell: bash
191186
run: |
192187
set -ex
193188
./scripts/setup-ci.sh
194189
195-
# Parallel
196-
wget https://ftp.gnu.org/gnu/parallel/parallel-latest.tar.bz2
197-
tar -xjf parallel-latest.tar.bz2
198-
cd $(ls | grep 'parallel' | grep -v 'tar' | grep -v 'rpm')
199-
./configure && make && make install
190+
- name: Install TLC dependencies
191+
run: |
192+
tdnf install -y jre wget
193+
python3 tla/install_deps.py --tdnf-extended
200194
201195
- name: "Build"
202196
run: |

.github/workflows/long-verification.yml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -37,7 +37,7 @@ jobs:
3737
- name: Install TLC dependencies
3838
run: |
3939
tdnf install -y jre wget
40-
python3 tla/install_deps.py --skip-apt-packages
40+
python3 tla/install_deps.py
4141
4242
- run: cd tla && ./tlc.py --trace-name 2C2N mc --term-count 2 --request-count 0 --raft-configs 2C2N --disable-check-quorum consensus/MCccfraft.tla
4343

@@ -70,7 +70,7 @@ jobs:
7070
- name: Install TLC dependencies
7171
run: |
7272
tdnf install -y jre wget
73-
python3 tla/install_deps.py --skip-apt-packages
73+
python3 tla/install_deps.py
7474
7575
- run: cd tla && ./tlc.py --trace-name 3C2N mc --term-count 2 --request-count 0 --raft-configs 3C2N --disable-check-quorum consensus/MCccfraft.tla
7676

@@ -95,7 +95,7 @@ jobs:
9595
- uses: actions/checkout@v4
9696
- run: |
9797
sudo apt update
98-
sudo apt install -y default-jre
98+
sudo apt install -y default-jre wget
9999
python3 install_deps.py
100100
101101
- run: ./tlc.py sim --max-seconds 3000 --depth 500 consensus/SIMccfraft.tla

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
55
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
66
and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).
77

8+
## [6.0.17]
9+
10+
[6.0.17]: https://github.com/microsoft/CCF/releases/tag/6.0.17
11+
12+
### Added
13+
14+
- Support for PreVote optimisation. Nodes understand and are able to respond to PreVote messages, but will not become pre-vote candidates themselves. (#7419, #7445)
15+
16+
### Fixed
17+
18+
- CheckQuorum now requires a quorum in every configuration (#7375)
19+
820
## [6.0.16]
921

1022
[6.0.16]: https://github.com/microsoft/CCF/releases/tag/6.0.16
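
The CheckQuorum fix listed in this changelog hunk can be illustrated with a minimal sketch. It assumes a primary tracks which nodes it has heard from within the election timeout and that several configurations may be active at once (for example during a reconfiguration). `NodeId`, `Configuration` and `has_quorum_in_every_configuration` are hypothetical names chosen for this example and are not taken from the CCF sources.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical aliases for illustration only.
using NodeId = std::string;
using Configuration = std::set<NodeId>;

// Returns true only if a strict majority of EVERY active configuration
// appears in heard_from. Whether the primary counts itself is left to the
// caller (assumption): include its own id in heard_from if it should.
bool has_quorum_in_every_configuration(
  const std::vector<Configuration>& active_configs,
  const std::set<NodeId>& heard_from)
{
  for (const auto& config : active_configs)
  {
    std::size_t acks = 0;
    for (const auto& node : config)
    {
      if (heard_from.count(node) > 0)
      {
        ++acks;
      }
    }
    // A strict majority is required; an empty configuration never has one.
    if (acks * 2 <= config.size())
    {
      return false;
    }
  }
  return true;
}
```

Under this sketch, a primary that reaches a majority of one configuration but not of another would fail the check and step down, which is the behaviour the changelog entry describes.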

doc/architecture/consensus/index.rst

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@ Supported extensions include:
2525

2626
- "CheckQuorum": the primary node automatically steps down, in the same view, if it does not hear back (via ``AppendEntriesResponse`` messages) from a majority of backups within a ``consensus.election_timeout`` period. This prevents an isolated primary node from still processing client write requests without being able to commit them.
2727
- "NoTimeoutRetirement": a primary node that completes its retirement sends a ProposeRequestVote message to the most up-to-date node in the new configuration, causing that node to run for election without waiting for time out.
28+
- "PreVote": followers must first request a pre-vote before starting a new election. This prevents followers from starting elections (and increasing the term) when they are isolated from the rest of the network.
2829

2930
Replica State Machine
3031
---------------------
@@ -206,3 +207,75 @@ Until the very last phase (``RetiredCommitted``) is reached, a retiring leader w
206207

207208
Note that because the rollback triggered when a node becomes aware of a new term never preserves unsigned transactions,
208209
and because RCI is always the first signature after RI, RI and RCI are always both rolled back if RCI itself is rolled back.
210+
211+
PreVote Extensions
212+
~~~~~~~~~~~~~~~~~~
213+
214+
If a node's ``RequestVote`` requests are able to reach the cluster, but it is unable to hear the ``AppendEntries`` messages from the current leader (for example, due to network partitioning), it may start new elections, incrementing its term, which deposes the leader and disrupts the cluster.
215+
216+
To mitigate this, the PreVote extension requires that a follower first become a ``PreVoteCandidate`` and receive a quorum of speculative pre-votes, proving that it could be elected under the standard Raft election conditions, before becoming a ``Candidate`` and potentially disrupting the cluster.
217+
218+
More specifically, when a follower's election timeout elapses, it becomes a ``PreVoteCandidate`` for the current term and sends out ``RequestVote`` messages with ``election_type`` set to ``ElectionType::PreVote``.
219+
If the ``PreVoteCandidate`` hears from the current leader, or from a new leader, it reverts to being a ``Follower``.
220+
Nodes that receive this pre-vote request respond positively if they would have voted for the ``PreVoteCandidate`` during an election (i.e. if the ``PreVoteCandidate``'s ledger is at least as up to date as the receiver's ledger).
221+
If the ``PreVoteCandidate`` receives a quorum of positive pre-vote responses, it becomes a ``Candidate``, increments its term, and sends ``RequestVote`` messages with ``election_type`` set to ``ElectionType::RegularVote``. The election then proceeds as normal.
222+
223+
.. mermaid::
224+
225+
sequenceDiagram
226+
participant Node 0
227+
participant Node 1
228+
participant Node 2
229+
230+
Note over Node 0: Leader for term 2
231+
232+
Note over Node 1: PreVoteCandidate in term 2
233+
Node 1 ->> Node 2: RequestVote(ElectionType::PreVote, term=2)
234+
235+
Note right of Node 2: No changes to Node 2's state
236+
Node 2 ->> Node 1: RequestVoteResponse(ElectionType::PreVote, term=2, granted=true)
237+
238+
Note over Node 1: Candidate in term 3
239+
Node 1 ->> Node 2: RequestVote(ElectionType::RegularVote, term=3)
240+
241+
Note right of Node 2: Updates term to 3 and votes for Node 1
242+
Node 2 ->> Node 1: RequestVoteResponse(ElectionType::RegularVote, term=3, granted=true)
243+
244+
Note over Node 1: Leader for term 3
245+
246+
The only state update made in response to a pre-vote message is that a node whose term is older than the pre-vote message's term updates its own term to match.
247+
This allows the pre-vote request to inform lagging nodes that a more recent term had a node succeed in its pre-vote, becoming a Candidate or a Leader.
248+
This can be viewed as piggybacking the term information from that previous Candidate or Leader onto the pre-vote request sent to the lagging node.
249+
250+
.. mermaid::
251+
252+
sequenceDiagram
253+
participant Node 0
254+
participant Node 1
255+
participant Node 2
256+
257+
Note over Node 0: Leader for term 2
258+
Note over Node 1: Follower in term 2
259+
Note over Node 2: Lagging Follower in term 1
260+
261+
Note over Node 1: PreVoteCandidate in term 2
262+
Node 1 ->> Node 2: RequestVote(ElectionType::PreVote, term=2)
263+
264+
Note right of Node 2: Updates term to 2
265+
Node 2 ->> Node 1: RequestVoteResponse(ElectionType::PreVote, term=2, granted=true)
266+
267+
Note over Node 1: Candidate in term 3
268+
Node 1 ->> Node 2: RequestVote(ElectionType::RegularVote, term=3)
269+
270+
Note right of Node 2: Updates to term 3 and votes for Node 1
271+
Node 2 ->> Node 1: RequestVoteResponse(ElectionType::RegularVote, term=3, granted=true)
272+
273+
Note over Node 1: Leader for term 3
274+
275+
Migration to PreVote
276+
~~~~~~~~~~~~~~~~~~~~
277+
278+
Supposing we have a cluster of nodes which currently do not support PreVote, we must first migrate the cluster to support PreVote before we can enable it, as the nodes that do not support PreVote will respond incorrectly to PreVote requests.
279+
280+
To enable PreVote safely, first upgrade every node so that it understands PreVote messages, and only then enable PreVote.
281+
During the migration to enable PreVote, the pre-vote candidates will be less likely to be elected leader, as the other followers may preempt the pre-vote candidate and become candidates themselves.
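
The pre-vote flow described in this documentation hunk can be sketched as follows. This is a minimal illustration under stated assumptions, not CCF's implementation: the `Node`, `LedgerTip` and `RequestVote` types and all function names are hypothetical, quorum counting is omitted, and refusing a pre-vote request that carries a stale term is an assumption borrowed from standard Raft rather than something the text above states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration; not CCF's actual definitions.
using Term = std::uint64_t;
using Index = std::uint64_t;

enum class ElectionType { PreVote, RegularVote };
enum class Role { Follower, PreVoteCandidate, Candidate, Leader };

struct LedgerTip
{
  Term term; // term of the last ledger entry
  Index idx; // index of the last ledger entry
};

struct Node
{
  Role role = Role::Follower;
  Term term = 0;
  LedgerTip tip{0, 0};
};

struct RequestVote
{
  ElectionType election_type;
  Term term;
  LedgerTip tip;
};

// Standard Raft comparison: a later last term wins, ties break on index.
bool at_least_as_up_to_date(const LedgerTip& a, const LedgerTip& b)
{
  if (a.term != b.term)
  {
    return a.term > b.term;
  }
  return a.idx >= b.idx;
}

// Receiver side: returns whether the pre-vote is granted. The only state
// change is catching up a lagging term; role and vote are untouched.
bool handle_pre_vote(Node& receiver, const RequestVote& req)
{
  assert(req.election_type == ElectionType::PreVote);
  if (req.term < receiver.term)
  {
    return false; // assumption: stale requests are refused
  }
  receiver.term = req.term; // no-op unless the receiver was lagging
  return at_least_as_up_to_date(req.tip, receiver.tip);
}

// Election timeout on a follower: become PreVoteCandidate, term unchanged.
void on_election_timeout(Node& n)
{
  if (n.role == Role::Follower)
  {
    n.role = Role::PreVoteCandidate;
  }
}

// Hearing from a current or new leader cancels the pre-vote attempt.
void on_leader_heard(Node& n)
{
  if (n.role == Role::PreVoteCandidate)
  {
    n.role = Role::Follower;
  }
}

// After a quorum of positive pre-votes: become Candidate, increment the
// term, and build the regular RequestVote to send.
RequestVote on_pre_vote_quorum(Node& n)
{
  assert(n.role == Role::PreVoteCandidate);
  n.role = Role::Candidate;
  ++n.term;
  return {ElectionType::RegularVote, n.term, n.tip};
}
```

Note that the term is incremented only in `on_pre_vote_quorum`, so an isolated node that never collects a quorum of pre-votes never raises its term and therefore cannot depose the leader when it reconnects.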

doc/architecture/raft_tla.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -44,4 +44,7 @@ It is possible to produce fresh traces quickly from the driver by running the ``
4444

4545
Calling the trace validation on, for example, the ``append`` scenario can then be done with ``./tlc.py --driver-trace ../build/append.ndjson consensus/Traceccfraft.tla``.
4646

47+
Generating a trace of a scenario and validating it in one go can be done with ``./tlc.py --workers 1 tv --scenario ../tests/raft_scenarios/append consensus/Traceccfraft.tla``.
48+
This runs the raft_driver on the scenario, cleans the trace and then validates it against the TLA+ specification.
49+
4750
CCF also provides a command line trace visualizer to aid debugging, for example, the ``append`` scenario can be visualized with ``python ../tests/trace_viz.py ../build/append.ndjson``.

doc/schemas/node_openapi.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -464,6 +464,7 @@
464464
"None",
465465
"Leader",
466466
"Follower",
467+
"PreVoteCandidate",
467468
"Candidate"
468469
],
469470
"type": "string"
@@ -904,7 +905,7 @@
904905
"info": {
905906
"description": "This API provides public, uncredentialed access to service and node state.",
906907
"title": "CCF Public Node API",
907-
"version": "4.13.0"
908+
"version": "4.14.0"
908909
},
909910
"openapi": "3.0.0",
910911
"paths": {

python/pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
44

55
[project]
66
name = "ccf"
7-
version = "6.0.16"
7+
version = "6.0.17"
88
authors = [
99
{ name="CCF Team", email="[email protected]" },
1010
]

src/consensus/aft/impl/state.h

Lines changed: 8 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -150,7 +150,10 @@ namespace aft
150150

151151
struct State
152152
{
153-
State(const ccf::NodeId& node_id_) : node_id(node_id_) {}
153+
State(const ccf::NodeId& node_id_, bool pre_vote_enabled_ = false) :
154+
node_id(node_id_),
155+
pre_vote_enabled(pre_vote_enabled_)
156+
{}
154157
State() = default;
155158

156159
ccf::pal::Mutex lock;
@@ -188,6 +191,8 @@ namespace aft
188191
// Index at which this node observes its retired_committed, only set when
189192
// that index itself is committed
190193
std::optional<ccf::SeqNo> retired_committed_idx = std::nullopt;
194+
195+
bool pre_vote_enabled = false;
191196
};
192197
DECLARE_JSON_TYPE_WITH_OPTIONAL_FIELDS(State);
193198
DECLARE_JSON_REQUIRED_FIELDS(
@@ -197,7 +202,8 @@ namespace aft
197202
last_idx,
198203
commit_idx,
199204
leadership_state,
200-
membership_state);
205+
membership_state,
206+
pre_vote_enabled);
201207
DECLARE_JSON_OPTIONAL_FIELDS(
202208
State,
203209
retirement_phase,
