Learning Objectives
- Understand the consensus problem in distributed systems
- Learn about Byzantine Fault Tolerance (BFT)
- Understand quorum certificates and voting
- Learn the basics of HotStuff-2 protocol
- Compare BFT with Proof-of-Stake (PoS)
1. The Consensus Problem
What is Consensus?
In a distributed system, consensus is the process of getting all honest nodes to agree on a single value or sequence of values, even when some nodes may be faulty or malicious.
Example: Imagine 4 generals trying to coordinate an attack. One might be a traitor. How do the honest generals agree on a plan?
Why is Consensus Hard?
- Network delays - Messages arrive at different times
- Node failures - Nodes can crash or go offline
- Byzantine faults - Nodes can lie, send conflicting messages, or behave arbitrarily
- Network partitions - Network can split into isolated groups
The FLP Impossibility Result
Fischer, Lynch, and Paterson proved that in an asynchronous network, it's impossible to guarantee consensus if even one node can fail. This is why blockchain systems assume something stronger than full asynchrony (e.g. partial synchrony). For why BFT can still guarantee progress, see Distributed Systems Fundamentals (partial synchrony, GST).
2. Byzantine Fault Tolerance (BFT)
What is BFT?
Byzantine Fault Tolerance is the ability of a system to function correctly even when some nodes are Byzantine (malicious or faulty).
Fault Model
In a BFT system with n = 3f + 1 nodes:
- Up to f nodes can be Byzantine (malicious)
- At least 2f + 1 nodes must be honest
- This ensures any two quorums (groups of 2f+1) must overlap in at least one honest node
Example: n = 4, f = 1
- Total nodes: 4
- Byzantine nodes: up to 1
- Honest nodes: at least 3
- Quorum size: 2f + 1 = 3
Any two groups of 3 nodes must share at least 1 honest node.
BFT vs Crash Fault Tolerance (CFT)
| Aspect |
CFT |
BFT |
| Fault Model |
Nodes crash |
Nodes can be malicious |
| Tolerance |
f out of 2f+1 |
f out of 3f+1 |
| Use Case |
Databases, distributed systems |
Blockchains, critical systems |
3. Quorum Certificates (QCs)
What is a Quorum Certificate?
A Quorum Certificate (QC) is cryptographic proof that at least 2f+1 validators have voted for a block. It's an aggregated signature from a quorum of validators.
Properties of QCs
- Cryptographic proof - Cannot be forged
- Compact - Single aggregated signature (not 2f+1 individual signatures)
- Verifiable - Anyone can verify the QC
- Binding - Proves agreement on a specific block
How a QC is Formed – Steps
In HotStuff-style BFT, a quorum certificate is produced and then carried in the next block’s header. Conceptually:
- (i) The proposer for block H builds the block (including
parent_qc = QC for H−1 in the header) and broadcasts the block header to the shard.
- (ii) Validators receive the header, validate it, and each signs a vote (block hash + metadata). Votes are broadcast to the shard (so every validator, including the next proposer, receives them).
- (iii) When any validator has 2f+1 valid votes for block H, it requests aggregation into a single QC (verify signatures and build aggregated signature). That node’s state stores the new QC as its
latest_qc.
- (iv) The QC is not sent as a separate network message. The next proposer (for H+1) either formed QC(H) locally from the broadcast votes, or later sees QC(H) when it receives a block whose header contains it.
- (v) When the next proposer builds block H+1, it sets the new header’s
parent_qc = QC(H) (from its latest_qc) and broadcasts block H+1. All validators then see QC(H) inside the header of H+1.
So: votes are broadcast → (any) validator with 2f+1 votes forms the QC locally → QC is carried in the next block’s header.
Communication flow (matches (i)–(v) above)
↓
(ii) Validators receive & validate; votes broadcast to shard
↓
(iii) When 2f+1 votes: request QC build → latest_qc updated
↓
Key point: The QC is not broadcast as a separate message. Votes are broadcast; validators with 2f+1 votes form the QC locally; the next block’s header carries that QC as parent_qc, so everyone sees it when they receive the next block.
Example timeline
T1: Proposer builds Block 10 (header includes QC for Block 9), broadcasts header
T2: Validators validate, sign votes, broadcast votes to shard
T3: When 2f+1 votes received → any such validator requests QC build → QC(10) formed, stored as latest_qc
T4: Next proposer builds Block 11 with parent_qc = QC(10), broadcasts Block 11
T5: All nodes receive Block 11 → see QC(10) in header
In hyperscale-rs (crate and line references)
- (i) Block header carries
parent_qc: crates/types/src/block.rs (struct BlockHeader, field parent_qc at line 75).
- (ii) Votes are broadcast to the shard:
Action::PersistAndBroadcastVote in crates/core/src/action.rs (lines 444–449). BFT state emits it when a validator votes: crates/bft/src/state.rs (around line 2575).
- (iii) When 2f+1 votes are collected, state requests
Action::VerifyAndBuildQuorumCertificate: crates/core/src/action.rs (lines 113–132), emitted from crates/bft/src/state.rs (around line 2765). Runner performs verification/aggregation; BFT’s vote_set builds the QC: crates/bft/src/vote_set.rs (build_qc at line 307). When the QC is ready, on_qc_formed in crates/bft/src/state.rs sets self.latest_qc = Some(qc) (line 3583).
- (iv) No separate QC message: validators (including the next proposer) either form the QC from broadcast votes or later set
latest_qc from a received block header: crates/bft/src/state.rs (e.g. self.latest_qc = Some(header.parent_qc.clone()) at line 1597).
- (v) Building the next block uses
self.latest_qc as parent_qc: crates/bft/src/state.rs (e.g. lines 1013–1017, 1147, 1182–1208, 1296–1323).
References are to the hyperscale-rs repo (paths and line numbers may shift slightly with upstream).
4. HotStuff-2 Protocol Basics
What is HotStuff-2?
HotStuff-2 is a BFT consensus protocol that uses a two-chain commit rule. It's an improvement over traditional BFT protocols like PBFT.
What is PBFT?
PBFT (Practical Byzantine Fault Tolerance) is a classic BFT consensus protocol introduced by Castro and Liskov in 1999. It was one of the first practical BFT protocols that could handle Byzantine faults in asynchronous networks.
PBFT Characteristics:
- Three-phase commit - Requires three rounds of communication (pre-prepare, prepare, commit)
- Three-chain commit rule - A block only commits after three consecutive blocks have QCs (quorum certificates)
- Coordinated view changes - When the leader is suspected of being faulty, all nodes coordinate to vote on a new leader/view
- Quadratic communication - O(n²) messages per block (each node sends messages to all other nodes)
- Explicit view-change protocol - Requires a separate voting phase to change views/leaders
Why PBFT matters: PBFT established the foundation for BFT consensus. Many modern protocols (including HotStuff-2) are improvements on PBFT's core ideas, addressing its limitations around communication complexity and view changes.
Key Features of HotStuff-2
- Two-chain commit - Block commits when next block gets QC (vs 3-chain in PBFT)
What this means: In PBFT, you need blocks at heights H, H+1, and H+2 to all have QCs before H commits (three-chain commit). In HotStuff-2, you only need H and H+1 to have QCs. This reduces latency from 3 blocks to 2 blocks.
- Implicit view changes - No coordinated view-change voting
What "coordinated view-change voting" means: In
PBFT, when nodes suspect the leader is faulty, they must coordinate a
view-change. This requires:
- All nodes detecting the problem
- A voting phase where nodes vote to change the view
- Agreement on the new leader
- State synchronization before resuming normal operation
This is complex and can cause delays. HotStuff-2 uses
implicit view changes - if a block doesn't get a
QC within a timeout, nodes automatically advance to the next view/leader without explicit voting. The next leader simply proposes the next block, and if it gets a
QC, the
view change is implicitly accepted.
- Optimistic pipelining - Multiple blocks can be in-flight
- Linear communication - O(n) messages per block (vs O(n²) in PBFT)
Communication complexity: In PBFT, each node sends messages to all other nodes, resulting in n × n = n² messages. HotStuff-2 uses a leader-based approach where nodes only send messages to the leader, resulting in n messages per block.
Basic Flow
- Proposal: Proposer creates and broadcasts block header
- Voting: Validators validate and vote
- QC Formation: When 2f+1 votes collected, QC is formed
- Commit: Block commits when next block gets QC (two-chain rule)
Two-Chain Commit Rule
Block at height H commits when block at height H+1 gets a QC.
Rule: Block H commits when block H+1 receives a QC.
Example scenario:
1. Block H is proposed and receives QC
2. Block H+1 is proposed and receives QC
→ Block H NOW COMMITS (because H+1 has QC)
3. Block H+2 is proposed and receives QC
→ Block H+1 NOW COMMITS (because H+2 has QC)
Visual representation:
Block H: [has QC] → commits when H+1 gets QC ✓
Block H+1: [has QC] → commits when H+2 gets QC (waiting...)
Block H+2: [no QC yet] → will commit when H+3 gets QC (waiting...)
The "two-chain" name comes from needing 2 consecutive blocks with QCs:
- H has QC and H+1 has QC → H commits (2 blocks in chain)
Why Two-Chain? Ensures
finality even under network partitions. Only one chain can get consecutive QCs.
Understanding Finality:
Finality means once a block is committed, it cannot be reverted. In two-chain commit: Block H commits when H+1 gets a QC, creating a chain of QCs proving agreement by 2f+1 validators.
Under network partitions, only the majority partition (2f+1 validators) can produce consecutive QCs. The minority cannot, so their blocks won't commit. This ensures safety: only one valid chain exists, and committed blocks are permanent.
5. What PoW and BFT Mean (and How They Relate)
BFT and Proof of Work (PoW) answer different questions:
- BFT is a consensus approach: how do nodes agree on the next block? In BFT, a fixed (or known) set of validators vote; once a quorum agrees, the block is committed. So BFT = agreement by voting among identified participants.
- PoW is a participation / block-producer selection mechanism: who gets to propose the next block? In PoW, miners compete by solving a puzzle; the winner proposes. The chain then uses the longest-chain rule (Nakamoto consensus), not BFT voting. So PoW = who may extend the chain, plus probabilistic agreement via “longest chain wins.”
How they relate: BFT and PoW are on different axes. BFT is about the agreement rule (voting, quorums, deterministic finality). PoW (and PoS) are about who participates or who gets to propose. Many chains use BFT for agreement and PoS for validator selection (e.g. Tendermint: BFT consensus + PoS to choose the validator set). PoW chains typically do not use BFT; they use Nakamoto (longest-chain) consensus with probabilistic finality.
BFT vs Proof-of-Stake (PoS)
Below compares classic BFT (fixed validator set, as in Hyperscale-rs) with open PoS (anyone can join by staking). Many chains combine both (e.g. Tendermint: BFT consensus + PoS for validator selection).
Key Differences
| Aspect |
BFT (classic) |
PoS (open) |
| Validator Set |
Fixed, known |
Dynamic, stake-weighted |
| Finality |
Immediate, deterministic |
Probabilistic or economic; often requires time |
| Communication |
O(n) per block |
Often O(n) among validators (when BFT-style) |
| Fault Tolerance |
Up to f of 3f+1 Byzantine (≤1/3) |
Honest majority of stake (<50% adversarial) |
Comparison with Proof of Work (PoW): PoW does not use BFT: it uses the longest-chain rule (Nakamoto consensus). There is no fixed validator set (any miner can participate), only probabilistic finality (confirmations over time), and security relies on honest majority of hashrate (attack requires >50% hashrate). BFT gives deterministic finality and provable safety with ≤1/3 Byzantine nodes.
When to Use BFT
- Need immediate finality
- Fixed validator set is acceptable
- High throughput is critical
- Deterministic safety guarantees needed
8. Practical Assignment
Assignment: Understanding BFT Consensus
Tasks:
- Read protocol docs and code in the repo:
- Trace a block through consensus:
- Open crates/bft/src/state.rs in the repo
- Find the block proposal function
- Find the vote handling function
- Find the QC formation logic
- Find the commit logic
- Create a diagram showing the flow
- Answer Questions:
- In a system with 7 validators, how many can be Byzantine?
- How many votes are needed for a quorum in a 7-validator system?
- When does block H commit in HotStuff-2?
- What is the difference between a vote and a QC?
- Update Learning Journal:
- Add a section on BFT consensus
- Document what you learned
- List questions you still have
Success Criteria:
- ✅ You can explain BFT to someone else
- ✅ You understand quorum certificates
- ✅ You can trace through the code
- ✅ You can answer the questions correctly