Cross-Shard Transactions in Hyperscale-rs

⏱️ Duration: 1.5–2 hours 📊 Difficulty: Intermediate 🎯 Hyperscale-rs Specific

Learning Objectives

How atomic composability works (cross-shard)

The diagram in Transaction Flow emphasizes that cross-shard atomicity depends on the correct order of communication and on when state updates are applied. The same flow is laid out step by step below (consistent with Transaction Flow):

Cross-shard atomic execution (provision-based; no 2PC coordinator)
1. State provisioning
When a block containing cross-shard txs commits, each validator in the source shard produces a StateProvision (a signed proof with merkle inclusion proofs) and sends it to the target shards. Each provision is effectively a vote; the target aggregates 2f+1 of them to form a CommitmentProof.
2. Provision verification
The ProvisionCoordinator on each node receives provisions, verifies the QC signature and merkle proofs against the committed state root, and tracks quorum per required shard. Once it has quorum from every required shard, it marks ProvisioningComplete.
3. Deterministic execution
With provisioned state, validators execute the tx and create an ExecutionVote (merkle root of execution results).
4. Vote aggregation
When 2f+1 voting power agrees on the same merkle root, an ExecutionCertificate is created and broadcast to remote participating shards.
5. Finalization
Validators collect ExecutionCertificates from all participating shards. When all are received, a TransactionCertificate is created. Atomicity comes from this provision + vote + certificate flow; there is no separate 2PC coordinator in hyperscale-rs.

Hyperscale-rs uses a provision-based atomic execution protocol (execution crate: five phases above). There is no 2PC coordinator—no prepare/commit/abort round. Atomicity is achieved by StateProvisions from source shards, ProvisionCoordinator checklist, then execution and certificate aggregation.

ProvisionCoordinator (the only coordinator in cross-shard flow)

In hyperscale-rs, cross-shard flow has one coordinator concept: ProvisionCoordinator (provisions crate). There is no 2PC coordinator; the code uses the five-phase provision-based protocol above.

Nuance: why a "centralized" ProvisionCoordinator? The provisions crate doc calls it centralized in the sense that, on each node, one sub-state machine (ProvisionCoordinator) owns all provision tracking and verification for that node, instead of scattering that logic across execution, BFT, and mempool. There is no single network-wide coordinator; each validator runs its own ProvisionCoordinator. Pros: a single place to join provisions with remote headers, verify QC signatures and merkle proofs, and emit ProvisioningComplete; it is consistent with the execution crate (which registers and waits on it); and livelock handling can query txs_with_provisions_from in one place. Cons: all provision state and verification complexity lives in one component per node, and backpressure for cross-shard txs is handled by the mempool (which calls has_any_verified_provisions), not inside the coordinator. The design therefore splits "have we got provisions?" (provisions crate) from "should we relax limits?" (mempool).

State Provision flow (same style as Transaction Flow): After a shard commits its block, each validator in that shard produces a signed proof (StateProvision) and sends it to other shards. ProvisionCoordinator on each node tracks whether it has quorum of provisions from each required shard; when complete, it emits ProvisioningComplete so execution can proceed.

Multi-shard finality in rounds: Yes — a multi-shard tx is effectively finalized in multiple rounds, one per shard that holds a part of it. Shard 0 finalizes its block (and thus its part of the tx and the outgoing receipt); each validator in Shard 0 then sends a StateProvision (attesting to that receipt/state) to Shard 1. Shard 1 can only complete its part after it has those provisions; when Shard 1’s block that applies the receipt is finalized, that part is final too. So each “round” is a block finalized on one shard. The transaction as a whole is finalized only when every such block (on every involved shard) is finalized.

Provisions / ProvisionCoordinator
1. Shard commits block
A shard finalizes its block containing (part of) a cross-shard tx.
2. Produce StateProvision
Each validator in the source shard produces a StateProvision (signed proof of the state it wrote) and sends it to the target shards.
3. ProvisionCoordinator checklist
On each node, ProvisionCoordinator (provisions crate) keeps a checklist: for this tx, do we have quorum of provisions from shard 1? From shard 2? … (required_shards = all other participating shards).
4. ProvisioningComplete
When the node has provisions from every required shard, it emits ProvisioningComplete so execution can proceed. For example, Shard 3 cannot finish its view of the cross-shard tx until it has proofs from shards 1 and 2.

Multiple source shards: A target shard can have multiple source shards. Example: tx touches shards 0, 1, 2; shard 2’s step depends on both shard 0 and shard 1. Then required_shards for shard 2 = {0, 1}. ProvisionCoordinator on shard 2 waits until it has quorum of provisions from shard 0 and quorum from shard 1. Those provisions can arrive in parallel (in any order); it is not “first shard 0, then shard 1.” As soon as the checklist is complete (every required shard has delivered quorum), ProvisioningComplete fires.

How is the order fixed? Order is determined by a deterministic rule (e.g. by ShardGroupId): consensus_shards, required_shards, and certificate collection all use the same ordering so every node agrees. There is no separate coordinator that "drives" prepare/commit—atomicity is achieved by the provision-based protocol (phases 1–5 above).

How is order determined for composite (cross-shard) transactions? In the Radix manifest, instructions are in a fixed order (e.g. withdraw from A → split → put in staking vault → put in LP → return IOUs). The Radix Engine runs those instructions sequentially when executing; data dependencies are implicit (instruction 2 sees the result of instruction 1).

For Hyperscale (consensus), the transaction is turned into a RoutableTransaction by instruction analysis (crates/types/src/transaction.rs): the manifest is walked and every NodeId read or written is collected into declared_reads and declared_writes. Those are sets (deduplicated); instruction order is not preserved at the consensus layer.

Shard sets are then derived: consensus_shards = unique shards of declared_writes; provisioning_shards = shards of declared_reads that are not write shards. Both are stored in BTreeSets, so the protocol order is by ShardGroupId (numeric). So we do not derive "Account_A shard first, then Staking_VAULT shard, then LP shard" from the manifest order; we get a deterministic order by shard ID.

Prerequisites are enforced by provisions: a shard that needs remote state (reads from another shard) must receive quorum of provisions from that shard before it can complete; the set of "required" shards is all other participating shards. Parallelism: each shard runs BFT and execution independently; cross-shard atomicity is achieved by provisioning (quorum of StateProvisions from each required shard via ProvisionCoordinator) and then execution and certificate aggregation; there is no 2PC coordinator in hyperscale-rs.

Refs: crates/types/src/topology.rs (consensus_shards, provisioning_shards, all_shards_for_transaction), crates/types/src/transaction.rs (analyze_instructions_v1 / analyze_instructions_v2).

See the Transaction Flow diagram for the full step-by-step from user to finality.

Example: complex Radix manifest and Hyperscale provisioning

Consider a composite transaction that splits funds from a user account into staking and liquidity:

# Simplified Radix-style manifest (conceptual)
1. CallMethod(Account_A, "withdraw", XRD, amount)      # → NodeID_Account_A, vault
2. TakeFromWorktop(XRD, amount)
3. SplitBucket(amount1, amount2)                       # worktop
4. CallMethod(StakingVault, "stake", bucket1)          # → NodeID_Staking_VAULT
5. CallMethod(LiquidityPool, "contribute", bucket2)    # → NodeID_LP_COMPONENT
6. CallMethod(Account_A, "deposit", staking_IOU)       # → NodeID_Account_A
7. CallMethod(Account_A, "deposit", lp_IOU)            # → NodeID_Account_A

Radix Engine: Runs these instructions in order. Data flow is implicit (e.g. step 4 uses the bucket from step 3; steps 6–7 deposit what steps 4–5 returned).

Instruction analysis (Hyperscale): The manifest is walked and every NodeId read or written is collected into declared_reads and declared_writes (crates/types/src/transaction.rs: analyze_instructions_v1 / analyze_instructions_v2). The result is sets (no duplicate NodeIds; each node appears at most once); instruction order is not preserved at the consensus layer. Assume the analysis yields (illustrative for this example): declared_writes = {NodeID_Account_A, NodeID_Staking_VAULT, NodeID_LP_COMPONENT}, and declared_reads contains no node that is not also written.

Shard mapping (topology): Each NodeId maps to a shard via shard_for_node(node_id, num_shards) (crates/types/src/topology.rs). Suppose Account_A → shard 1, Staking_VAULT → shard 2, LP_COMPONENT → shard 3. Then consensus_shards = {1, 2, 3} (all three nodes are written) and provisioning_shards = {} (there are no read-only shards).

Provisioning and coordination: Shards 1, 2, and 3 each run BFT and execution for their part independently; after committing, each validator sends StateProvisions to the other two shards. required_shards for each shard is the set of the other participating shards (e.g. shard 2 waits for quorum of provisions from shards 1 and 3), and ProvisioningComplete fires on a node only once quorum has arrived from every required shard.

So: the manifest defines the logical flow (withdraw → split → stake → LP → deposit); the engine runs it sequentially; Hyperscale uses shard ID order for coordination and provisions to enforce “everyone has proof from everyone else” before completion.

Order: manifest vs protocol

As above: instruction order is not preserved at consensus; protocol order is by ShardGroupId. Prerequisites are enforced by provisions; required_shards is the set of all other participating shards (start_cross_shard_execution in crates/execution/src/state.rs).

Livelock in this codebase

Livelock here means a cycle of cross-shard dependencies that would block progress: e.g. Shard A commits TX₁ (which needs state from C) and Shard C commits TX₂ (which needs state from A), so A is waiting on C and C is waiting on A. The livelock crate handles such cycles; it can query txs_with_provisions_from on the ProvisionCoordinator to see, in one place, which transactions already hold provisions.

The simulator’s --analyze-livelocks flag reports how often such cycles occurred in a run.

Concepts in the Flow

Quiz: Cross-shard and provisioning

Answer based on the content above. Pass threshold: 70%.