ChainSync - synchronizing the Blockchain #
What is blockchain synchronization? #
Blockchain synchronization (“sync”) is a key part of a blockchain system. It handles retrieval and propagation of blocks and transactions (messages), and thus is in charge of distributed state replication. This process is security critical – problems here can be catastrophic to the operation of a blockchain.
What is ChainSync? #
ChainSync is the protocol Filecoin uses to synchronize its blockchain. It is specific to Filecoin’s choices in state representation and consensus rules, but is general enough that it can serve other blockchains. ChainSync is a group of smaller protocols, which handle different parts of the sync process.
Terms and Concepts #
- LastCheckpoint: the last hard social-consensus oriented checkpoint that ChainSync is aware of. This consensus checkpoint defines the minimum finality, and a minimum of history to build on. ChainSync takes LastCheckpoint on faith, and builds on it, never switching away from its history.
- TargetHeads: a list of BlockCIDs that represent blocks at the fringe of block production. These are the newest and best blocks ChainSync knows about. They are “target” heads because ChainSync will try to sync to them. This list is sorted by “likelihood of being the best chain” (eg for now, simply ChainWeight).
- BestTargetHead: the single best chain head BlockCID to try to sync to. This is the first element of TargetHeads.
ChainSync Summary #
At a high level, ChainSync does the following:

- Part 1: Verify internal state (INIT state below)
  - SHOULD verify data structures and validate local chain
  - Resource expensive verification MAY be skipped at nodes’ own risk
- Part 2: Bootstrap to the network (BOOTSTRAP)
  - Step 1. Bootstrap to the network, and acquire a “secure enough” set of peers (more details below)
  - Step 2. Bootstrap to the BlockPubsub channels
  - Step 3. Listen and serve on Graphsync
- Part 3: Synchronize trusted checkpoint state (SYNC_CHECKPOINT)
  - Step 1. Start with a TrustedCheckpoint (defaults to GenesisCheckpoint).
  - Step 2. Get the block it points to, and that block’s parents
  - Step 3. Graphsync the StateTree
- Part 4: Catch up to the chain (CHAIN_CATCHUP)
  - Step 1. Maintain a set of TargetHeads (BlockCIDs), and select the BestTargetHead from it
  - Step 2. Synchronize to the latest heads observed, validating blocks towards them (requesting intermediate points)
  - Step 3. As validation progresses, TargetHeads and BestTargetHead will likely change, as new blocks at the production fringe will arrive, and some target heads or paths to them may fail to validate.
  - Step 4. Finish when node has “caught up” with BestTargetHead (retrieved all the state, linked to local chain, validated all the blocks, etc).
- Part 5: Stay in sync, and participate in block propagation (CHAIN_FOLLOW)
  - Step 1. If security conditions change, go back to Part 4 (CHAIN_CATCHUP)
  - Step 2. Receive, validate, and propagate received Blocks
  - Step 3. Now with greater certainty of having the best chain, finalize Tipsets, and advance chain state.
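These five parts correspond to the states of the ChainSync state machine described later in this section. As a minimal illustrative sketch (not a required representation), the states could be enumerated as:

```go
// Conceptual ChainSync states, mirroring the five parts above.
type SyncState int

const (
	INIT            SyncState = iota // Part 1: verify internal state
	BOOTSTRAP                        // Part 2: bootstrap to the network
	SYNC_CHECKPOINT                  // Part 3: synchronize trusted checkpoint state
	CHAIN_CATCHUP                    // Part 4: catch up to the chain
	CHAIN_FOLLOW                     // Part 5: stay in sync, propagate blocks
)
```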
libp2p Network Protocols #
As a networking-heavy protocol, ChainSync makes heavy use of libp2p. In particular, we use three sets of protocols:

- libp2p.PubSub: a family of publish/subscribe protocols to propagate recent Blocks. The concrete protocol choice impacts ChainSync’s effectiveness, efficiency, and security dramatically. For Filecoin v1.0 we will use libp2p.Gossipsub, a recent libp2p protocol that combines features and learnings from many excellent PubSub systems. In the future, Filecoin may use other PubSub protocols. Important Note: it is entirely possible for Filecoin Nodes to run multiple versions simultaneously. That said, this specification requires that Filecoin nodes MUST connect and participate in the main channel, using libp2p.Gossipsub.
- libp2p.PeerDiscovery: a family of discovery protocols, to learn about peers in the network. This is especially important for security because network “Bootstrap” is a difficult problem in peer-to-peer networks. The set of peers we initially connect to may completely dominate our awareness of other peers, and therefore all state. We use a union of PeerDiscovery protocols as each by itself is not secure or appropriate for users’ threat models. The union of these provides a pragmatic and effective solution. Discovery protocols marked as required MUST be included in implementations and will be provided by implementation teams. Protocols marked as optional MAY be provided by implementation teams but can be built independently by third parties to augment bootstrap security.
- libp2p.DataTransfer: a family of protocols for transferring data. Filecoin Nodes must run libp2p.Graphsync.
More concretely, we use these protocols:
- libp2p.PeerDiscovery
  - (required) libp2p.BootstrapList: a protocol that uses a persistent and user-configurable list of semi-trusted bootstrap peers. The default list includes a set of peers semi-trusted by the Filecoin Community.
  - (optional) libp2p.KademliaDHT: a DHT protocol that enables random queries across the entire network
  - (required) libp2p.Gossipsub: a pub/sub protocol that includes “prune peer exchange” by default, disseminating peer info as part of operation
  - (optional) libp2p.PersistentPeerstore: a connectivity component that keeps persistent information about peers observed in the network throughout the lifetime of the node. This is useful because we resume and continually improve Bootstrap security.
  - (optional) libp2p.DNSDiscovery: to learn about peers via DNS lookups to semi-trusted peer aggregators
  - (optional) libp2p.HTTPDiscovery: to learn about peers via HTTP lookups to semi-trusted peer aggregators
  - (optional) libp2p.PEX: a general-use peer exchange protocol, distinct from pubsub peer exchange, for 1:1 ad-hoc peer exchange
- libp2p.PubSub
  - (required) libp2p.Gossipsub: the concrete libp2p.PubSub protocol ChainSync uses
- libp2p.DataTransfer
  - (required) libp2p.Graphsync: the data transfer protocol nodes must support for providing blockchain and user data
  - (optional) BlockSync: a blockchain data transfer protocol that can be used by some nodes
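As a rough sketch of what joining the main block channel looks like with go-libp2p and go-libp2p-pubsub (constructor signatures vary by library version, and the topic name here is a placeholder for illustration, not the one mandated by the spec):

```go
package main

import (
	"context"
	"fmt"

	"github.com/libp2p/go-libp2p"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
)

func main() {
	ctx := context.Background()

	// Create a libp2p host with default transports, security, and muxers.
	h, err := libp2p.New()
	if err != nil {
		panic(err)
	}

	// Gossipsub is the required PubSub protocol for the main block channel.
	ps, err := pubsub.NewGossipSub(ctx, h)
	if err != nil {
		panic(err)
	}

	// "/fil/blocks" is a placeholder topic name used for illustration only.
	topic, err := ps.Join("/fil/blocks")
	if err != nil {
		panic(err)
	}
	sub, err := topic.Subscribe()
	if err != nil {
		panic(err)
	}

	// Receive raw block announcements; validation is ChainSync's job.
	for {
		msg, err := sub.Next(ctx)
		if err != nil {
			return
		}
		fmt.Printf("received %d bytes from %s\n", len(msg.Data), msg.ReceivedFrom)
	}
}
```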
Subcomponents #
Aside from libp2p, ChainSync uses or relies on the following components:

- Libraries:
  - ipld data structures, selectors, and protocols
    - ipld.GraphStore: local persistent storage for chain datastructures
    - ipld.Selector: a way to express requests for chain data structures
    - ipfs.GraphSync: a general-purpose ipld datastructure syncing protocol
- Data Structures:
  - Data structures in the chain package: Block, Tipset, Chain, Checkpoint ...
  - chainsync.BlockCache: a temporary cache of blocks, to constrain resources expended
  - chainsync.AncestryGraph: a datastructure to efficiently link Blocks, Tipsets, and PartialChains
  - chainsync.ValidationGraph: a datastructure for efficient and secure validation of Blocks and Tipsets
Graphsync in ChainSync #
ChainSync is written in terms of Graphsync. ChainSync adds blockchain- and Filecoin-specific synchronization functionality that is critical for Filecoin security.
Rate Limiting Graphsync responses (SHOULD) #
When running Graphsync, Filecoin nodes must respond to graphsync queries. Filecoin requires nodes to provide critical data structures to others, otherwise the network will not function. During ChainSync, it is in operators’ interests to provide data structures critical to validating, following, and participating in the blockchain they are on. However, this has limitations, and some level of rate limiting is critical for maintaining security in the presence of attackers who might issue large Graphsync requests to cause DOS.
We recommend the following:
- Set and enforce batch size rate limits. Force selectors to be shaped like: LimitedBlockIpldSelector(blockCID, BatchSize) for a single constant BatchSize = 1000. Nodes may push for this equilibrium by only providing BatchSize objects in responses, even for pulls much larger than BatchSize. This forces subsequent pulls to be run, re-rooted appropriately, and hints at other parties that they should be requesting with that BatchSize.
- Force all Graphsync queries for blocks to be aligned along cacheable boundaries. In conjunction with a BatchSize, implementations should aim to cache the results of Graphsync queries, so that they may propagate them to others very efficiently. Aligning on certain boundaries (eg specific ChainEpoch limits) increases the likelihood that many parties in the network will request the same batches of content. Another good cacheable boundary is the entire contents of a Block (BlockHeader, Messages, Signatures, etc).
- Maintain per-peer rate limits. Use bandwidth usage to decide whether to respond, and how much, on a per-peer basis. Libp2p already tracks bandwidth usage in each connection. This information can be used to impose rate limits in Graphsync and other Filecoin protocols. (A sketch appears after this list.)
- Detect and react to DOS: restrict operation. The safest implementations will likely detect and react to DOS attacks. Reactions could include:
  - Smaller Graphsync.BatchSize limits
  - Fewer connections to other peers
  - Rate limit total Graphsync bandwidth
  - Assign Graphsync bandwidth based on a peer priority queue
  - Disconnect from and do not accept connections from unknown peers
  - Introspect Graphsync requests and filter/deny/rate limit suspicious ones
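A minimal sketch of the batch-size cap and per-peer accounting described above (all types, field names, and thresholds here are illustrative assumptions, not a normative API):

```go
// BatchSize is the single constant batch size recommended above.
const BatchSize = 1000

// RateLimiter tracks per-peer usage and caps response sizes.
// This is a sketch; a real implementation would also age out old usage data.
type RateLimiter struct {
	limitPerPeer int64            // max bytes served per peer per window (assumed policy knob)
	bytesServed  map[string]int64 // bytes served to each peer in the current window
}

// Allow reports whether to serve a peer at all, and caps the number of
// objects returned in a single response to BatchSize.
func (r *RateLimiter) Allow(peer string, requestedObjects int) (serve bool, maxObjects int) {
	if r.bytesServed[peer] > r.limitPerPeer {
		return false, 0 // over budget: refuse or defer this peer
	}
	if requestedObjects > BatchSize {
		requestedObjects = BatchSize // never return more than BatchSize objects
	}
	return true, requestedObjects
}

// Record adds the bytes actually served to a peer's running total.
func (r *RateLimiter) Record(peer string, bytes int64) {
	if r.bytesServed == nil {
		r.bytesServed = make(map[string]int64)
	}
	r.bytesServed[peer] += bytes
}
```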
Previous BlockSync protocol #
Prior versions of this spec recommended a BlockSync protocol. This protocol definition is available here.

Filecoin nodes are libp2p nodes, and therefore may run a variety of other protocols, including this BlockSync protocol. As with anything else in Filecoin, nodes MAY opt to use additional protocols to achieve the results. That said, nodes MUST implement the version of ChainSync as described in this spec in order to be considered implementations of Filecoin. Test suites will assume this protocol.
ChainSync State Machine #
ChainSync uses the following conceptual state machine. Since this is a conceptual state machine, implementations MAY deviate from implementing precisely these states, or dividing them strictly. Implementations MAY blur the lines between the states. If so, implementations MUST ensure security of the altered protocol.
ChainSync FSM: INIT #

- beginning state. no network connections, not synchronizing.
- local state is loaded: internal data structures (eg chain, cache) are loaded
- LastTrustedCheckpoint is set to the latest network-wide accepted TrustedCheckpoint
- FinalityTipset is set to finality achieved in a prior protocol run.
  - Default: If no later FinalityTipset has been achieved, set FinalityTipset to LastTrustedCheckpoint
- Chain State and Finality:
  - In this state, the chain MUST NOT advance beyond whatever the node already has.
  - No new blocks are reported to consumers.
  - The chain state provided is whatever was loaded from prior executions (worst case is LastTrustedCheckpoint)
- security conditions to transition out:
  - local state and data structures SHOULD be verified to be correct
    - this means validating any parts of the chain or StateTree the node has, from LastTrustedCheckpoint on.
  - LastTrustedCheckpoint is well-known across the Filecoin Network to be a true TrustedCheckpoint
    - this SHOULD NOT be verified in software, it SHOULD be verified by operators
    - Note: we ALWAYS have at least one TrustedCheckpoint, the GenesisCheckpoint.
- transitions out:
  - once done verifying things: move to BOOTSTRAP
ChainSync FSM: BOOTSTRAP #

- network.Bootstrap(): establish connections to peers until we satisfy the security requirement
  - for better security, use many different libp2p.PeerDiscovery protocols
- BlockPubsub.Bootstrap(): establish connections to BlockPubsub peers
  - The subscription is for both peer discovery and to start selecting best heads. Listening on pubsub from the start keeps the node informed about potential head changes.
- Graphsync.Serve(): set up a Graphsync service that responds to others’ queries
- Chain State and Finality:
  - In this state, the chain MUST NOT advance beyond whatever the node already has.
  - No new blocks are reported to consumers.
  - The chain state provided is whatever was loaded from prior executions (worst case is LastTrustedCheckpoint).
- security conditions to transition out:
  - Network connectivity MUST have reached the security level acceptable for ChainSync
  - BlockPubsub connectivity MUST have reached the security level acceptable for ChainSync
  - “on time” blocks MUST be arriving through BlockPubsub
- transitions out:
  - once bootstrap is deemed secure enough:
    - if node does not have the Blocks or StateTree corresponding to LastTrustedCheckpoint: move to SYNC_CHECKPOINT
    - otherwise: move to CHAIN_CATCHUP
ChainSync FSM: SYNC_CHECKPOINT #

- While in this state:
  - ChainSync is well-bootstrapped, but does not yet have the Blocks or StateTree for LastTrustedCheckpoint
  - ChainSync issues Graphsync requests to its peers randomly for the Blocks and StateTree for LastTrustedCheckpoint:
    - ChainSync’s counterparts in other peers MUST provide the state tree.
    - It is only semi-rational to do so, so ChainSync may have to try many peers.
    - Some of these requests MAY fail.
- Chain State and Finality:
  - In this state, the chain MUST NOT advance beyond whatever the node already has.
  - No new blocks are reported to consumers.
  - The chain state provided is the available Blocks and StateTree for LastTrustedCheckpoint.
- Important Notes:
  - ChainSync needs to fetch several blocks: the Block pointed at by LastTrustedCheckpoint, and its direct Block.Parents.
  - Nodes only need hashing to validate these Blocks and StateTrees – no block validation or state machine computation is needed.
  - The initial value of LastTrustedCheckpoint is GenesisCheckpoint, but it MAY be a value later in Chain history.
  - LastTrustedCheckpoint enables efficient syncing by making the implicit economic consensus of chain history explicit.
  - By allowing fetching of the StateTree of LastTrustedCheckpoint via Graphsync, ChainSync can yield much more efficient syncing than comparable blockchain synchronization protocols, as syncing and validation can start there.
  - Nodes DO NOT need to validate the chain from GenesisCheckpoint. LastTrustedCheckpoint MAY be a value later in Chain history.
  - Nodes DO NOT need to but MAY sync earlier StateTrees than LastTrustedCheckpoint as well.
- Pseudocode 1: a basic version of SYNC_CHECKPOINT:

```go
func (c *ChainSync) SyncCheckpoint() {
	for !c.HasCompleteStateTreeFor(c.LastTrustedCheckpoint) {
		selector := ipldselector.SelectAll(c.LastTrustedCheckpoint)

		// Pull SHOULD NOT pull what c.IpldStore already has (check first)
		// Pull SHOULD pull from different peers simultaneously
		// Pull SHOULD be efficient (try different parts of the tree from many peers)
		// Graphsync implementations may not offer these features. These features
		// can be implemented on top of a graphsync that only pulls from a single
		// peer and does not check the local store first.
		c.Graphsync.Pull(c.Peers, selector, c.IpldStore)
	}

	c.ChainCatchup() // on to CHAIN_CATCHUP
}
```
- security conditions to transition out:
  - StateTree for LastTrustedCheckpoint MUST be stored locally and verified (hashing is enough)
- transitions out:
  - once node receives and verifies complete StateTree for LastTrustedCheckpoint: move to CHAIN_CATCHUP
ChainSync FSM: CHAIN_CATCHUP #

- While in this state:
  - ChainSync is well-bootstrapped, and has an initial trusted StateTree to start from.
  - ChainSync is receiving latest Blocks from BlockPubsub
  - ChainSync starts fetching and validating blocks
  - ChainSync has unvalidated blocks between ChainSync.FinalityTipset and ChainSync.TargetHeads
- Chain State and Finality:
  - In this state, the chain MUST NOT advance beyond whatever the node already has:
    - FinalityTipset does not change.
    - No new blocks are reported to consumers/users of ChainSync yet.
    - The chain state provided is the available Blocks and StateTree for all available epochs, especially the FinalityTipset.
    - Finality must not move forward here because there are serious attack vectors where a node can be forced to end up on the wrong fork if finality advances before validation is complete up to the block production fringe.
  - Validation must advance, all the way to the block production fringe (a sketch of this loop appears after this list):
    - Validate the whole chain, from FinalityTipset to BestTargetHead
    - The node can reach BestTargetHead only to find out it was invalid, then has to update BestTargetHead with the next best one, and sync to it (without having advanced FinalityTipset yet, as otherwise we may end up on the wrong fork)
- security conditions to transition out:
  - Gaps between ChainSync.FinalityTipset ... ChainSync.BestTargetHead have been closed:
    - All Blocks and their content MUST be fetched, stored, linked, and validated locally. This includes BlockHeaders, Messages, etc.
    - Bad heads have been expunged from ChainSync.TargetHeads. Bad heads include heads that initially seemed good but turned out invalid, or heads that ChainSync has failed to connect (ie. cannot fetch ancestors connecting back to ChainSync.FinalityTipset within a reasonable amount of time).
    - All blocks between ChainSync.FinalityTipset ... ChainSync.TargetHeads have been validated. This means all blocks before the best heads.
  - Not under a temporary network partition
- transitions out:
  - once gaps between ChainSync.FinalityTipset ... ChainSync.TargetHeads are closed: move to CHAIN_FOLLOW
    - (Perhaps moving to CHAIN_FOLLOW when 1–2 blocks back in validation may be ok. We don’t know we have the right head until we validate it, so if other heads of similar height are right/better, we won’t know until then.)
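The catch-up loop described above can be sketched roughly as follows (the helper methods are hypothetical, in the style of the pseudocode earlier in this section):

```go
// Illustrative CHAIN_CATCHUP loop: validate from FinalityTipset toward
// BestTargetHead, expunging heads that fail to validate or to connect.
func (c *ChainSync) ChainCatchup() {
	for len(c.TargetHeads) > 0 {
		best := c.TargetHeads[0] // BestTargetHead: first element, sorted by likelihood

		// Fetch BlockHeaders closing the gap FinalityTipset ... best.
		chain, err := c.FetchHeadersBack(best, c.FinalityTipset)
		if err != nil {
			c.DropHead(best) // could not connect back to FinalityTipset in time
			continue
		}

		// Validate forward; on failure, expunge the head and try the next one.
		// FinalityTipset does NOT advance yet, to avoid ending up on a wrong fork.
		if err := c.ValidateChain(chain); err != nil {
			c.DropHead(best)
			continue
		}

		// Gaps closed and validated up to BestTargetHead: move to CHAIN_FOLLOW.
		c.ChainFollow()
		return
	}
}
```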
ChainSync FSM: CHAIN_FOLLOW #

- While in this state:
  - ChainSync is well-bootstrapped, and has an initial trusted StateTree to start from.
  - ChainSync fetches and validates blocks.
  - ChainSync is receiving and validating latest Blocks from BlockPubsub
  - ChainSync DOES NOT have unvalidated blocks between ChainSync.FinalityTipset and ChainSync.TargetHeads
  - ChainSync MUST drop back to another state if security conditions change.
  - Keep a set of gap measures (a sketch of the gap check appears after this list):
    - BlockGap is the number of remaining blocks to validate between the Validated blocks and BestTargetHead.
      - (ie how many epochs do we need to validate to have validated BestTargetHead, does not include null blocks)
    - EpochGap is the number of epochs between the latest validated block and BestTargetHead (includes null blocks).
    - MaxBlockGap = 2, which means how many blocks ChainSync may fall behind on before switching back to CHAIN_CATCHUP (does not include null blocks)
    - MaxEpochGap = 10, which means how many epochs ChainSync may fall behind on before switching back to CHAIN_CATCHUP (includes null blocks)
- Chain State and Finality:
  - In this state, the chain MUST advance as all the blocks up to BestTargetHead are validated.
  - New blocks are finalized as they cross the finality threshold (ValidG.Heads[0].ChainEpoch - FinalityLookback)
  - New finalized blocks are reported to consumers.
  - The chain state provided includes the Blocks and StateTree for the Finality epoch, as well as candidate Blocks and StateTrees for unfinalized epochs.
- security conditions to transition out:
  - Temporary network partitions (see Detecting Network Partitions).
  - Encountering gaps of > MaxBlockGap or > MaxEpochGap between the Validated set and a new ChainSync.BestTargetHead
- transitions out:
  - if a temporary network partition is detected: move to CHAIN_CATCHUP
  - if BlockGap > MaxBlockGap: move to CHAIN_CATCHUP
  - if EpochGap > MaxEpochGap: move to CHAIN_CATCHUP
  - if node is shut down: move to INIT
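The gap check that drives the fallback to CHAIN_CATCHUP can be sketched as follows (method and parameter names are assumptions; the thresholds come from the list above):

```go
// Gap thresholds from the spec text above.
const (
	MaxBlockGap = 2  // blocks behind BestTargetHead (does not include null blocks)
	MaxEpochGap = 10 // epochs behind BestTargetHead (includes null blocks)
)

// CheckFollowGaps drops ChainSync back to CHAIN_CATCHUP when it falls too far
// behind the best head, or when a temporary network partition is detected.
func (c *ChainSync) CheckFollowGaps(blockGap, epochGap int, partitioned bool) {
	if partitioned || blockGap > MaxBlockGap || epochGap > MaxEpochGap {
		c.ChainCatchup()
	}
}
```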
Block Fetching, Validation, and Propagation #
Notes on changing TargetHeads while syncing #

- TargetHeads is changing, as ChainSync must be aware of the best heads at any time. Reorgs happen, and our first set of peers could’ve been bad, so we keep discovering others.
- The Hello protocol is good, but it’s polling. Unless the node is constantly polling, it won’t see all the heads. BlockPubsub gives us the realtime view into what’s actually going on.
- Weight can also be close between 2+ possible chains (long-forked), and ChainSync must select the right one (which we may not be able to distinguish until validating all the way).
- Fetching + validation are strictly faster per round on average than blocks produced/block time (if they’re not, we will always fall behind), so we definitely catch up eventually (and even quickly). The last couple rounds can be close (“almost got it, almost got it, there”).
General notes on fetching Blocks #
- ChainSync selects and maintains a set of the most likely heads to be correct from among those received via BlockPubsub. As more blocks are received, the set of TargetHeads is reevaluated.
- ChainSync fetches Blocks, Messages, and StateTree through the Graphsync protocol.
- ChainSync maintains sets of Blocks/Tipsets in Graphs (see ChainSync.id)
- ChainSync gathers a list of TargetHeads from BlockPubsub, sorted by likelihood of being the best chain (see below).
- ChainSync makes requests for chains of BlockHeaders to close gaps between TargetHeads
- ChainSync forms partial unvalidated chains of BlockHeaders, from those received via BlockPubsub, and those requested via Graphsync.
- ChainSync attempts to form fully connected chains of BlockHeaders, parting from StateTree, toward observed Heads
- ChainSync minimizes resource expenditures to fetch and validate blocks, to protect against DOS attack vectors.
- ChainSync employs Progressive Block Validation, validating different facets at different stages of syncing.
- ChainSync delays syncing Messages until they are needed. Much of the structure of the partial chains can be checked and used to make syncing decisions without fetching the Messages.
Progressive Block Validation #
Blocks may be validated in progressive stages, in order to minimize resource expenditure.
Validation computation is considerable, and a serious DOS attack vector.
Secure implementations must carefully schedule validation and minimize the work done by pruning blocks without validating them fully.
ChainSync SHOULD keep a cache of unvalidated blocks (ideally sorted by likelihood of belonging to the chain), and delete unvalidated blocks when they are passed by FinalityTipset, or when ChainSync is under significant resource load. These stages can be used partially across many blocks in a candidate chain, in order to prune out clearly bad blocks long before actually doing the expensive validation work.
Progressive Stages of Block Validation
- BV0 - Syntax: Serialization, typing, value ranges.
- BV1 - Plausible Consensus: Plausible miner, weight, and epoch values (e.g. from chain state at b.ChainEpoch - consensus.LookbackParameter).
). - BV2 - Block Signature
- BV3 - ElectionPoSt: Correct PoSt with a winning ticket.
- BV4 - Chain ancestry and finality: Verify block links back to trusted chain, not prior to finality.
- BV5 - Message Signatures:
- BV6 - State tree: Parent tipset message execution produces the claimed state tree root and receipts.
Notes:
- in CHAIN_CATCHUP, if a node is receiving/fetching hundreds/thousands of BlockHeaders, validating signatures can be very expensive, and can be deferred in favor of other validation. (ie with lots of BlockHeaders coming in through the network pipe, we don’t want to be bound on sig verification; other checks can help dump blocks on the floor faster (BV0, BV2))
- in CHAIN_FOLLOW, we’re not receiving thousands, we’re receiving maybe a dozen or two dozen packets in a few seconds. We receive the cid with Sig and addr first (ideally fits in 1 packet), and can afford to (a) check if we already have the cid (if so, done, cheap), or (b) if not, check if the sig is correct before fetching the header (expensive computation, but checking 1 sig is way faster than checking a ton). In practice, which one to do likely depends on miner tradeoffs. We’ll recommend something but let miners decide, because one strategy or the other may be much more effective depending on their hardware, their bandwidth limitations, or their propensity to getting DOSed.
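A sketch of how the progressive stages might be scheduled, cheapest first, so clearly bad blocks are pruned before expensive work is done (the check methods and the BlockHeader type are hypothetical):

```go
// Run the progressive validation stages in order, stopping at the first failure.
func (c *ChainSync) ValidateProgressively(b *BlockHeader) error {
	checks := []func(*BlockHeader) error{
		c.CheckSyntax,             // BV0: serialization, typing, value ranges
		c.CheckPlausibleConsensus, // BV1: plausible miner, weight, and epoch values
		c.CheckBlockSignature,     // BV2: block signature
		c.CheckElectionPoSt,       // BV3: correct PoSt with a winning ticket
		c.CheckAncestryFinality,   // BV4: links back to trusted chain, not prior to finality
		c.CheckMessageSignatures,  // BV5: message signatures
		c.CheckStateTree,          // BV6: parent tipset execution matches state root and receipts
	}
	for _, check := range checks {
		if err := check(b); err != nil {
			return err // prune this block; skip the more expensive stages
		}
	}
	return nil
}
```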
Progressive Block Propagation (or BlockSend) #
- In order to make Block propagation more efficient, we trade off network round trips for bandwidth usage.
- Motivating observations:
  - Block propagation is one of the most security critical points of the whole protocol.
  - Bandwidth usage during Block propagation is the biggest rate limiter for network scalability.
  - The time it takes for a Block to propagate to the whole network is a critical factor in determining a secure BlockTime.
  - Blocks propagating through the network should take as few sequential roundtrips as possible, as these roundtrips impose serious block time delays. However, interleaved roundtrips may be fine. Meaning that block.CIDs may be propagated on their own, without the header, then the header without the messages, then the messages.
  - Blocks will propagate over a libp2p.PubSub. libp2p.PubSub.Messages will most likely arrive multiple times at a node. Therefore, using only the block.CID here could make this very cheap in bandwidth (more expensive in round trips).
  - Blocks in a single epoch may include the same Messages, and duplicate transfers can be avoided.
  - Messages propagate through their own MessagePubsub, and nodes have a significant probability of already having a large fraction of the messages in a block. Since messages are the bulk of the size of a Block, this can present great bandwidth savings.
- Progressive Steps of Block Propagation (a receiver-side sketch appears after this list)
  - IMPORTANT NOTES:
    - These can be effectively pipelined. The receiver is in control of what to pull, and when. It is up to them to decide when to trade off RTTs for bandwidth.
    - If the sender is propagating the block at all to receiver, it is in their interest to provide the full content to receiver when asked. Otherwise the block may not get included at all.
    - Lots of security assumptions here – this needs to be hyper verified, in both spec and code.
    - sender is a filecoin node running ChainSync, propagating a block via Gossipsub (as the originator, as another peer in the network, or just a Gossipsub router).
    - receiver is the local filecoin node running ChainSync, trying to get the blocks.
    - For receiver to Pull things from sender, receiver must connect to sender. Usually sender is sending to receiver because of the Gossipsub propagation rules. receiver could choose to Pull from any other node they are connected to, but it is most likely sender will have the needed information. They usually will be more well-connected in the network.
  - Step 1. (sender) Push BlockHeader:
    - sender sends block.BlockHeader to receiver via Gossipsub: bh := Gossipsub.Send(h block.BlockHeader)
    - This is a light-ish object (<4KB).
    - receiver receives bh.
      - This has many fields that can be validated before pulling the messages. (See Progressive Block Validation.)
      - BV0, BV1, BV2, and BV3 validation takes place before propagating bh to other nodes.
    - receiver MAY receive many advertisements for each winning block in an epoch in quick succession. This is because (a) many want propagation as fast as possible, (b) many want to make those network advertisements as light as reasonable, (c) we want to enable receiver to choose who to ask it from (usually the first party to advertise it, and that’s what the spec will recommend), and (d) we want to be able to fall back to asking others if that fails (fail = don’t get it in 1s or so).
  - Step 2. (receiver) Pull MessageCids:
    - upon receiving bh, receiver checks whether it already has the full block for bh.BlockCID. if not:
      - receiver requests bh.MessageCids from sender: bm := Graphsync.Pull(sender, SelectAMTCIDs(b.Messages))
  - Step 3. (receiver) Pull Messages:
    - If receiver DOES NOT already have all the messages for b.BlockCID, then:
      - If receiver has some of the messages:
        - receiver requests missing Messages from sender: Graphsync.Pull(sender, SelectAll(bm[3], bm[10], bm[50], ...)) or for m in bm { Graphsync.Pull(sender, SelectAll(m)) }
      - If receiver does not have any of the messages (default safe but expensive thing to do):
        - receiver requests all Messages from sender: Graphsync.Pull(sender, SelectAll(bh.Messages))
        - (This is the largest amount of stuff.)
  - Step 4. (receiver) Validate Block:
    - The only remaining thing to do is to complete Block Validation.
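The receiver-side flow of Steps 2–4 could look roughly like this (the helpers mirror the pseudo-calls above and are not a normative API):

```go
// Illustrative receiver-side handling of a pushed BlockHeader.
func (c *ChainSync) OnBlockHeader(sender Peer, bh BlockHeader) error {
	// Step 1 already happened on the sender side; BV0-BV3 run before we
	// propagate bh further.
	if c.HasFullBlock(bh.BlockCID) {
		return nil // nothing to pull
	}

	// Step 2: pull the message CIDs for this block.
	bm := c.Graphsync.Pull(sender, SelectAMTCIDs(bh.Messages))

	// Step 3: pull only the messages we are missing (or all of them, if we
	// have none). Messages already seen on MessagePubsub are not re-fetched.
	if missing := c.MissingMessages(bm); len(missing) > 0 {
		c.Graphsync.Pull(sender, SelectAll(missing...))
	}

	// Step 4: complete the remaining Block Validation stages.
	return c.ValidateBlock(bh)
}
```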
Calculations #
Security Parameters #
- Peers >= 32 – direct connections
  - ideally Peers >= {64, 128}
Pubsub Bandwidth #
These bandwidth calculations are used to motivate choices in ChainSync.
If you imagine that you will receive the header once per gossipsub peer (or if lucky, half of them), and that there is EC.E_LEADERS=10 blocks per round, then we’re talking the difference between:
16 peers, 1 pkt -- 1 * 16 * 10 = 160 dup pkts (256KB) in <5s
16 peers, 4 pkts -- 4 * 16 * 10 = 640 dup pkts (1MB) in <5s
32 peers, 1 pkt -- 1 * 32 * 10 = 320 dup pkts (512KB) in <5s
32 peers, 4 pkts -- 4 * 32 * 10 = 1,280 dup pkts (2MB) in <5s
64 peers, 1 pkt -- 1 * 64 * 10 = 640 dup pkts (1MB) in <5s
64 peers, 4 pkts -- 4 * 64 * 10 = 2,560 dup pkts (4MB) in <5s
2MB in <5s may not be worth saving – and maybe gossipsub can be much better about suppressing dups.
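The arithmetic above can be reproduced with a tiny helper (the ~1.6KB packet size is inferred from the 160 pkts = 256KB row, and is an assumption):

```go
// dupTraffic estimates duplicate header announcements per round.
func dupTraffic(peers, pktsPerHeader, blocksPerRound int) (pkts int, kb float64) {
	pkts = pktsPerHeader * peers * blocksPerRound
	kb = float64(pkts) * 1.6 // approx. packet size in KB, implied by the table above
	return
}

// Example: dupTraffic(32, 4, 10) == 1280 pkts, ~2048 KB (2MB) in <5s.
```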
Notes (TODO: move elsewhere) #
Checkpoints #
- A checkpoint is the CID of a block (not a tipset list of CIDs, or StateTree)
- The reason a block is OK is that it uniquely identifies a tipset.
- using tipsets directly would make Checkpoints harder to communicate. We want to make checkpoints a single hash, as short as we can have it. They will be shared in tweets, URLs, emails, printed into newspapers, etc. Compactness, ease of copy-paste, etc matters.
- We’ll make human readable lists of checkpoints, and making “lists of lists” is more annoying.
- When we have EC.E_PARENTS > 5 or = 10, tipsets will get annoyingly large.
- The big quirk/weirdness with blocks is that the block also must be in the chain. (If you relaxed that constraint you could end up in a weird case where a checkpoint isn’t in the chain, and that’s weird/violates assumptions.)
Bootstrap chain stub #
- The mainnet filecoin chain will need to start with a small chain stub of blocks.
- We must include some data in different blocks.
- We do need a genesis block – we derive randomness from the ticket there. Rather than special casing, it is easier/less complex to ensure a well-formed chain always, including at the beginning.
- A lot of code expects lookbacks, especially actor code. Rather than introducing a bunch of special case logic for what happens ostensibly once in network history (special case logic which adds complexity and likelihood of problems), it is easiest to assume the chain is always at least X blocks long, and the system lookback parameters are all fine and don’t need to be scaled in the beginning of the network’s history.
PartialGraph #
The PartialGraph of blocks.
Is a graph necessarily connected, or is this just a bag of blocks, with each disconnected subgraph being reported in heads/tails?
The latter. The partial graph is a DAG fragment – including disconnected components. Here’s a visual example: 4 example PartialGraphs, with Heads and Tails. (Note they aren’t tipsets.)
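A minimal sketch of such a PartialGraph, as a bag of blocks whose disconnected components are reported via Heads and Tails (field names and types are assumptions):

```go
// PartialGraph sketch: a DAG fragment of blocks, possibly disconnected.
type PartialGraph struct {
	Blocks map[BlockCID]BlockHeader // all blocks currently in the fragment
	Heads  []BlockCID               // blocks that no other block in the graph points to
	Tails  []BlockCID               // blocks whose parents are not in the graph
}
```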