title |
---|
Warp Sync Snapshot Format |
Warp sync extends previous versions of the protocol with full state snapshots. These snapshots can be used to quickly get a full copy of the state at a given block. Every 5,000 blocks, nodes will take a consensus-critical snapshot of that block's state. Any node can fetch these snapshots over the network, enabling a fast sync. These snapshots have been designed with out-of-order restoration in mind -- it isn't required to get any given chunk before another.
The snapshot format is three-part: block chunks, state chunks, and the manifest.
Every chunk is run through snappy compression, and then hashed. Before compression, chunks are created to be approximately CHUNK_SIZE
bytes.
CHUNK_SIZE
= 4MiB
, but this may be subject to change.
All data structures should be assumed to be RLP-encoded unless specified otherwise.
This contains metadata about the snapshot itself, and is used to coordinate snapshots between nodes. Every snapshot will have a unique manifest, so two identical manifests will refer to the same snapshot.
The manifest is an rlp-encoded list of the following format:
[
version: P, // snapshot format version. Must be set to 2.
state_hashes: [hash_1: B_32, hash_2: B_32, ...], // a list of all the state chunks in this snapshot
block_hashes: [hash_1: B_32, hash_2: B_32, ...], // a list of all the block chunks in this snapshot
state_root: B_32, // the root which the rebuilt state trie should have. used to ensure validity
block_number: P, // the number of the best block in the snapshot; the one which the state coordinates to.
block_hash: B_32, // the best block in the snapshot's hash.
]
The chunk hashes here are not the hashes of the raw chunk data, but rather of their snappy-compressed data. This is a canonical identifier for any given chunk.
NOTE: This is the block chunk format for Ethash PoW-based engines. Other consensus engines may have their own block chunk format.
Block chunks contain raw block data: blocks themselves, and their transaction receipts. The blocks are stored in the "abridged block" format (referred to by AB
), and the the receipts are stored in a list: [receipt_1: P, receipt_2: P, ...]
(referred to by RC
).
Each block chunk is an rlp-encoded list of the following format:
[
number: P, // number of the first block in the chunk
hash: B_32, // hash of the first block in the chunk
td: B_32, // total difficulty of the first block in the chunk
[abridged_1: AB, receipts_1: RC], // The abridged RLP for the first block, and its receipts.
[abridged_2: AB, receipts_2: RC], // The abridged RLP for the second block, and its receipts.
[abridged_3: AB, receipts_3: RC], // ... and so on.
...
]
The blocks within must be sequential, so that their abridged RLP can be assembled into entire blocks simply using the metadata at the beginning.
This differs from the standard block RLP format by omitting some fields which can be reproduced from the metadata at the beginning of the block chunk.
An abridged block takes the following form:
[
// some header fields
author: B_20,
state_root: B_32,
log_bloom: B_256,
difficulty: B_32,
gas_limit: B_32,
gas_used: B_32,
timestamp: P,
extra_data: P,
// uncles and transactions inline
[tx_1: P, tx_2: P, ...],
[uncle_1: P, uncle_2: P, ...],
// Seal fields
mix_hash: B_32,
nonce: B_8,
]
Definitions:
Let NUM(X) denote the number of block X.
Let HASH(X) denote the hash of block X.
Let TD(X) denote the total difficulty of block X.
Algorithm:
Let B be the most recent block.
Let Btarget = B.
Let Bfinish = B - 30000 + 1
.
Let Scurrent = 0.
Walk backwards from block B until reaching Bfinish .
For each block Bx:
- generate the list
[abridged: AB, receipts: RC]
Dx - Note its size: Sx = SIZE(Dx).
- Scurrent = Scurrent + Sx.
- If Scurrent >
CHUNK_SIZE
:- Let Scurrent = Scurrent -
CHUNK_SIZE
- Build the chunk [ NUM(Bx+1), HASH(Bx+1), TD(Bx + 1), Dx + 1, Dx + 2, ..., DBtarget ]
- Set Btarget = Bx.
- Let Scurrent = Scurrent -
At the end, if Scurrent > 0, write out the remaining chunk from block Bfinish to Btarget: [ NUM(Bfinish), HASH(Bfinish), TD(Bfinish), Dfinish, ..., DBtarget ]
The manifest's block_chunks
list must contain the chunks' hashes in an ordering such that the hash of any given chunk precedes no hashes of any other chunk which contains blocks with a higher number.
State chunks store the entire state of a given block. A "rich" account structure is used to save space.
Each state chunk consists of a list of lists, each with two items: an address' sha3
hash and a rich account structure correlating with it.
We call these lists "account entries".
[ [hash1: B_32, acc_1: P], [hash_2: B_32, acc_2: P], ... ]
.
We define the ordering of account to be first lexicographic in the address hash, and if those are equal, then lexicographic in the storage lists of the rich account structure.
The rich account structure encodes the usual account data such as the nonce, balance, and code, as well as the full storage.
More formally, it is an RLP list in the following format:
[
nonce: B_32,
balance: B_32,
code_flag: B_1,
code: P,
storage: [[keyhash1: B_32, val1: B_32], [keyhash2: B_32, val2: B_32], ...]
]
code_flag
is a single byte which will determine what the code
data will be:
- if
0x00
, the account has no code andcode
is the single byte0x80
, signifying RLP empty data. - if
0x01
, the account has code, andcode
stores an arbitrary-length list of bytes containing the code. - if
0x02
, the account has code, andcode
stores a 32-byte big-endian integer which is the hash of the code. The code's hash must be substituted if and only if another account which has a smaller account entry has the same code.
storage
is a list of the entire account's storage, where the items are RLP lists of length two -- the first item being sha3(key)
, and the second item being the storage value. This storage list must be sorted in ascending order by key-hash.
If storage
is large enough that the rich account structure would bring the internal size (see the Validity section) of the chunk to over CHUNK_SIZE
, only the prefix of storage
that would keep the internal size of the chunk within CHUNK_SIZE
will be included. We will call the unincluded remainder storage'
. A new state chunk will begin with an account entry of the same account, but with storage set to the prefix of storage'
which will fit in the chunk, and so on.
Aside from those given above, there are a couple more requirements for a set of valid state chunks.
We define the internal size SC of a chunk C to be the sum of the sizes of the RLP lists contained within.
Any given chunk C has a valid internal size SC if and only if SC <= CHUNK_SIZE
.
A set of state chunks S is valid if and only if:
- for any two arbitrary selected account entries A1 and A2 from any given state chunk Si , where A1 < A2, there exists no A3 from another state chunk Sj such that A1 < A3 < A2 .
- there is no other valid configuration of chunks containing the same data such that for each chunk Ci , except the one containing the maximum account entry, SCi is a valid internal size. In plainer terms, every chunk except the last must be "maximally packed".
- the account entries within any given state chunk must be sorted in ascending order.
- if there is only one list A1 in a state chunk, then there may be no other state chunk containing two account entries A2 and A3 such that A2 < A1 < A3.
The state_chunks
list in the snapshot manifest must be sorted by the first account entry contained within each one.
This version introduces the version
field in the manifest and adds