This is a ground-up introduction to the different kinds of snapshot files, covering:

  1. Actors in Filecoin
  2. The Filecoin blockchain
  3. The Filecoin state tree
  4. (Finally) snapshots

§Actors

The Filecoin Virtual Machine (FVM) hosts a number of actors. These are objects that maintain and mutate internal state, and communicate by passing messages.

An example of an actor is the cron actor. Its internal state is a to-do list of other actors to invoke every epoch.

See the Filecoin docs for more information about actors.
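
To make the actor model concrete, here is a minimal sketch of a cron-like actor in Rust. All names here (`ActorId`, `Message`, `CronActor`, the `epoch_tick` method name) are hypothetical and chosen for illustration only; they are not the FVM's API, and the real cron actor is more involved.

```rust
/// Hypothetical actor identifier; real Filecoin actors are addressed differently.
type ActorId = u64;

/// A message passed from one actor to another (illustrative shape only).
#[derive(Debug, Clone, PartialEq)]
struct Message {
    from: ActorId,
    to: ActorId,
    method: String,
}

/// Toy cron actor: its internal state is a to-do list of actors to invoke.
struct CronActor {
    id: ActorId,
    todo: Vec<ActorId>,
}

impl CronActor {
    /// Mutate internal state by registering another actor (duplicates are ignored).
    fn register(&mut self, actor: ActorId) {
        if !self.todo.contains(&actor) {
            self.todo.push(actor);
        }
    }

    /// Called once per epoch: produce one message per registered actor.
    fn tick(&self) -> Vec<Message> {
        self.todo
            .iter()
            .map(|&to| Message {
                from: self.id,
                to,
                method: "epoch_tick".to_string(), // assumed method name
            })
            .collect()
    }
}
```

The point of the sketch is only that an actor owns its state, changes it in response to calls, and talks to other actors exclusively through messages.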

§The Filecoin blockchain

Filecoin consists of a blockchain of messages. Listed below are the core objects for the blockchain. Each one can be addressed by a Cid.

  • Messages are statements of communication between actors. Each one describes, and equivalently represents, a change in the state tree (see below). See apply_block_messages to learn more. Messages may be signed.
  • Messages are grouped into Blocks, each with a single header. Blocks are what miners mine to earn FIL (money). Each block defines an epoch and a parent tipset. The epoch is a monotonically increasing number starting from 0 (genesis).
  • Blocks are grouped into Tipsets. All blocks in a tipset share the same epoch.
     ┌───────────────────────────────┐
     │ BlockHeader { epoch:  0, .. } │ //  The genesis block/tipset
  ┌● └───────────────────────────────┘
  ~
  └──┬───────────────────────────────┐
     │ BlockHeader { epoch: 10, .. } │ // The epoch 10 tipset - one block with two messages
  ┌● └┬──────────────────────────────┘
  │   │
  │   │ "I contain the following messages..."
  │   │
  │   ├──────────────────┐
  │   │ ┌──────────────┐ │ ┌───────────────────┐
  │   └►│ Message:     │ └►│ Message:          │
  │     │  Afri -> Bob │   │  Charlie -> David │
  │     └──────────────┘   └───────────────────┘
  │
  │ "my parent is..."
  │
  └──┬───────────────────────────────┐
     │ BlockHeader { epoch: 11, .. } │ // The epoch 11 tipset - one block with one message
  ┌● └┬──────────────────────────────┘
  │   │ ┌────────────────┐
  │   └►│ Message:       │
  │     │  Eric -> Frank │
  │     └────────────────┘
  │
  │ // the epoch 12 tipset - two blocks, with a total of 3 messages
  │
  ├────────────────────────────────────┐
  └──┬───────────────────────────────┐ └─┬───────────────────────────────┐
     │ BlockHeader { epoch: 12, .. } │   │ BlockHeader { epoch: 12, .. } │
  ┌● └┬──────────────────────────────┘   └┬─────────────────────┬────────┘
  ~   │ ┌───────────────────────┐         │ ┌─────────────────┐ │ ┌──────────────┐
      └►│ Message:              │         └►│ Message:        │ └►│ Message:     │
        │  Guillaume -> Hailong │           │  Hubert -> Ivan │   │  Josh -> Kai │
        └───────────────────────┘           └─────────────────┘   └──────────────┘
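
The grouping rule in the diagram (every block in a tipset shares one epoch) can be sketched as a constructor check. This is a simplified illustration under stated assumptions: `BlockHeader` here carries only an epoch and string stand-ins for message CIDs, and the `Tipset` type is hypothetical rather than the one used by any Filecoin implementation.

```rust
/// Simplified block header: only the fields needed for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct BlockHeader {
    epoch: u64,
    /// Stand-ins for message CIDs (assumption: real code stores Cids).
    messages: Vec<String>,
}

/// A non-empty group of blocks that all share the same epoch.
#[derive(Debug)]
struct Tipset {
    blocks: Vec<BlockHeader>,
}

impl Tipset {
    /// Build a tipset, rejecting empty input and mixed epochs.
    fn new(blocks: Vec<BlockHeader>) -> Result<Tipset, String> {
        let epoch = match blocks.first() {
            Some(first) => first.epoch,
            None => return Err("a tipset needs at least one block".to_string()),
        };
        for block in &blocks {
            if block.epoch != epoch {
                return Err(format!(
                    "block at epoch {} does not match tipset epoch {}",
                    block.epoch, epoch
                ));
            }
        }
        Ok(Tipset { blocks })
    }

    /// The epoch shared by every block in the tipset.
    fn epoch(&self) -> u64 {
        self.blocks[0].epoch
    }

    /// Total number of message references across all blocks.
    fn message_count(&self) -> usize {
        self.blocks.iter().map(|b| b.messages.len()).sum()
    }
}
```

With the epoch 12 tipset from the diagram (two blocks holding one and two messages), `message_count` would report three.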

The ChainMuxer receives two kinds of messages from peers and assembles them into a chain back to genesis.

Filecoin implementations store all the above in the ChainStore, per the spec.

§The Filecoin state tree

Messages describe/represent mutations in the StateTree, which is a representation of all Filecoin state at a point in time. For each actor, the StateTree holds the CID for its state: ActorState.state.

Actor state is serialized and stored as Ipld. Think of this as “JSON with links (Cids)”. So the state of the cron actor mentioned above is ultimately serialized into Ipld and stored in the StateStore, per the spec.
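
As a rough picture of “JSON with links”, the sketch below models a cut-down IPLD value and collects the links it contains. It is an assumption-laden toy: the `Ipld` enum here is hand-written for illustration (real IPLD libraries have more variants, such as bytes and floats), and `Cid` is a plain `u64` standing in for a real content identifier.

```rust
use std::collections::BTreeMap;

/// Stand-in for a real CID (assumption: real CIDs are multihash-based).
type Cid = u64;

/// Cut-down, hypothetical IPLD data model: JSON-like values plus links.
#[derive(Debug, Clone, PartialEq)]
enum Ipld {
    Null,
    Bool(bool),
    Integer(i64),
    String(String),
    List(Vec<Ipld>),
    Map(BTreeMap<String, Ipld>),
    /// A pointer to another item, addressed by its CID.
    Link(Cid),
}

/// Recursively collect every link found inside `node`, in traversal order.
fn links(node: &Ipld, out: &mut Vec<Cid>) {
    match node {
        Ipld::Link(cid) => out.push(*cid),
        Ipld::List(items) => {
            for item in items {
                links(item, out);
            }
        }
        Ipld::Map(map) => {
            for value in map.values() {
                links(value, out);
            }
        }
        // Scalars contain no links.
        Ipld::Null | Ipld::Bool(_) | Ipld::Integer(_) | Ipld::String(_) => {}
    }
}
```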

It isn’t feasible to create a new copy of actor states whenever they change. That is, in a fictional example [1] of a cron actor, starting with a crontab with 10 items, mutation of the state should not simply duplicate the state:

Previous state             Current state
┌───────────────────────┐  ┌───────────────────────┐
│Crontab                │  │Crontab                │
│1. Get out of bed      │  │1. Get out of bed      │
│2. Shower              │  │2. Shower              │
│...                    │  │...                    │
│10. Take over the world│  │10. Take over the world│
└───────────────────────┘  │11. Throw a party      │
                           └───────────────────────┘

But should instead be able to refer to the previous state:

Previous state             Current state
┌───────────────────────┐  ┌─────────────────┐
│Crontab                │◄─┤(See CID...)     │
│1. Get out of bed      │  ├─────────────────┤
│2. Shower              │  │11. Throw a party│
│...                    │  └─────────────────┘
│10. Take over the world│
└───────────────────────┘

Removal of, e.g., the latest entry works similarly, orphaning the removed item.

Previous state             Orphaned item        Current state
┌───────────────────────┐                       ┌────────────┐
│Crontab                │◄──────────────────────┤(See CID...)│
│1. Get out of bed      │  ┌─────────────────┐  └────────────┘
│2. Shower              │  │11. Throw a party│
│...                    │  └─────────────────┘
│10. Take over the world│
└───────────────────────┘

Data structures that reach into the past of the StateStore like this are:

  • AMT, a list.
  • HAMT, a map.

Therefore, the Filecoin state is, indeed, a tree of IPLD data. It can be addressed by the root of the tree, so it is often referred to as the state root.
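
The "refer to the previous state" idea from the crontab diagrams can be sketched with a toy content-addressed store. Everything here is hypothetical: `Cid` is a `u64` derived from the standard library's `DefaultHasher` (a stand-in for real CID hashing), and `Node` is a simple linked chunk rather than a real AMT or HAMT.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a real CID.
type Cid = u64;

/// A chunk of crontab entries plus an optional link to the previous state.
#[derive(Debug, Clone, Hash, PartialEq)]
struct Node {
    prev: Option<Cid>,
    items: Vec<String>,
}

/// Toy blockstore: items are stored under the hash of their content.
#[derive(Default)]
struct Blockstore {
    blocks: HashMap<Cid, Node>,
}

impl Blockstore {
    /// Store a node and return its content-derived identifier.
    fn put(&mut self, node: Node) -> Cid {
        let mut hasher = DefaultHasher::new();
        node.hash(&mut hasher);
        let cid = hasher.finish();
        self.blocks.insert(cid, node);
        cid
    }

    /// Read the whole list reachable from `root`, oldest entries first.
    /// Returns None if any link along the way is missing from the store.
    fn read(&self, root: Cid) -> Option<Vec<String>> {
        let node = self.blocks.get(&root)?;
        let mut items = match node.prev {
            Some(prev) => self.read(prev)?,
            None => Vec::new(),
        };
        items.extend(node.items.iter().cloned());
        Some(items)
    }
}

/// Append by writing only the new entry and linking to the old state.
fn append(store: &mut Blockstore, prev: Cid, item: &str) -> Cid {
    store.put(Node {
        prev: Some(prev),
        items: vec![item.to_string()],
    })
}
```

After an `append`, both the old and the new root remain readable, and the new node holds only the added entry. To "remove" that entry, the current root simply goes back to the old CID, leaving the appended node in the store but unreferenced.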

We will now introduce some new terminology given the above information.

With respect to a particular IPLD Blockstore:

  • An item such as a list is fully inhabited if all its recursive Ipld::Links exist in the blockstore.
  • Otherwise, an item is only partially inhabited. Its missing links are said to be “dead links”.

With respect to a particular StateTree:

  • An item is orphaned if it is not reachable from the current state tree through any links.
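
These three definitions can be checked mechanically with a graph walk. The sketch below assumes a toy blockstore that maps each (hypothetical, `u64`) CID to the list of links its item contains; the function names `walk`, `is_fully_inhabited`, and `orphans` are made up for this illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for a real CID.
type Cid = u64;

/// Toy blockstore: each stored item is reduced to the links it contains.
type Blockstore = HashMap<Cid, Vec<Cid>>;

/// Follow all links from `root`.
/// Returns (items found in the store, dead links that were not found).
fn walk(store: &Blockstore, root: Cid) -> (HashSet<Cid>, HashSet<Cid>) {
    let mut found = HashSet::new();
    let mut dead = HashSet::new();
    let mut stack = vec![root];
    while let Some(cid) = stack.pop() {
        if found.contains(&cid) || dead.contains(&cid) {
            continue; // already visited
        }
        match store.get(&cid) {
            Some(links) => {
                found.insert(cid);
                stack.extend(links.iter().copied());
            }
            None => {
                dead.insert(cid);
            }
        }
    }
    (found, dead)
}

/// Fully inhabited: no dead links anywhere below `root`.
fn is_fully_inhabited(store: &Blockstore, root: Cid) -> bool {
    walk(store, root).1.is_empty()
}

/// Orphaned items: stored, but unreachable from `state_root`.
fn orphans(store: &Blockstore, state_root: Cid) -> HashSet<Cid> {
    let (reachable, _) = walk(store, state_root);
    store
        .keys()
        .copied()
        .filter(|cid| !reachable.contains(cid))
        .collect()
}
```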

§Snapshots

Recall that for each message execution, the state tree is mutated. Therefore, each epoch is associated with a state tree after execution, and a parent state tree.

                                           // state after execution of
                                           // all messages in that epoch
     ┌───────────────────────────────┐ ┌────────────┐
     │ BlockHeader { epoch:  0, .. } │ │ state root ├──► initial actor states...
  ┌● └───────────────────────────────┘ └────────────┘                    ▲   ▲
  ~                                        // links to redundant data ─● │   │
  └──┬───────────────────────────────┐ ┌────────────┐                    │   │
     │ BlockHeader { epoch: 11, .. } │ │ state root ├─┬► actor state ─► AMT  │
  ┌● └┬──────────────────────────────┘ └────────────┘ ~                      │
  │   │ ┌─────────┐                                   └► actor state ─► HAMT ┘
  │   └►│ Message │                                                      │
  │     └─────────┘                                                      ▼
  ├──┬───────────────────────────────┐     // new data in this epoch ─● IPLD
  │  │ BlockHeader { epoch: 12, .. } │
  │  └┬─────────────┬────────────────┘
  │   │ ┌─────────┐ │ ┌─────────┐
  │   └►│ Message │ └►│ Message │
  │     └─────────┘   └─────────┘                                        ~   ~
  └──┬───────────────────────────────┐ ┌────────────┐                    │   │
     │ BlockHeader { epoch: 12, .. } │ │ state root ├─┬► actor state ─► AMT  │
  ┌● └┬──────────────────────────────┘ └────────────┘ ~                      │
  ~   │ ┌─────────┐                                   └► actor state ─► HAMT ┘
      └►│ Message │
        └─────────┘

We are now ready to define the different snapshot types for a given epoch N.

  • A lite snapshot contains:
    • All block headers from genesis to epoch N.
    • For the last W (width) epochs:
      • The fully inhabited state trees.
      • The messages.
    • For epochs 0..N-W, the state trees will be dead or partially inhabited.
  • A full snapshot contains:
    • All block headers from genesis to epoch N.
    • The fully inhabited state trees for epochs 0..N.
  • A diff snapshot contains:
    • For epochs N-W..N:
      • The block headers.
      • The messages.
      • New data in that epoch, which will be partially inhabited.

Successive diff snapshots may be concatenated:

  • From genesis, to produce a full snapshot.
  • From a lite snapshot, to produce a successive lite snapshot.
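
The definitions above can be summarized as epoch ranges. The following sketch is an interpretation, not a specification: it assumes "the last W epochs" means the inclusive range N-W+1..=N (the text does not pin down the boundary), and the names `SnapshotKind`, `header_epochs`, and `inhabited_epochs` are hypothetical.

```rust
use std::ops::RangeInclusive;

#[derive(Debug, Clone, Copy, PartialEq)]
enum SnapshotKind {
    Lite,
    Full,
    Diff,
}

/// First epoch of the trailing window of width `w` ending at epoch `n`.
/// Assumption: the window is n-w+1..=n, clamped at genesis (epoch 0).
fn window_start(n: u64, w: u64) -> u64 {
    (n + 1).saturating_sub(w)
}

/// Epochs whose block headers are included in the snapshot.
fn header_epochs(kind: SnapshotKind, n: u64, w: u64) -> RangeInclusive<u64> {
    match kind {
        SnapshotKind::Lite | SnapshotKind::Full => 0..=n,
        SnapshotKind::Diff => window_start(n, w)..=n,
    }
}

/// Epochs whose state trees are fully inhabited, if any.
/// A diff snapshot only carries new data, so it guarantees none.
fn inhabited_epochs(kind: SnapshotKind, n: u64, w: u64) -> Option<RangeInclusive<u64>> {
    match kind {
        SnapshotKind::Full => Some(0..=n),
        SnapshotKind::Lite => Some(window_start(n, w)..=n),
        SnapshotKind::Diff => None,
    }
}
```

Under this reading, a lite snapshot at epoch 100 with width 10 has headers for epochs 0 through 100 but fully inhabited state trees only for epochs 91 through 100.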

  1. The real cron actor doesn’t mutate state like this.