Forest is an implementation of Filecoin written in Rust. The implementation takes a modular approach to building a full Filecoin node in two parts — (i) building Filecoin’s security critical systems in Rust from the Filecoin Protocol Specification, specifically the virtual machine, blockchain, and node system, and (ii) integrating functional components for storage mining and storage & retrieval markets to compose a fully functional Filecoin node implementation.
Functionality
- Filecoin State Tree Synchronization
- Filecoin JSON-RPC Server
- Ergonomic Message Pool
- Wallet CLI
- Process Metrics & Monitoring
Disclaimer
The Forest implementation of the Filecoin protocol is alpha software which should not yet be integrated into production workflows. The team is working to provide reliable, secure, and efficient interfaces to the Filecoin ecosystem. If you would like to chat, please reach out over Discord on the ChainSafe server linked above.
Basic Usage
Build
Dependencies
- Rust
rustc >= 1.58.1
- Rust WASM target
wasm32-unknown-unknown
rustup install stable
rustup target add wasm32-unknown-unknown
- OS Base-Devel/Build-Essential
- Clang compiler
- OpenCL bindings
# Ubuntu
sudo apt install build-essential clang
# Archlinux
sudo pacman -S base-devel clang
Commands
make release
Forest Import Snapshot Mode
Before running forest in the normal mode, you must seed the database with the Filecoin state tree from the latest snapshot. To do that, download the latest snapshot provided by Protocol Labs and start forest using the --import-snapshot flag. After the snapshot has been successfully imported, you can start forest without the --import-snapshot flag.
Commands
Download the latest snapshot provided by Protocol Labs:
curl -sI https://fil-chain-snapshots-fallback.s3.amazonaws.com/mainnet/minimal_finality_stateroots_latest.car | perl -ne '/x-amz-website-redirect-location:\s(.+)\.car/ && print "$1.sha256sum\n$1.car"' | xargs wget
If desired, you can check the checksum using the instructions here.
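The checksum check follows the standard sha256sum workflow. Here is a self-contained sketch using a stand-in file (the file name and contents are illustrative; in a real run, the .sha256sum file is downloaded next to the .car by the command above):

```shell
# Work in a scratch directory so no real files are touched.
cd "$(mktemp -d)"

# Stand-in for the downloaded snapshot; a real run would already have
# snapshot.car and its .sha256sum from the mirror.
echo "stand-in snapshot data" > snapshot.car
sha256sum snapshot.car > snapshot.sha256sum

# Verify the file against the recorded digest; prints: snapshot.car: OK
sha256sum --check snapshot.sha256sum
```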
Import the snapshot using forest:
forest --target-peer-count 50 --encrypt-keystore false --import-snapshot /path/to/snapshot/file
Forest Synchronization Mode
Commands
Mainnet
Start the forest node:
forest --target-peer-count 50 --encrypt-keystore false
CLI
The Forest CLI allows for operations to interact with a Filecoin node and the blockchain.
Environment Variables
For nodes running on a non-default port, or when interacting with a node remotely, you will need to provide the multiaddress information for the node. Either set the environment variable FULLNODE_API_INFO, or prepend it to the command, like so:
FULLNODE_API_INFO="..." forest wallet new -s bls
On Linux, you can set the environment variable with the following syntax:
export FULLNODE_API_INFO="..."
Setting your API info this way will limit the value to your current session. Look online for ways to persist this variable if desired.
The syntax for the FULLNODE_API_INFO variable is as follows:
<admin_token>:/ip4/<ip of host>/tcp/<port>/http
This will use IPv4, TCP, and HTTP when communicating with the RPC API. The admin token can be found when starting the Forest daemon. This will be needed to create tokens with certain permissions such as read, write, sign, or admin.
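As a sketch, the variable can be assembled from its parts like so (the token, IP, and port below are placeholders, not real values):

```shell
# Placeholder values; substitute the admin token printed at daemon startup,
# plus the host IP and RPC port of your node.
ADMIN_TOKEN="example-admin-token"
HOST_IP="127.0.0.1"
PORT="1234"

export FULLNODE_API_INFO="${ADMIN_TOKEN}:/ip4/${HOST_IP}/tcp/${PORT}/http"
echo "$FULLNODE_API_INFO"
# example-admin-token:/ip4/127.0.0.1/tcp/1234/http
```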
Token flag
For nodes running on the default port, and when interacting locally, the admin token can also be set using the --token flag:
forest-cli --token eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJBbGxvdyI6WyJyZWFkIiwid3JpdGUiLCJzaWduIiwiYWRtaW4iXSwiZXhwIjoxNjczMjEwMTkzfQ.xxhmqtG9O3XNTIrOEB2_TWnVkq0JkqzRdw63BdosV0c <subcommand>
Sending Filecoin tokens from your wallet
For sending Filecoin tokens, the Forest daemon must be running. You can do so by running:
forest --chain calibnet
Next, send Filecoin tokens to a wallet address:
forest-cli --token <admin_token> send <wallet-address> <amount in attoFIL>
where 1 attoFIL = $10^{-18}$ FIL.
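Since amounts on the command line are given in attoFIL, converting a whole-FIL amount means appending 18 zeros. A small helper (illustrative, not part of forest-cli) makes this less error-prone:

```shell
# Convert a whole number of FIL to attoFIL by appending 18 zeros
# (1 FIL = 10^18 attoFIL). String-based, so it avoids 64-bit overflow.
fil_to_attofil() {
  printf '%s%018d\n' "$1" 0
}

fil_to_attofil 1    # 1000000000000000000
fil_to_attofil 10   # 10000000000000000000
```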
Wallet
Filecoin wallets are stored under the Forest data directory (e.g., ~/.local/share/forest on Linux) in a keystore file.
All wallet commands require write permissions and an admin token (--token) to interact with the keystore. The admin token can be retrieved from the forest startup logs or by running it with --save-token <PATH>.
- Balance: Retrieve the FIL balance of a given address. Usage: forest-cli wallet balance <address>
- Default: Get the default, persisted address from the keystore. Usage: forest-cli wallet default
- Has: Check whether an address exists in the keystore; prints true or false. Usage: forest-cli wallet has <address>
- List: Display the keys in the keystore. Usage: forest-cli wallet list
- New: Create a new wallet. The signature type can be either secp256k1 or bls; defaults to bls. Usage: forest-cli wallet new <bls/secp256k1>
- Set-default: Set an address to be the default address of the keystore. Usage: forest-cli wallet set-default <address>
- Import: Import a private key to the keystore and create a new address. The default format for importing keys is hex-encoded JSON. Use the export command to get formatted keys for importing. Usage: forest-cli wallet import <hex encoded json key>
- Export: Export a key by address. Returns a formatted key that can be imported on another node or into a new keystore. Usage: forest-cli wallet export <address>
- Sign: Use an address to sign a vector of bytes. Usage: forest-cli wallet sign -m <hex message> -a <address>
- Verify: Verify a message's integrity with an address and signature. Usage: forest-cli wallet verify -m <hex message> -a <address> -s <signature>
Chain-Sync
The chain-sync CLI can mark blocks to never be synced, provide information about the state of the syncing process, and check blocks that will never be synced (and for what reason).
- Wait: Wait for the sync process to complete. Usage: forest-cli sync wait. Permissions: Read
- Status: Check the current state of the syncing process, displaying some information. Usage: forest-cli sync status. Permissions: Read
- Check Bad: Check whether a block has been marked bad, identifying the block by CID. Usage: forest-cli sync check-bad -c <block cid>. Permissions: Read
- Mark Bad: Mark a block as bad; the syncer will never sync this block. Usage: forest-cli sync mark-bad -c <block cid>. Permissions: Admin
Configuration
The forest process has a set of configurable values which determine the behavior of the node. All values can be set through process flags or through a configuration file. If a value is provided both through a flag and through the configuration file, the flag value takes precedence.
Flags
When starting forest, you can configure the behavior of the process through the use of the following flags:
Flag | Value | Description |
---|---|---|
--config | OS File Path | Path to TOML file containing configuration |
--genesis | OS File Path | CAR file with genesis state |
--rpc | Boolean | Toggles the RPC API on |
--port | Integer | Port for JSON-RPC communication |
--token | String | Client JWT token to use for JSON-RPC authentication |
--metrics-port | Integer | Port used for metrics collection server |
--kademlia | Boolean | Determines whether Kademlia is allowed |
--mdns | Boolean | Determines whether MDNS is allowed |
--import-snapshot | OS File Path | Path to snapshot CAR file |
--import-chain | OS File Path | Path to chain CAR file |
--skip-load | Boolean | Skips loading CAR File and uses header to index chain |
--req-window | Integer | Sets the number of tipsets requested over chain exchange |
--tipset-sample-size | Integer | Number of tipsets to include in the sample which determines the network head during synchronization |
--target-peer-count | Integer | Amount of peers the node should maintain a connection with |
--encrypt-keystore | Boolean | Controls whether the keystore is encrypted |
Configuration File
Alternatively, when starting forest you can define a TOML configuration file and provide it to the process with the --config flag or through the FOREST_CONFIG_PATH environment variable.
The following is a sample configuration file:
genesis = "/path/to/genesis/file"
rpc = true
port = 1234
token = "0394j3094jg0394jg34g"
metrics-port = 2345
kademlia = true
mdns = true
import-snapshot = "/path/to/snapshot/file"
import-chain = "/path/to/chain/file"
skip-load = false
req-window = 100
tipset-sample-size = 10
target-peer-count = 100
encrypt-keystore = false
Forest in Docker🌲❤️🐋
Prerequisites
- Docker engine installed and running. Forest containers are confirmed to run on
the following engines:
- Docker Engine (Community) on Linux,
- Docker for macOS
- Podman on WSL
Native images are available for the following platforms:
linux/arm64
linux/amd64
The images work out of the box both on Intel processors and on macOS with M1/M2.
Tags
For the list of all available tags please refer to the Forest packages.
Currently, the following tags are produced:
- latest: the latest stable release,
- edge: the latest development build of the main branch,
- date-digest (e.g., 2023-02-17-5f27a62): all builds that landed on the main branch,
- release tags, available from v0.7.0 onwards.
Security recommendations
- We strongly recommend running the Docker daemon in rootless mode (installation instructions), or running the daemonless Docker alternative podman (installation instructions) with a non-root user; in the latter case, set alias docker=podman (or manually replace the docker commands with podman in the instructions below).
Performance recommendations
- We recommend lowering the swappiness kernel parameter on Linux to 1-10 for a long-running forest node: sudo sysctl -w vm.swappiness=[n].
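For example (reading the current value needs no privileges; the write does, so it is shown commented out, with 10 as an arbitrary value in the recommended range):

```shell
# Show the swappiness currently in effect.
cat /proc/sys/vm/swappiness

# Lower it for the running system (requires root); to persist across
# reboots, add the setting to a file under /etc/sysctl.d/.
# sudo sysctl -w vm.swappiness=10
```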
Usage
List available flags and/or commands
# daemon
❯ docker run --init -it --rm ghcr.io/chainsafe/forest:latest --help
# cli
❯ docker run --init -it --rm --entrypoint forest-cli ghcr.io/chainsafe/forest:latest --help
Create a Forest node running on the calibration network, then list all connected peers.
❯ docker run --init -it --rm --name forest ghcr.io/chainsafe/forest:latest --chain calibnet --auto-download-snapshot
Then, in another terminal (sample output):
❯ docker exec -it forest forest-cli net peers
12D3KooWAh4qiT3ZRZgctVJ8AWwRva9AncjMRVBSkFwNjTx3EpEr, [/ip4/10.0.2.215/tcp/1347, /ip4/52.12.185.166/tcp/1347]
12D3KooWMY4VdMsdbFwkHv9HxX2jZsUdCcWFX5F5VGzBPZkdxyVr, [/ip4/162.219.87.149/tcp/30141, /ip4/162.219.87.149/tcp/30141/p2p/12D3KooWMY4VdMsdbFwkHv9HxX2jZsUdCcWFX5F5VGzBPZkdxyVr]
12D3KooWFWUqE9jgXvcKHWieYs9nhyp6NF4ftwLGAHm4sCv73jjK, [/dns4/bootstrap-3.calibration.fildev.network/tcp/1347]
Use a shared volume to persist data across different Forest images
Create the volume
docker volume create forest-data
Now, whenever you create a new Forest container, attach the volume to where the data is stored: /home/forest/.local/share/forest.
❯ docker run --init -it --rm \
--ulimit nofile=8192 \
--volume forest-data:/home/forest/.local/share/forest \
--name forest ghcr.io/chainsafe/forest:latest --chain calibnet \
--auto-download-snapshot
Export the calibnet snapshot to the host machine
Assuming you have a forest container already running, run:
❯ docker exec -it forest forest-cli --chain calibnet snapshot export
Export completed. Snapshot located at forest_snapshot_calibnet_2023-02-17_height_308891.car
Copy the snapshot to the host
❯ docker cp forest:/home/forest/forest_snapshot_calibnet_2023-02-17_height_308891.car .
Create and fund a wallet, then send some FIL on calibration network
Assuming you have a forest container already running, you need to find the JWT token in the logs:
❯ docker logs forest | grep "Admin token"
Export it to an environment variable for convenience (sample; use the token you obtained in the previous step):
export JWT_TOKEN=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJBbGxvdyI6WyJyZWFkIiwid3JpdGUiLCJzaWduIiwiYWRtaW4iXSwiZXhwIjoxNjgxODIxMTc4fQ.3toXEeiGcHT01pUjQeqMyW2kZmQpqpE4Gi4vOHjX4rE
Create the wallet
❯ docker exec -it forest forest-cli --chain calibnet --token $JWT_TOKEN wallet new
t1uvqpa2jgic7fhhko3w4wf3kxj36qslvqrk2ln5i
Fund it using the faucet. You can verify it was funded after a few minutes in Filscan by pasting the Message ID obtained from the faucet. Example from this wallet.
Verify that your account has 100 FIL. The result is in attoFIL.
❯ docker exec -it forest forest-cli --chain calibnet --token $JWT_TOKEN wallet balance t1uvqpa2jgic7fhhko3w4wf3kxj36qslvqrk2ln5i
100000000000000000000
Create another wallet
❯ docker exec -it forest forest-cli --chain calibnet --token $JWT_TOKEN wallet new
t1wa7lgs7b3p5a26abkgpxwjpw67tx4fbsryg6tca
Send 10 FIL from the original wallet to the new one
❯ docker exec -it forest forest-cli --chain calibnet --token $JWT_TOKEN send --from t1uvqpa2jgic7fhhko3w4wf3kxj36qslvqrk2ln5i t1wa7lgs7b3p5a26abkgpxwjpw67tx4fbsryg6tca 10000000000000000000
Verify the balance of the new address. Sample transaction for this wallet.
❯ docker exec -it forest forest-cli --chain calibnet --token $JWT_TOKEN wallet balance t1wa7lgs7b3p5a26abkgpxwjpw67tx4fbsryg6tca
10000000000000000000
Forest JavaScript Console
The Forest console is an alternative to tools like curl or other forest-cli subcommands for interacting with the Filecoin JSON-RPC API.
Starting the console
forest-cli attach can be used to open a JavaScript console connected to your Forest node.
As with some other forest-cli subcommands, you will need to pass an admin --token depending on which endpoints you call.
For a description of different options please refer to the developer documentation CLI page.
Interactive Use
First start Forest node inside another terminal:
forest --chain calibnet
To attach to your Forest node, run forest-cli with the attach subcommand:
forest-cli --token <TOKEN> attach
You should now see a prompt and be able to interact:
Welcome to the Forest Javascript console!
To exit, press ctrl-d or type :quit
> console.log("Forest running on " + chainGetName())
Forest running on calibnet
You can directly call JSON-RPC API endpoints that are bound to the console. For example, Filecoin.ChainGetName is bound to the global chainGetName function.
Tips
- The console history is saved in ~/.forest_history after exiting.
- Use :clear to erase the current session's commands.
- Use _BOA_VERSION to get the engine version.
Non-interactive Use
Exec Mode
It is also possible to execute commands non-interactively by passing the --exec flag and a JavaScript snippet to forest-cli attach. The result is displayed directly in the terminal rather than in the interactive console.
For example, to display the current epoch:
forest-cli attach --exec "syncStatus().ActiveSyncs[0].Epoch"
Or print wallet default address:
forest-cli attach --exec "console.log(walletDefaultAddress())"
Builtins
Helpers
The Forest console comes with a number of helper functions that make interacting with the Filecoin API easy:
showPeers()
getPeer(peerID)
disconnectPeers(count)
isPeerConnected(peerID)
showWallet()
showSyncStatus()
sendFIL(to, amount) (the default amount unit is FIL)
Timers
In addition to supporting part of the JavaScript language, the console also provides a sleep(seconds) timer and a tipset-based timer, sleepTipsets(epochs), which sleeps until the number of new tipsets added is equal to or greater than epochs.
Modules
CommonJS modules are the way to package JavaScript code for the Forest console. You can import modules using the require function:
forest-cli attach --exec "const Math = require('calc'); console.log(Math.add(39,3))"
where calc.js is:
module.exports = {
add: function (a, b) {
return a + b;
},
multiply: function (a, b) {
return a * b;
},
};
By default, modules will be loaded from the current directory. Use the --jspath flag to indicate another path.
Limitations
Forest's console is built using the Boa JavaScript engine. It does support promises and async functions, but keep in mind that Boa is not yet fully compatible with ECMAScript.
Not every endpoint from the Filecoin API has been bound to the console. Please create an issue if you need one that is not available.
Troubleshooting
Common Issues
File Descriptor Limits
By default, Forest will use large database files (roughly 1GiB each). Lowering the size of these files lets RocksDB use less memory but runs the risk of hitting the open-files limit. If you do hit this limit, either increase the file size or use ulimit to increase the open-files limit.
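A quick way to inspect the limits in play before tuning (the 8192 shown is an arbitrary example value):

```shell
# Current soft limit on open files, and the hard ceiling it can be raised to.
ulimit -Sn
ulimit -Hn

# Raise the soft limit for this shell session before starting forest
# (must not exceed the hard limit):
# ulimit -n 8192
```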
Algorand
Algorand is a proof-of-stake blockchain cryptocurrency protocol.
BLAKE2b
BLAKE2 is a cryptographic hash function based on BLAKE. The design goal was to replace the widely used, but broken, MD5 and SHA-1 algorithms in applications requiring high performance in software.
BLS
BLS stands for Boneh–Lynn–Shacham cryptographic signature scheme, which is a cryptographic signature scheme which allows a user to verify that a signer is authentic.
CBOR
CBOR stands for the Concise Binary Object Representation, which is a data format whose design goals include the possibility of extremely small code size, fairly small message size, and extensibility without the need for version negotiation.
CID
CID is short for Content Identifier, a self-describing content address used throughout the IPFS ecosystem. CIDs are used in Filecoin to identify files submitted to the decentralized storage network. For more detailed information, see the GitHub documentation for it.
IPLD
IPLD stands for InterPlanetary Linked Data, which is a series of standards and formats for describing data in a content-addressing-emphatic way.
JWT
JWT stands for JSON Web Token, which is a proposed Internet standard for creating data with optional signature and/or optional encryption whose payload holds JSON that asserts some number of claims. The tokens are signed either using a private secret or a public/private key.
SECP
SECP refers to the types of elliptic curves used for ECDSA; see here.
multisig
A multi-signature (multisig) wallet refers to a wallet that requires multiple keys to authorize FIL transactions.
Tipset
A tipset is a structure that contains a non-empty collection of blocks that have distinct miners and all specify identical epoch, parents, weight, height, state root, and receipt root.
Tipsetkey
A set of CIDs forming a unique key for a tipset.
mempool
mempool stands for the Message Pool, which is the component of forest that handles pending messages for inclusion in the chain. Messages are added either directly for locally published messages or through pubsub propagation.
Merkle
A Merkle tree is a tree in which every leaf node is labelled with the cryptographic hash of a data block, and every node that is not a leaf (called a branch, inner node, or inode) is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure.
IPFS
IPFS stands for InterPlanetary File System which is a peer-to-peer hypermedia protocol to make the web faster, safer, and more open.
Proof of Spacetime (PoSt)
PoSt stands for Proof-of-Spacetime, a procedure by which a storage miner can prove to the Filecoin network that they have stored and continue to store a unique copy of some data on behalf of the network for a period of time.
HAMT
HAMT stands for Hash array mapped trie, which is an implementation of an associative array that combines the characteristics of a hash table and an array mapped trie.
VRF
VRF stands for a Verifiable Random Function that receives a Secret Key (SK) and a seed and outputs proof of correctness and output value. VRFs must yield a proof of correctness and a unique & efficiently verifiable output.
MDNS
MDNS stands for Multicast DNS, a protocol that resolves hostnames to IP addresses within small networks that do not include a local name server.
Kademlia
Kademlia is a distributed hash table for decentralized peer-to-peer computer networks.
LibP2P
LibP2P is a modular system of protocols, specifications and libraries that enable the development of peer-to-peer network applications.
Developer documentation
In this section you will find resources targeted for Forest developers.
The application architecture of forest largely mirrors that of lotus:
- There is a core StateManager, which accepts:
For more information, see the lotus documentation, including, where relevant, the filecoin specification.
(These also serve as a good introduction to the general domain, assuming a basic familiarity with blockchains.)
Contributing to Forest
Submitting Code
Please use make lint to ensure code is properly formatted, license headers are present, and to run the linter.
Documentation
Please use the following guidelines while documenting code. Except for inline comments, these should all be doc comments (///).
Methods/Functions
- At least a brief description
Structs
- At least a brief description
Traits
- At least a brief description
Enums
- At least a brief overall description.
- All variants should have a brief description.
Inline Comments
- Any complicated logic should include inline comments (//)
Testing for Mainnet Compatibility
Forest development can be like hitting a moving target and sometimes Forest falls behind the network. This document should serve as a way to easily identify if Forest can sync all the way up to the network head using a simple step-by-step process.
Prerequisites
Some command-line tools and software are required to follow this guide.
- A fresh copy of the Forest repository that has been built
- Lotus installed
- curl (to download snapshots)
- sha256sum (optional, used to verify snapshot integrity)
Grab a snapshot and run Forest
Refer to the mdbook documentation on how to download a snapshot and run forest.
Warning: Filecoin snapshots as of this writing are over 75GB. Verify you have enough space on your system to accommodate these large files.
- Use make mdbook in Forest's root directory
- Open http://localhost:3000
- Navigate to 2. Basic Usage in the menu on the right
- Scroll down to Forest Import Snapshot Mode
Let Forest sync
This step may take a while. We want Forest to get as far along in the syncing process as it can get. If it syncs up all the way to the network head, CONGRATS! Forest is up to date and on mainnet. Otherwise, Forest is not on mainnet.
If Forest starts to error and can't get past a block while syncing, make note of which block it is. We can use that block to help debug any potential state mismatches.
Is Forest on the latest network version?
Something easy to check is whether Forest is on the latest Filecoin network version. All released network versions can be seen in the repository here. Navigate the codebase to see mention of the latest network upgrade. If a snapshot fails to sync at a certain epoch, it's entirely possible that the snapshot was an epoch behind when a version upgrade started. Grab a new snapshot by referring to the mdbook documentation.
Debugging State Mismatches
Statediffs can only be printed if we import a snapshot containing the stateroot data from Lotus. This means there will not be a pretty statediff if Forest is already synced to the network when the stateroot mismatch happens. By default, snapshots only contain stateroot data for the previous 2000 epochs. So, if you have a statediff at epoch X, download a snapshot for epoch X+100 and tell Forest to re-validate the snapshot from epoch X.
For more detailed instructions, follow this document
FVM Traces
Within FVM, we can enable tracing to produce execution traces. Given an offending epoch, we can produce them both for Forest and for Lotus to find mismatches.
To confirm: the execution trace format is not uniform across implementations, so it takes a certain amount of elbow grease to find the differences. Lotus is capable of emitting this as JSON for a nice UX.
In case of memory leaks, whether coming from unsafe libraries or just from Forest shamelessly pushing data into some collection, it is useful not to guess where the leak happened but to use proper tooling.
HeapTrack
Installation
Either build it with the instructions provided in the repository or download a ready AppImage, e.g., from here. You may not want to use the heaptrack available in your OS packages as it may be a bit outdated.
Preparation
To get the most out of the tool, you may want to add debug information to the binary, regardless if you are running it in release or debug mode.
[profile.dev]
debug = 2
[profile.release]
debug = 2
Usage
You can grab the trace on your host machine or in a VPS (e.g. Digital Ocean Droplet).
Start tracing with heaptrack <normal forest command>, e.g.:
heaptrack target/release/forest --encrypt-keystore=false --target-peer-count 50 --chain calibnet --import-snapshot forest_snapshot.car
This will push traces to a file, e.g., heaptrack.forest.12345.gz. The longer your process runs, the bigger it will get, so double-check your free space before leaving it overnight.
Now analyze the trace. You can do this after Forest has crashed (e.g., due to OOM) or even during its execution. If you were capturing traces in a Droplet, copy the file to your host, e.g.:
scp chainsafe@123.45.66.77:/home/chainsafe/heaptrack.forest.12345.gz .
Depending on the size of the trace, it may take a while (but there is a nice progress bar so you will know if you can grab a coffee in the meantime).
heaptrack --analyze heaptrack.forest.12345.gz
Summary
Here we can see a memory usage overview. Keep in mind that the leaks reported here are not necessarily true leaks - it's just memory that hasn't yet been freed. A global cache would always show as a leak.
While most of the potential culprits are not necessarily interesting (e.g., alloc::*), because even a String constructor calls them, we immediately see that among the specific ones it's rocksdb that gets into the spotlight.
Bottom-up
A view in which you see low-level methods first. In this view, the first methods are almost always allocator methods, finally unwinding into main.
Caller/callee
All the methods called, along with their allocations; here one can easily navigate between callers and callees, and see the location in code (you can configure heaptrack to take you to that code with Settings/Code Navigation). The most useful tab when you delve into the details.
Top-down
Basically an inverse of Bottom-up view. High-level methods first, then you can drill down.
Flamegraph
A graphical form of Bottom-up and Top-Down (you can switch). Helps with visualizing the heavy allocators.
Consumed
Shows the heap memory consumption over time. Here we can notice some patterns, e.g. what happens with memory during snapshot import, then downloading headers and syncing.
Allocations
Shows total number of allocations over time.
Temporary allocations
Shows the number of temporary allocations over time. Temporary allocation is an allocation followed by its deallocation, i.e. there are no other allocations in-between.
Sizes
This tab will show you the allocation sizes during runtime and their frequency.
If you hover over a bar you will see that, e.g., LZ4_createStream (most likely used by rocksdb) made 5,624,180 allocations, totalling 92.3G, on average 14.4kB per allocation.
Miscellaneous
- Keep in mind that running Forest with heaptrack gives a non-negligible memory and CPU overhead. You may not be able to run mainnet node on a 16G machine even if normally it would be fine.
- Optimizations may play tricks on the developer, e.g., inlining functions so they won't even appear in your trace. If you think a particular method should have been called but for mysterious reasons it does not appear in the analysis, you may want to put #[inline(never)] on top of it. Analyzing a debug build may also be useful, but depending on where the leak happens, it may be too slow.
- There is a lot of noise coming from dependencies and the standard library. It's useful to mentally filter them out a bit and focus on the biggest culprits in Forest methods. The flamegraph and caller/callee views are the most useful for this.
Forest follows a fixed, quarterly release schedule. On the last week of each quarter, a new version is always released. This is supplemented with additional releases for bug fixes and special features. A "release officer" is appointed for each release and they are responsible for either following the checklist or, in case of absence, passing the task to a different team member.
- Update the CHANGELOG.md file to reflect all changes and preferably write a small summary about the most notable updates. The changelog should follow the design philosophy outlined here: https://keepachangelog.com/en/1.0.0/. Go through the output of git log and remember that the audience of the CHANGELOG does not have intimate knowledge of the Forest code-base. All the changed/updated/removed features should be reasonably understandable to an end-user.
- Update the version of the crates that are to be released. Forest contains many crates, so you may need to update many Cargo.toml files. If you're working on a patch release, make sure that there are no breaking changes. Cherry-picking of patches may be necessary.
- Run the manual tests steps outlined in the TEST_PLAN.md. Caveat: Right now there are no manual test steps so this step can be skipped.
- Once the changes in step 1 and step 2 have been merged, tag the commit with the new version number. The version tag should start with a lowercase 'v'. Example: v0.4.1
- (NOT YET APPLICABLE) Publish new crates on crates.io.
- Go to https://github.com/ChainSafe/forest/releases/new and create a new release. Use the tag created in step 4, follow the title convention of the previous releases, and write a small summary of the release (similar or identical to the summary in the CHANGELOG.md file).
- Verify that the new release contains assets for both Linux and MacOS (the assets are automatically generated and should show up after 30 minutes to an hour).
- Verify that the new release is available in the GitHub Container Registry. Use docker pull ghcr.io/chainsafe/forest:<version> and ensure that it is present in the packages.
- Update the Forest Progress wiki with the changes in the new release. If in doubt about what has been accomplished, is in progress, or what's included in the future plans, ask in the #fil-devs slack channel and tag authors of related PRs.
- Make sure the Cargo.lock change is included in the pull request.
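The tag convention from the checklist (a lowercase 'v' followed by the version, e.g., v0.4.1) can be checked mechanically. The helper below is illustrative and not part of the repository:

```shell
# Accept tags like v0.4.1; reject ones missing the lowercase 'v' prefix.
is_release_tag() {
  case "$1" in
    v[0-9]*.[0-9]*.[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

is_release_tag v0.4.1 && echo "v0.4.1: ok"
is_release_tag 0.4.1  || echo "0.4.1: missing the leading 'v'"
```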
Running Smoke Tests for Forest's RPC API
Prerequisites
The only requirement for running these smoke tests is that Forest is installed and on your system PATH.
Running the Tests
- Use make install to create a binary on your path
- Run make smoke-test
This will execute a blank request to all the endpoints defined and check the HTTP status code of the response. If a response is received, this should be considered a good test, even if an error occurred. No parameters are passed to the API endpoints. An OK will be displayed if a test passes, and a FAIL will be displayed with an HTTP/curl code if a test fails.
Adding Future Endpoints
Endpoints in the script ./scripts/smoke_test.sh are stored in an array identified as RPC_ENDPOINTS.
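The pattern looks roughly like the sketch below; the endpoint names are examples and the echo stands in for the real curl request, so consult ./scripts/smoke_test.sh for the actual list and request shape:

```shell
# Endpoints grouped in one array; the real script issues a blank request
# per entry and checks the HTTP status code of the response.
RPC_ENDPOINTS=(
  "Filecoin.ChainHead"
  "Filecoin.NetPeers"
)

for endpoint in "${RPC_ENDPOINTS[@]}"; do
  echo "OK ${endpoint}"   # stand-in for the curl call and status check
done
```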
Add the endpoint identifier, minus the Forest prefix, to the module that it belongs to (i.e., gas, net, state, etc.), or add a new section if a new API is added.
This should be checked during the review process if new API methods are added to keep this script and test suite up to date.
It's unclear how we can support migrations without adding a lot of code complexity. This document is meant to shed light on the matter and illuminate a sustainable path forward. As a start we will consider a migration going from nv15 to nv16.
Migration path investigation from nv15 to nv16
Findings
- Actor IDs definitely changed
For the following actors, only their CIDs have changed:
- init
- cron
- account
- power
- miner
- paymentchannel
- multisig
- reward
- verifiedregistry
Those are just simple code migrations.
For the system and market actors there are both code and state changes; that's why there is dedicated logic for their migration.
The system actor needs to update the state tree with its new state, which now holds the ManifestData CID.
For the market actor more work is involved to upgrade actor state due to support for UTF-8 string label encoding in deal proposals and pending proposals (see FIP-0027).
- Some gas calculations changed?
I don't think we are concerned by this. Gas metering can change at a given protocol upgrade for one or many actors, but the impact is irrelevant here as it doesn't modify blockchain data structures. Gas calculations should only impact code, and in our case the nv16 ref-fvm already supports the new gas changes.
- drand calculation changed?
Ditto.
- What else changed?
Nothing else as far as I can see.
Open questions
- Pre-migration framework + caching: how much do we need a similar approach in Forest? Are there other alternatives? We can definitely skip this part at first. For reference, the old nv12 state migration in Forest took around 13-15 seconds.
- Seen in Lotus: `UpgradeRefuelHeight`. What's Refuel for?
- Migration logic lives in spec-actors (Go actors); what is the future of this, given that clients have moved to builtin-actors (Rust actors) and ref-fvm? In an ideal world we might want a shared migration logic.
- Implement Lite migration? It should allow for easy upgrades when actor code needs to change but state does not. An example of the function performing all the migration duties is provided above. Check actors_version_checklist.md for the rest of the steps.
- What are non-deferred actors in the context of a migration?
- The `migrationJobResult` struct uses a `states7` actor instead of a `states8` one (in the Go spec-actors). Is this a typo, or are there good reasons?
Changes rough proposal
To support nv15 to nv16 migration we need to:
- Make Forest sync again on nv15 and be able to support multiple network versions.
- Understand the existing Forest migration framework (used in the past for the nv12 migration). Can we reuse most of the code as is?
- Implement the nv16 migration logic (replicating the same logic as in spec-actors).
- Implement unit tests covering this migration.
- Implement a migration schedule that will select the right migration path.
- Test the migration using exported calibnet and mainnet snapshots, measuring the elapsed time and memory usage.
Test snapshots
For testing a calibnet migration two snapshots have been exported with Lotus:
- lotus_snapshot_2022-Aug-5_height_1044460.car
- lotus_snapshot_2022-Aug-5_height_1044659.car
They were exported 200 and 1 epoch(s) before the Skyr upgrade, respectively (the 200-epoch version could be useful if we decide to implement a pre-migration like in Lotus).
For testing a mainnet migration, one snapshot has been retrieved from the Protocol Labs S3 bucket using the lily-shed util:
- minimal_finality_stateroots_1955760_2022-07-05_00-00-00.car
This one is 4560 epochs before the upgrade. If needed, we can extract closer snapshots later.
Those snapshots have been uploaded to our Digital Ocean Spaces.
Additional resources
A summary of what changed between versions is maintained in the tpm repo, e.g. all the changes in NV15 -> NV16.
State migration guide ⏩
This guide is intended to help implement new state migrations in the future. It is based on the current state migration implementation for NV18 and NV19.
State migration requirements
- The proper actor bundle is released for at least the test network. It should be available on the actor bundles repository. You can verify which upgrade needs which bundle in the network upgrade matrix.
- The state migration should already be implemented in the Go library, which is the source of truth for the state migration. We should also carefully analyze the FIPs and implement the migration based on them. In case of doubt, the FIPs are the ultimate source of truth, and we should reach out to the Lotus team if we find potential issues in their implementation.
Development
Import the actor bundle
The first step is to import the actor bundle into Forest. This is done by:
- adding the bundle to the `HeightInfos` struct in the network definition files (e.g., calibnet):
```rust
HeightInfo {
    height: Height::Hygge,
    epoch: 322_354,
    bundle: Some(ActorBundleInfo {
        manifest: Cid::try_from("bafy2bzaced25ta3j6ygs34roprilbtb3f6mxifyfnm7z7ndquaruxzdq3y7lo").unwrap(),
        url: Url::parse("https://github.com/filecoin-project/builtin-actors/releases/download/v10.0.0-rc.1/builtin-actors-calibrationnet.car").unwrap(),
    }),
}
```
- adding the download at the proper height to the `load_bundles` function in the daemon. This step could potentially be automated in the future:
```rust
if epoch < config.chain.epoch(Height::Hygge) {
    bundles.push(get_actors_bundle(config, Height::Hygge).await?);
}
```
Implement the migration
The next step is to implement the migration itself. In this guide, we take the
translate-Go-code-into-Rust approach. It's not the cleanest way to do it, but
it's the easiest. Note that the Forest state migration design is not the same
as the Lotus one (we tend to avoid code duplication), so we must be careful
when translating the code.
Create the migration module
Create the migration module in the state migration crate. A valid approach is to copy-paste the previous migration module and modify it accordingly. The files that will most likely be present:
- `mod.rs`: bundles the migration modules and exports the final migration function,
- `system.rs`: defines the system actor migration logic, which (so far) does not seem to change between upgrades,
- `migration.rs`: the heart of the migration. Here we add the migration logic for each actor. Its Go equivalent is top.go in the case of NV18,
- `verifier.rs`: sanity checks for the migration definition.
We will most likely need as many custom migrators as there are in the Go implementation. In other words, if the Go migration contains:
- `eam.go` - Ethereum Account Manager migration,
- `init.go` - Init actor migration,
- `system.go` - System actor migration,

then our implementation will need to define those as well.
The actual migration
This part will largely depend on the complexity of the network upgrade itself.
The goal is to translate the `MigrateStateTree` method from Go to the
`add_nvXX_migrations` method in the `migration.rs` file. The
`add_nvXX_migrations` method is responsible for adding all the migrations
needed for the network upgrade, plus the logic in between. Note that the
Forest version is much simpler, as it doesn't contain the migration engine
(which is implemented in the base module).
The first thing to do is to get the current system actor state and the current manifest. Then we will map the old actor codes to the new ones.
```rust
let state_tree = StateTree::new_from_root(store.clone(), state)?;
let system_actor = state_tree
    .get_actor(&Address::new_id(0))?
    .ok_or_else(|| anyhow!("system actor not found"))?;

let system_actor_state = store
    .get_obj::<SystemStateV10>(&system_actor.state)?
    .ok_or_else(|| anyhow!("system actor state not found"))?;

let current_manifest_data = system_actor_state.builtin_actors;
let current_manifest = Manifest::load(&store, &current_manifest_data, 1)?;

let (version, new_manifest_data): (u32, Cid) = store
    .get_cbor(new_manifest)?
    .ok_or_else(|| anyhow!("new manifest not found"))?;
let new_manifest = Manifest::load(&store, &new_manifest_data, version)?;
```
⚠️ Stay vigilant! The `StateTree` versioning is independent of the network and
actor versioning. At the time of writing, the following holds:
- `StateTreeVersion0` - Actors version < v2
- `StateTreeVersion1` - Actors version v2
- `StateTreeVersion2` - Actors version v3
- `StateTreeVersion3` - Actors version v4
- `StateTreeVersion4` - Actors version v5 up to v9
- `StateTreeVersion5` - Actors version v10 and above

These versions are not compatible with each other, and when using a new FVM, we can only use the latest one.
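For reference, the mapping above can be written down as a small lookup (the enum here is an illustrative stand-in for the `fvm_shared` type, not the real one):

```rust
// Sketch: map an actors version onto the StateTree version it requires,
// following the table above.
#[derive(Debug, PartialEq)]
enum StateTreeVersion { V0, V1, V2, V3, V4, V5 }

fn state_tree_version(actors_version: u32) -> StateTreeVersion {
    match actors_version {
        0 | 1 => StateTreeVersion::V0, // actors < v2
        2 => StateTreeVersion::V1,
        3 => StateTreeVersion::V2,
        4 => StateTreeVersion::V3,
        5..=9 => StateTreeVersion::V4, // v5 up to v9
        _ => StateTreeVersion::V5,     // v10 and above
    }
}

fn main() {
    assert_eq!(state_tree_version(9), StateTreeVersion::V4);
    assert_eq!(state_tree_version(10), StateTreeVersion::V5);
}
```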
For actors that don't need any state migration, we can use the `nil_migrator`.

```rust
current_manifest.builtin_actor_codes().for_each(|code| {
    let id = current_manifest.id_by_code(code);
    let new_code = new_manifest.code_by_id(id).unwrap();
    self.add_migrator(*code, nil_migrator(*new_code));
});
```

For each actor with non-trivial migration logic, we add the migration function. For example, for the `init` actor, we have:

```rust
self.add_migrator(
    *current_manifest.get_init_code(),
    init::init_migrator(*new_manifest.get_init_code()),
);
```
and we define the `init_migrator` in a separate module. This logic may include
setting defaults on the new fields, changing existing fields to an upgraded
version, and so on.
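A minimal sketch of what such a migrator does, using toy stand-in structs (these are not the real state types, and the `installed_actors` field is hypothetical):

```rust
// Purely illustrative: u64 stands in for CIDs. A migrator carries the old
// fields over unchanged and gives any newly added field a default value.
#[derive(Debug, PartialEq)]
struct InitStateOld { address_map: u64, next_id: u64, network_name: String }

#[derive(Debug, PartialEq)]
struct InitStateNew { address_map: u64, next_id: u64, network_name: String, installed_actors: u64 }

fn init_migrator(old: InitStateOld, empty_installed_actors: u64) -> InitStateNew {
    InitStateNew {
        address_map: old.address_map,             // unchanged fields carried over
        next_id: old.next_id,
        network_name: old.network_name,
        installed_actors: empty_installed_actors, // new field gets a default
    }
}

fn main() {
    let old = InitStateOld { address_map: 7, next_id: 100, network_name: "calibnet".into() };
    let new = init_migrator(old, 0);
    assert_eq!(new.installed_actors, 0);
    assert_eq!(new.next_id, 100);
}
```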
Verifier
An optional (but recommended) piece of code that performs some sanity checks on the state migration definition. At the time of writing, it checks that all builtin actors are assigned a migration function.
```rust
let verifier = Arc::new(Verifier::default());
```
Post-migration actions
Some code, like creating an entirely new actor (in the case of NV18 creating EAM and Ethereum Account actors), needs to be executed post-migration. This is done in the post-migration actions.
```rust
let post_migration_actions = [create_eam_actor, create_eth_account_actor]
    .into_iter()
    .map(|action| Arc::new(action) as PostMigrationAction<DB>)
    .collect();
```
Creating the migration object and running it
We take all the migrations that we have defined previously, all the post-migration actions, and create the migration object.
```rust
let mut migration = StateMigration::<DB>::new(Some(verifier), post_migration_actions);
migration.add_nv18_migrations(blockstore.clone(), state, &new_manifest_cid)?;

let actors_in = StateTree::new_from_root(blockstore.clone(), state)?;
let actors_out = StateTree::new(blockstore.clone(), StateTreeVersion::V5)?;
let new_state =
    migration.migrate_state_tree(blockstore.clone(), epoch, actors_in, actors_out)?;

Ok(new_state)
```
The new state is the result of the migration.
Use the migration
After completing the migration, we need to invoke it at the proper height.
This is done in the `handle_state_migrations` method in the state manager.
This step could potentially be automated in the future.
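The shape of that hook can be sketched as follows (toy types only: `run_nv18_migration` is a placeholder and `u64` stands in for a state-root CID; the real method operates on the state manager):

```rust
// Sketch: when apply_blocks crosses the upgrade epoch, run the migration
// and swap in the new state root; otherwise pass the state through.
type Epoch = i64;
type StateRoot = u64; // stand-in for a CID

fn run_nv18_migration(state: StateRoot) -> StateRoot {
    state + 1 // placeholder for the real migration
}

fn handle_state_migrations(epoch: Epoch, hygge_epoch: Epoch, state: StateRoot) -> StateRoot {
    if epoch == hygge_epoch {
        run_nv18_migration(state)
    } else {
        state
    }
}

fn main() {
    assert_eq!(handle_state_migrations(322_354, 322_354, 10), 11);
    assert_eq!(handle_state_migrations(322_353, 322_354, 10), 10);
}
```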
Testing
We currently lack a framework for properly testing the network upgrades before they actually happen. This should change in the future.
For now, we can test using a snapshot generated after the network upgrade (e.g., 100 epochs after) and validating the previous epochs, which should include the upgrade height.
```shell
forest --chain calibnet --encrypt-keystore false --halt-after-import --height=-200 --import-snapshot <SNAPSHOT>
```
Future considerations
- Testing without the need for a snapshot or a running node. This would allow us to test the network upgrade in a more isolated way. See how it is done in the Go library.
- Grab the actor bundles from IPFS. This would make Forest less dependent on the GitHub infrastructure. Issue #2765
- Consider pre-migrations as Lotus does. It is not needed at the moment (the mainnet upgrade takes several seconds at most) but may become a bottleneck if the migration is too heavy.
State migration spike 🛂
What is state migration?
State migration is a process whereby the `StateTree` contents are transformed
from an older form to a newer form. Certain actors may need to be created or
migrated as well.
Why do we need to migrate?
Migration is required when the structure of the state changes, which happens when new fields are added or existing ones are modified. Migration is not required when only new behaviour is introduced.
In the case of NV18, the `StateTree` changed from version 4 to version 5. See
https://github.com/filecoin-project/ref-fvm/pull/1062
What to upgrade?
We need to upgrade the `StateTree`, which is represented as
`HAMT<Cid, ActorState>`, to the latest version.
On top of that, we need to migrate certain actors. In the case of the NV18 upgrade, these are the init and system actors; the EAM actor needs to be created.
When to upgrade?
There is a separate upgrade schedule for each network. In Lotus, it is defined in upgrades.go; in Venus, in fork.go, which has the same structure.
For the case of NV18, it is defined as:

```go
Height:    build.UpgradeHyggeHeight,
Network:   network.Version18,
Migration: UpgradeActorsV10,
PreMigrations: []stmgr.PreMigration{{
    PreMigration:    PreUpgradeActorsV10,
    StartWithin:     60,
    DontStartWithin: 10,
    StopWithin:      5,
}},
Expensive: true,
```
How to upgrade?
Iterate over the state of each actor at the given epoch and write the new state along with any specific changes to the respective state. This involves iterating over each of the HAMT nodes storing the state and writing them to the database.
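The core loop can be sketched with toy types (`u32` standing in for actor-code CIDs, `String` for serialized actor state; this is not Forest's real API):

```rust
use std::collections::HashMap;

// Sketch of the migration loop: walk every actor in the old tree, look up
// the migrator registered for its code, and write the migrated actor (with
// its new code) into the output tree.
type Migrator = fn(&str) -> String;

fn migrate_tree(
    actors_in: &HashMap<String, (u32, String)>, // address -> (code, state)
    migrators: &HashMap<u32, Migrator>,         // old code -> migration fn
    code_map: &HashMap<u32, u32>,               // old code -> new code
) -> HashMap<String, (u32, String)> {
    actors_in
        .iter()
        .map(|(addr, (code, state))| {
            let migrate = migrators[code];
            (addr.clone(), (code_map[code], migrate(state)))
        })
        .collect()
}

fn main() {
    let mut actors_in = HashMap::new();
    actors_in.insert("t01".to_string(), (1u32, "old-state".to_string()));

    let mut migrators: HashMap<u32, Migrator> = HashMap::new();
    migrators.insert(1, |s: &str| s.replace("old", "new"));

    let mut code_map = HashMap::new();
    code_map.insert(1u32, 2u32);

    let out = migrate_tree(&actors_in, &migrators, &code_map);
    assert_eq!(out["t01"], (2, "new-state".to_string()));
}
```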
See the Lotus upgrade method and the module dedicated to the Actors v10
migration; the core logic is here. The same module is used by Venus.
Fork migrations are handled by the fork.go entities.
Where to upgrade?
It should most likely be done in the apply blocks method.
```go
// handle state forks
// XXX: The state tree
pstate, err = sm.HandleStateForks(ctx, pstate, i, em, ts)
if err != nil {
    return cid.Undef, cid.Undef, xerrors.Errorf("error handling state forks: %w", err)
}
```
In Forest we already have a hint from the past:
```rust
if epoch_i == turbo_height {
    todo!("cannot migrate state when using FVM - see https://github.com/ChainSafe/forest/issues/1454 for updates");
}
```
We can start with something simplistic to get it running; afterwards we can implement a proper schedule with functors.
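Such a schedule with functors could be sketched like this (the epoch and migration are placeholders, and `u64` stands in for a state-root CID):

```rust
use std::collections::HashMap;

// Sketch: map each upgrade epoch to a boxed migration closure; when a
// block at a scheduled epoch is applied, run the migration on the state.
type StateRoot = u64; // stand-in for a CID
type Migration = Box<dyn Fn(StateRoot) -> StateRoot>;

const UPGRADE_EPOCH: i64 = 100; // placeholder epoch, not a real network height

fn schedule() -> HashMap<i64, Migration> {
    let mut s: HashMap<i64, Migration> = HashMap::new();
    s.insert(UPGRADE_EPOCH, Box::new(|state| state + 1)); // placeholder migration
    s
}

fn handle_state_forks(epoch: i64, state: StateRoot) -> StateRoot {
    match schedule().get(&epoch) {
        Some(migrate) => migrate(state),
        None => state,
    }
}

fn main() {
    assert_eq!(handle_state_forks(UPGRADE_EPOCH, 5), 6);
    assert_eq!(handle_state_forks(UPGRADE_EPOCH - 1, 5), 5);
}
```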
Challenges
- Doing the state migration efficiently; we need to traverse every entry in the state trie. Lotus does pre-migrations, which fill relevant caches to speed up the eventual full migration at the upgrade epoch. We might need to do something like this as well; it might not be necessary for the first iteration, depending on how performant the migration process in Forest turns out to be.
- No test network. While we can use existing snapshots from before the upgrade to test state migrations, this is not sustainable if we want to continuously support the calibration network. We either require a local devnet for testing migrations before they actually happen on real networks, or we can try supporting more bleeding-edge networks. The former approach is more solid, but the latter might be easier to implement at first (and would give Forest more testnet support, which is always welcome).
- There may be forks, so we will probably need to keep the pre-migration and post-migration state in two caches for some back-and-forth. In Lotus this is handled with HandleStateForks.
- For the EAM actor we may need some Ethereum methods we have not yet implemented. Perhaps what `builtin-actors` and `ref-fvm` expose will be enough.
Current Forest implementation
For the moment, Forest does not support migrations. The code that was meant for this is currently unused; most probably we will be able to reuse it.
Plan
We should start by adding an `nv18` module to the state migration crate,
along the lines of the Go equivalent. Most likely this would mean adding some
missing structures related to the v10 actors (the Ethereum ones).
Then try to plug it into apply_blocks. This may work for the calibration network. Afterwards, we will most likely need to iterate to achieve acceptable performance for mainnet. Some ideas on how to achieve this can be taken from Lotus/Venus, e.g., pre-migration caching.
Sources
- Rahul's article: https://hackmd.io/@tbdrqGmwSXiPjxgteK3hMg/r1D6cVM_u
- Lotus codebase - https://github.com/filecoin-project/lotus
- Venus codebase - https://github.com/filecoin-project/venus
Forest Test Plan
Version: 1.0
Author: David Himmelstrup
Date updated: 2022-11-14
Test objective:
The Filecoin specification is complex and changes rapidly over time. To manage this complexity, Forest uses a rigorous testing framework, starting with individual functions and ending with complete end-to-end validation. The goals, in descending order of priority, are:
- Regression detection. If Forest can no longer connect to mainnet or if any of its features break, the development team should be automatically notified.
- No institutional/expert knowledge required. Developers can work on a Forest subsystem without worrying about accidentally breaking a different subsystem.
- Bug identification. If something breaks, the test data should narrow down the location of the issue.
Scope of testing definition:
Forest testing is multifaceted and layered. The testing pipeline looks like this:
- Unit tests for library functions. Example: parsing a network version fails for garbled input.
- Unit tests for CLI programs. Example: `forest-cli dump` produces a valid configuration.
- Property tests. Example: `deserialize ∘ serialize = id` for all custom formats.
- Network synchronization. PRs are checked against the calibration network; the main branch is checked against the main network.
- End-to-end feature tests. Example: Network snapshots are generated daily and hosted publicly.
- Link checking. API documentation and markdown files are checked for dead links.
- Spell checking. API documentation is checked for spelling errors and typos.
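The roundtrip property from the list above can be illustrated with a toy codec (the real tests target Forest's custom formats and use the quickcheck crate to generate arbitrary inputs):

```rust
// Sketch of the `deserialize ∘ serialize = id` property for a toy hex
// codec: encoding bytes to hex and decoding them back must be lossless.
fn serialize(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

fn deserialize(s: &str) -> Vec<u8> {
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).unwrap())
        .collect()
}

fn roundtrip_holds(input: &[u8]) -> bool {
    deserialize(&serialize(input)) == input
}

fn main() {
    assert!(roundtrip_holds(b""));
    assert!(roundtrip_holds(b"forest"));
    assert!(roundtrip_holds(&[0x00, 0xff, 0x10]));
}
```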
All testing is automated and there are no additional manual checks required for releases.
Resources / Roles & Responsibilities:
Testing is a team effort and everyone is expected to add unit tests, property tests, or integration tests as part of their PR contributions.
Tools description:
- Bug tracker: https://github.com/ChainSafe/forest/issues
- Test Automation tools: nextest, quickcheck
- Languages: Rust
- CI/CD: GitHub Actions
- Version control: Git
Deliverables:
The only deliverable is a green checkmark. Either all tests pass and a PR may be merged into the main branch or something is not up to spec and the PR is blocked.
Test Environment & CI
Short-running tests are executed via GitHub Actions on Linux and macOS. Long-running tests are run on dedicated testing servers.
The services on the dedicated servers are described here: https://github.com/ChainSafe/forest-iac
In short, the long-running tests are executed in dockerized environments, with some running once per day and some running on every commit to the main Forest branch. At the moment, the tests run on DigitalOcean, but they can be run from anywhere. Feedback is reported to ChainSafe's Slack server and artifacts are uploaded to DigitalOcean Spaces.
Test Data:
No private or confidential data is involved in testing. Everything is public.
Bug template:
Bug report template is available on GitHub: https://github.com/ChainSafe/forest/blob/main/.github/ISSUE_TEMPLATE/bug_report.md
The template is applied automatically when bugs are reported through GitHub.
Risk & Issues:
- We depend on the calibration network for testing. If this network is down, our testing capabilities are degraded.
- We depend on GitHub Actions for testing. If GitHub Actions is unavailable, testing will be degraded.
- Testing against mainnet is effective for discovering issues, but not great for identifying root causes. Finding bugs before syncing to mainnet is always preferable.
Cleaning
```shell
rm -rf ~/.genesis-sectors/ ~/.lotus-local-net/ ~/.lotus-miner-local-net/
```
Running the node:
```shell
export LOTUS_PATH=~/.lotus-local-net
export LOTUS_MINER_PATH=~/.lotus-miner-local-net
export LOTUS_SKIP_GENESIS_CHECK=_yes_
export CGO_CFLAGS_ALLOW="-D__BLST_PORTABLE__"
export CGO_CFLAGS="-D__BLST_PORTABLE__"
make 2k
./lotus fetch-params 2048
./lotus-seed pre-seal --sector-size 2KiB --num-sectors 2
./lotus-seed genesis new localnet.json
./lotus-seed genesis add-miner localnet.json ~/.genesis-sectors/pre-seal-t01000.json
./lotus daemon --lotus-make-genesis=devgen.car --genesis-template=localnet.json --bootstrap=false
# Keep this terminal open
```
Running the miner:
```shell
export LOTUS_PATH=~/.lotus-local-net
export LOTUS_MINER_PATH=~/.lotus-miner-local-net
export LOTUS_SKIP_GENESIS_CHECK=_yes_
export CGO_CFLAGS_ALLOW="-D__BLST_PORTABLE__"
export CGO_CFLAGS="-D__BLST_PORTABLE__"
./lotus wallet import --as-default ~/.genesis-sectors/pre-seal-t01000.key
./lotus-miner init --genesis-miner --actor=t01000 --sector-size=2KiB --pre-sealed-sectors=~/.genesis-sectors --pre-sealed-metadata=~/.genesis-sectors/pre-seal-t01000.json --nosync
./lotus-miner run --nosync
# Keep this terminal open
```
Helpers:
```shell
./lotus-miner info
./lotus-miner sectors list
```
Send data to miner:
```shell
./lotus client query-ask t01000
./lotus client import LICENSE-APACHE
./lotus client deal
./lotus client retrieve [CID from import] test.txt # data has to be on chain first
```
Get data on chain:
```shell
./lotus-miner storage-deals pending-publish --publish-now
./lotus-miner sectors seal 2
./lotus-miner sectors batching precommit --publish-now
./lotus-miner sectors batching commit --publish-now
```
Retrieve:
```shell
./lotus client local
./lotus client retrieve --provider t01000 [CID from import] outputfile.txt
```