El architecture update - reth #242

Merged
merged 39 commits into from
May 1, 2024
Commits
8255738
RLP wiki page - first commit
thogiti Apr 28, 2024
441c9c1
data serialization section is added
thogiti Apr 28, 2024
accb498
need for RLP in Ethereum section is addecd
thogiti Apr 28, 2024
97cc262
Fix overview image
SiddharthV1 Apr 28, 2024
56abb2e
RL encoding/decoding sections are added
thogiti Apr 29, 2024
8bc7f0b
RLP tools and resources sections are added.
thogiti Apr 29, 2024
405d9d1
fixed the sidebar conflicts
thogiti Apr 29, 2024
9a1c992
fixed the sidebar conflicts
thogiti Apr 29, 2024
640266b
fixed the sidebar conflicts
thogiti Apr 29, 2024
0e1b1a5
fixed the sidebar conflicts
thogiti Apr 29, 2024
71c7366
fixed the sidebar conflicts
thogiti Apr 29, 2024
cdf2dd4
fixed the sidebar conflicts
thogiti Apr 29, 2024
54a8009
Merge remote-tracking branch 'upstream/el-architecture' into el-archi…
thogiti Apr 29, 2024
84325ab
RLP links to sidebar is added
thogiti Apr 29, 2024
14ddf02
Merge pull request #229 from thogiti/el-architecture
thogiti Apr 29, 2024
e3016a5
typos for wordlist is fixed
thogiti Apr 29, 2024
e6be715
Merge pull request #233 from thogiti/el-architecture
thogiti Apr 29, 2024
ee87d71
typos for wordlist is fixed
thogiti Apr 29, 2024
08e371b
Merge pull request #234 from thogiti/el-architecture
thogiti Apr 29, 2024
f2af327
Update docs/wiki/EL/RLP.md
thogiti Apr 29, 2024
14e6373
Update docs/wiki/EL/RLP.md
thogiti Apr 29, 2024
bfd0299
Updated after the review
thogiti Apr 29, 2024
9b89818
Merge branch 'el-architecture' into el-architecture
thogiti Apr 29, 2024
d6387a4
Merge pull request #235 from thogiti/el-architecture
thogiti Apr 29, 2024
6d8749e
resolve conflict
thogiti Apr 29, 2024
41c0970
Link RLP page in architecture doc
SiddharthV1 Apr 29, 2024
5fc0944
Cross-links & move RLP heading under DataStructure
SiddharthV1 Apr 29, 2024
bf37786
additions to the intro
taxmeifyoucan Apr 29, 2024
45233cd
resolve conflicts
thogiti Apr 29, 2024
dd63838
Merge pull request #238 from thogiti/el-architecture
thogiti Apr 29, 2024
d8a3fba
Reth : added codecs, DB abstractions, cursor
SiddharthV1 Apr 30, 2024
8ca7fec
..
SiddharthV1 Apr 30, 2024
52c94b3
Typo & Phrasing improvement
SiddharthV1 Apr 30, 2024
a2a7970
Add Reth's tables section from week 7
SiddharthV1 May 1, 2024
b31c39b
Merge branch 'main' into el-architecture
SiddharthV1 May 1, 2024
b22a7ff
Typos and wordlist
SiddharthV1 May 1, 2024
b8b803b
Typos and wordlist
SiddharthV1 May 1, 2024
1cd0d9a
Merge branch 'main' into el-architecture
taxmeifyoucan May 1, 2024
0620e06
Update wordlist
taxmeifyoucan May 1, 2024
49 changes: 48 additions & 1 deletion docs/wiki/EL/clients/reth.md
@@ -24,8 +24,55 @@ The image represents a rough component flow of Reth's architecture:
- **BlockchainTree**: When we are nearing the end of the chain during the syncing process, we transition to the blockchain tree. The synchronization occurs close to the tip, when state root validation and execution take place in memory.
- **Database**: When a block gets canonicalized, it is moved to the database
- **Provider**: An abstraction over database that provides utility functions to help us avoid directly accessing the keys and values of the underlying database.
- **Downloader**: Retrieves blocks and headers using peer-to-peer (P2P) networks. This tool is utilized by the pipeline during its initial two stages and by the engine in the event that it needs to bridge the gap at the tip.
- **P2P**: When we approach the tip, we transfer the transactions we have read over P2P to the transaction pool.
- **Transaction Pool**: Includes DDoS mitigation measures. Consists of transactions arranged in ascending order based on the gas price preferred by the users.
- **Payload Builder**: Extracts the initial n transactions in order to construct a fresh payload.
- **Pruner**: Allows us to have a full node. Once the block has been canonicalized by the blockchain tree, we must wait for an additional 64 blocks for it to reach finalization. Once the finalization process is complete, we can be certain that the block will not undergo reorganization. Therefore, if we are operating a full node, we have the option to eliminate the old block using the pruner.
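
The pruner's waiting rule described in the list above can be sketched as a small helper. This is only an illustration that takes the 64-block figure from the text at face value; `FINALIZATION_DEPTH` and `highest_prunable_block` are hypothetical names and not part of Reth's API.

```rust
/// Number of blocks the text says to wait after canonicalization
/// before a block is treated as finalized (assumption taken from the text).
const FINALIZATION_DEPTH: u64 = 64;

/// Highest block number that may be pruned for the given canonical tip.
/// Returns `None` while the chain is still shorter than the waiting window.
fn highest_prunable_block(canonical_tip: u64) -> Option<u64> {
    // `checked_sub` avoids underflow when the tip is below 64.
    canonical_tip.checked_sub(FINALIZATION_DEPTH)
}
```

With a tip at block 1000, `highest_prunable_block(1000)` yields `Some(936)`, so a full node could drop historical data up to and including block 936 under this assumption.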

## Storage

Reth primarily utilizes the mdbx database. In addition, it offers several abstractions on top of the underlying database that provide data transformation, compression, iteration, writing, and querying functionality. These abstractions are designed so that Reth can replace its underlying database, mdbx, with minimal changes to the rest of the storage code.

**Codecs**

This [crate](https://github.com/paradigmxyz/reth/tree/main/crates/storage/codecs) enables the creation of diverse codecs for various purposes. The primary codec used here is the [Compact trait](https://github.com/paradigmxyz/reth/blob/6d7cd53ad25f0b79c89fd60a4db2a0f2fe097efe/crates/storage/codecs/src/lib.rs#L43), which compresses data: unsigned integers are stored without their leading zeros, and the same approach is applied to structures such as access lists and headers.
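
To make the leading-zero idea concrete, here is a minimal sketch for a `u64`. It is not Reth's `Compact` implementation; `encode_compact_u64` and `decode_compact_u64` are hypothetical helpers, and the sketch assumes the caller stores the returned length somewhere alongside the bytes so the value can be decoded later.

```rust
/// Append `value` to `buf` in big-endian order without its leading zero bytes.
/// Returns how many bytes were written; the caller must remember this length.
fn encode_compact_u64(value: u64, buf: &mut Vec<u8>) -> usize {
    // Whole leading zero bytes (0..=8); the value 0 encodes to zero bytes.
    let leading = value.leading_zeros() as usize / 8;
    let bytes = value.to_be_bytes();
    buf.extend_from_slice(&bytes[leading..]);
    8 - leading
}

/// Read back a value written by `encode_compact_u64`, given its stored length.
/// Returns the value and the remaining, unread part of the buffer.
fn decode_compact_u64(buf: &[u8], len: usize) -> (u64, &[u8]) {
    let mut bytes = [0u8; 8];
    // Right-align the stored bytes so the dropped leading zeros are restored.
    bytes[8 - len..].copy_from_slice(&buf[..len]);
    (u64::from_be_bytes(bytes), &buf[len..])
}
```

A small value such as `0x0102` takes two bytes instead of eight, which adds up when most stored integers (nonces, gas values, block numbers) are far below the type's maximum.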

**DB Abstractions**

The [database trait](https://github.com/paradigmxyz/reth/blob/e158542d31bf576e8a6b6e61337b62f9839734cf/crates/storage/db/src/abstraction/database.rs#L12) is the fundamental abstraction. It hands out either read-only or read/write database transactions on the low-level database.

The [cursor](https://github.com/paradigmxyz/reth/blob/e158542d31bf576e8a6b6e61337b62f9839734cf/crates/storage/db/src/abstraction/cursor.rs#L13) enables iteration over the values in the database and offers a fast way to retrieve transactions or blocks. It is particularly useful when calculating Merkle roots, as sequential value access is significantly faster than random seeking. In addition, when there is a large amount of data to write, sorting it first and then writing it in order is much faster. The cursor lets us take advantage of this by providing convenient functions for writing either sorted or unsorted data.
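
The difference between sorted and unsorted writes, and between seeking once and then walking sequentially, can be illustrated with a toy in-memory table. This is a sketch only: `SortedTable` and its methods are hypothetical and stand in for a real B-tree-backed store; they are not Reth's cursor traits.

```rust
/// Toy key-value table backed by a Vec that is always kept in key order.
struct SortedTable {
    entries: Vec<(u64, String)>,
}

impl SortedTable {
    fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Fast path for pre-sorted input: no search is needed, but the key
    /// must be strictly greater than the last key already stored.
    fn append(&mut self, key: u64, value: String) -> Result<(), String> {
        if let Some((last, _)) = self.entries.last() {
            if key <= *last {
                return Err(format!("key {key} is not greater than last key {last}"));
            }
        }
        self.entries.push((key, value));
        Ok(())
    }

    /// General path for unsorted input: search for the position,
    /// then overwrite an existing entry or insert a new one.
    fn upsert(&mut self, key: u64, value: String) {
        match self.entries.binary_search_by_key(&key, |(k, _)| *k) {
            Ok(i) => self.entries[i].1 = value,
            Err(i) => self.entries.insert(i, (key, value)),
        }
    }

    /// Seek once to the first entry with key >= `start`,
    /// then iterate sequentially from there.
    fn walk(&self, start: u64) -> impl Iterator<Item = &(u64, String)> + '_ {
        let i = self.entries.partition_point(|(k, _)| *k < start);
        self.entries[i..].iter()
    }
}
```

In this toy, `append` is constant time per entry while `upsert` pays for a search (and possibly a shift) on every call, which mirrors why pre-sorting a large batch before writing tends to be worthwhile.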

**Tables**

| Table | Key | Value | Description |
| -------------------------- | ----------------------------------- | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| CanonicalHeaders | BlockNumber | HeaderHash | Stores the header hash indexed by block number. |
| HeaderTerminalDifficulties | BlockNumber | CompactU256 | Is responsible for storing the total difficulty value obtained from a block header. Although it is commonly employed in proof-of-work systems, it is currently not in use. |
| HeaderNumbers | BlockHash | BlockNumber | A utility table that stores the block number associated with a header hash. |
| Headers | BlockNumber | Header | Stores header bodies. |
| BlockBodyIndices | BlockNumber | StoredBlockBodyIndices | Stores the block body indices, which contain the transaction indexes and the transaction count. This allows us to determine which transaction numbers are included in the block. |
| BlockOmmers | BlockNumber | StoredBlockOmmers | Stores the uncles/ommers of the block, which are the side blocks that got included (used in proof-of-work) |
| BlockWithdrawals | BlockNumber | StoredBlockWithdrawals | Stores the block withdrawals. |
| Transactions | TxNumber | TransactionSignedNoHash | Stores the transaction body indexed by the ordinary transaction number. This information includes the total number of transactions and the number of transactions that were executed. It also lets us retrieve a single transaction easily. |
| TransactionHashNumbers | TxHash | TxNumber | Stores the transaction number indexed by the transaction hash. |
| TransactionBlocks | TxNumber | BlockNumber | Maps the highest transaction number of a block to that block's number. Allows us to fetch the block number for a given transaction number. |
| Receipts | TxNumber | Receipt | Stores transaction receipts indexed by transaction number. |
| Bytecodes | B256 | Bytecode | Stores the bytecode of all smart contracts. Multiple accounts can have identical bytecode, so a reference counter is needed to track how many accounts share an entry. |
| PlainAccountState | Address | Account | Stores the current state of an [Account](https://github.com/paradigmxyz/reth/blob/fb960fb3e45e11c24125ccb4bd93f2e2e21ce271/crates/primitives/src/account.rs#L15), the plain state, indexed by the Account address. The plain state is updated during the execution stage. |
| PlainStorageState | Address , SubKey = B256 | StorageEntry | Stores the current value of a storage key and the sub-key is the hash of the storage key. Concerning sub-keys: mdbx supports dup tables (duplicate values inside tables), which can lead to faster access to some values. |
| AccountsHistory | ShardedKey<Address> | BlockNumberList | Stores pointers to the block changesets that contain modifications for each account key. Each account is associated with a record of modifications, represented as a list of blocks. For example, if we want to retrieve the account balance at block 1 million, we need to determine the next block where the account was modified. If the next modification occurs at block number 1 million and 1, we need to fetch the set of changes for that account from the tables below. |
| StoragesHistory | StorageShardedKey | BlockNumberList | Stores pointers to block number changeset with changes for each storage key. This allows us to index the change sets and find the change that happened in the history |
| AccountChangeSets | BlockNumber, SubKey = Address | AccountBeforeTx | Stores the state of an account prior to any transaction that alters it, such as when the account is created, self-destructed, accessed while empty, or when its balance or nonce is modified. Therefore, for each block number and account address, we have the previous value of the account. |
| StorageChangeSets | BlockNumberAddress , SubKey = B256 | StorageEntry | Preserves the state of a storage prior to a specific transaction altering it. Therefore, for each block number, account address and sub-key as the storage key, we can obtain the previous storage value. The execution stage modifies both this table and the one above it. These tables are used for the merkle trie calculations, which require the values to be incremental. They are also used for any history tracing performed by the JSON-RPC API. |
| HashedAccounts | B256 | Account | Stores the current state of an account indexed by keccak256(Address). This table is in preparation for merkleization and calculation of the state root. This table and the one below are used by the merkle trie: the first calculation of the merkle trie needs sorted hashed addresses. |
| HashedStorages | B256, SubKey = B256 | StorageEntry | Stores the current storage values indexed by keccak256(Address), with the hash of the storage key, keccak256(key), as the sub-key. Like the table above, it is useful for merkleization because the hashed addresses/keys are sorted. |
| AccountsTrie | StoredNibbles | StoredBranchNode | Stores the current state's Merkle Patricia Tree. |
| StoragesTrie | B256 , SubKey = StoredNibblesSubKey | StorageTrieEntry | From HashedAddress => NibblesSubKey => Intermediate value. This table and the one above store the nodes needed for the merkle trie calculation. |
| TransactionSenders | TxNumber | Address | Stores the transaction sender for each transaction. It is needed to speed up execution stage and allows fetching the signer without doing the computationally expensive transaction signer recovery |
| StageCheckpoints | StageId | StageCheckpoint | Stores the highest synced block number and stage-specific checkpoint of each stage. |
| StageCheckpointProgresses | StageId | Vec<u8> | Stores arbitrary data to keep track of a stage's first-sync progress. This table and the one above allow us to know where a stage stopped and to determine what to do next. |
| PruneCheckpoints | PruneSegment | PruneCheckpoint | Records the maximum pruned block number and the pruning mode for each segment of the pruning process. This enables us to determine the extent to which we have pruned our data, involving the elimination of change sets and their corresponding indexes to eliminate historical data, leaving only the most recent data to be retrieved i.e. fetching the tip. |
| VersionHistory | u64 | ClientVersion | Stores the history of client versions that have accessed the database with write privileges indexed by unix timestamp seconds. |
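
The historical lookup described for `AccountsHistory` and `AccountChangeSets` can be sketched as follows. This is a simplified model under stated assumptions, not Reth's provider code: `HistoricalState` and `balance_after_block` are hypothetical, keys are not sharded, and only a balance is stored instead of a full account.

```rust
use std::collections::{BTreeMap, HashMap};

type Address = [u8; 20];
type BlockNumber = u64;

/// Simplified stand-ins for three of the tables above.
struct HistoricalState {
    /// Like PlainAccountState: the current balance per address.
    plain_balances: HashMap<Address, u128>,
    /// Like AccountsHistory: sorted block numbers at which the account changed.
    accounts_history: HashMap<Address, Vec<BlockNumber>>,
    /// Like AccountChangeSets: the balance *before* the given block changed it.
    account_change_sets: BTreeMap<(BlockNumber, Address), u128>,
}

impl HistoricalState {
    /// Balance of `address` as it was after `block` was executed.
    fn balance_after_block(&self, address: Address, block: BlockNumber) -> Option<u128> {
        let changed_at = self
            .accounts_history
            .get(&address)
            .map(Vec::as_slice)
            .unwrap_or(&[]);
        // Index of the first modification strictly after `block`.
        let idx = changed_at.partition_point(|b| *b <= block);
        match changed_at.get(idx) {
            // The change set of the next modification holds the value
            // that was still valid at `block`.
            Some(next_change) => self
                .account_change_sets
                .get(&(*next_change, address))
                .copied(),
            // No later modification: the plain (current) state is the answer.
            None => self.plain_balances.get(&address).copied(),
        }
    }
}
```

For an account changed at blocks 10 and 20, a query for block 15 finds block 20 as the next modification and returns the value recorded in that block's change set, which matches the "block 1 million" walkthrough in the `AccountsHistory` row.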