cpp-ethereum uses three databases, each of which is essentially a key-value store (LevelDB or RocksDB, depending on build settings). Their physical disk locations are as follows:
- Blocks - {ETH_DATABASE_DIR}/{GENESIS_HASH}/blocks
- Extras - {ETH_DATABASE_DIR}/{GENESIS_HASH}/{DATABASE_VERSION}/extras
- State - {ETH_DATABASE_DIR}/{GENESIS_HASH}/{DATABASE_VERSION}/state
where
{ETH_DATABASE_DIR} - base cpp-ethereum data directory
{GENESIS_HASH} - hex representation of the first 4 bytes of the genesis block hash (d4e56740 for main net, 41941023 for Ropsten)
{DATABASE_VERSION} - encoded current version of the database layout (12041 as of the time of this writing)
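The path layout above can be sketched in a few lines of C++. This is only an illustration: the struct DatabasePaths and the function makeDatabasePaths are hypothetical names invented for this example and are not part of cpp-ethereum, and the sketch assumes "/" as the path separator.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type, not part of cpp-ethereum: the three database
// locations derived from the layout described above.
struct DatabasePaths
{
    std::string blocks;
    std::string extras;
    std::string state;
};

// baseDir       - value of {ETH_DATABASE_DIR}
// genesisPrefix - hex of the first 4 bytes of the genesis block hash
// dbVersion     - value of {DATABASE_VERSION}
DatabasePaths makeDatabasePaths(
    std::string const& baseDir, std::string const& genesisPrefix, unsigned dbVersion)
{
    // 4 bytes correspond to 8 hex characters.
    assert(genesisPrefix.size() == 8);

    std::string const chainDir = baseDir + "/" + genesisPrefix;
    std::string const versionDir = chainDir + "/" + std::to_string(dbVersion);

    // Blocks sits directly under the chain directory; Extras and State
    // additionally depend on the database version.
    return {chainDir + "/blocks", versionDir + "/extras", versionDir + "/state"};
}
```

For example, with a base directory of /data, the main net prefix d4e56740 and version 12041, the Blocks path is /data/d4e56740/blocks, while Extras and State live under /data/d4e56740/12041/.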
The Blocks database is the blockchain storage; the only thing stored here is the binary (RLP-encoded) data of the blocks.
Every record is:
blockHash => blockData
Low-level access to both the Blocks and Extras databases is encapsulated in the BlockChain class.
The Extras database holds additional data to support efficient queries to the blockchain data.
To distinguish between the types of records, a byte-long constant is concatenated to the keys; the + sign in the following description denotes this concatenation.
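As a rough sketch of this key scheme, the snippet below appends a one-byte type constant to a 32-byte hash and computes the chunkId used for the ExtraBlocksBlooms records. The constant values and the chunkId formula are the ones given in the record descriptions in this section; the names Hash32, makeExtrasKey and chunkId are hypothetical helpers written for this example, not the BlockChain class API, and the byte layout of blockNumber and chunkId keys is deliberately left out because the text does not specify it.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Record type constants, with the values given in the record descriptions.
enum ExtraType : std::uint8_t
{
    ExtraDetails = 0,
    ExtraBlockHash = 1,
    ExtraTransactionAddress = 2,
    ExtraLogBlooms = 3,
    ExtraReceipts = 4,
    ExtraBlocksBlooms = 5
};

// Hypothetical stand-in for a 32-byte block or transaction hash.
using Hash32 = std::array<std::uint8_t, 32>;

// Builds "hash + type": the hash bytes followed by the one-byte constant.
std::vector<std::uint8_t> makeExtrasKey(Hash32 const& hash, ExtraType type)
{
    std::vector<std::uint8_t> key(hash.begin(), hash.end());
    key.push_back(static_cast<std::uint8_t>(type));
    return key;
}

// chunkId = index * 255 + level, the formula given for ExtraBlocksBlooms records.
std::uint64_t chunkId(unsigned level, unsigned index)
{
    // Precondition of this sketch (an assumption, not taken from the source):
    // level stays below 255 so that distinct (level, index) pairs never collide.
    assert(level < 255);
    return static_cast<std::uint64_t>(index) * 255 + level;
}
```

Because the type byte is part of the key, records of different kinds for the same block hash (details, log blooms, receipts) never overwrite each other even though they share one database.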
- For each block stored in the Blocks DB, Extras has the following records:
blockHash + ExtraDetails => rlp(number, totalDifficulty, parentHash, rlp(childrenHashes)) // ExtraDetails = 0
blockHash + ExtraLogBlooms => rlp(blockLogBlooms) // ExtraLogBlooms = 3
blockHash + ExtraReceipts => rlp(receipts) // ExtraReceipts = 4
blockNumber + ExtraBlockHash => blockHash // ExtraBlockHash = 1
- For each transaction in the blockchain, Extras has the following records:
transactionHash + ExtraTransactionAddress => rlp(blockHash, transactionIndex) // ExtraTransactionAddress = 2
- Records storing log blooms for a number of blocks at once have the form:
chunkId + ExtraBlocksBlooms => blooms // ExtraBlocksBlooms = 5
where chunkId = index * 255 + level. See the comment on the BlockChain::blocksBlooms() method for details.
- Additional records, one instance of each:
"best" => lastBlockHash // best block of the canonical chain
"chainStart" => firstBlockHash // used when we don't have the full chain, for example after snapshot import
The State database holds the data representing the full Ethereum state (i.e. all the accounts). The state data forms a Merkle Patricia Trie, and the database stores the nodes of this trie.
- Nodes of the trie for the mapping sha3(address) => accountData, where according to the Yellow Paper accountData = rlp(nonce, balance, storageRoot, codeHash).
- For each account with non-empty storage there is a storage trie with nodes for the mapping sha3(key) => value.
- For each account with non-empty code, the code is stored separately, outside the tries: sha3(code) => code.
- For each key of all the tries above, the mapping of the sha3 hash to its preimage (address or storage key) is stored: hash + 255 => preimage (+ is concatenation).
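The preimage records in the last item can be illustrated with a small in-memory model. This is a sketch under stated assumptions: std::map stands in for LevelDB/RocksDB, the hash is passed in precomputed (no sha3 implementation is included), and the names KeyValueStore, makePreimageKey, storePreimage and lookupPreimage are hypothetical, not taken from OverlayDB or any other cpp-ethereum class.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
using Hash32 = std::array<std::uint8_t, 32>;   // hypothetical stand-in for a sha3 hash
using KeyValueStore = std::map<Bytes, Bytes>;  // in-memory stand-in for the real database

// Builds the "hash + 255" key: the 32 hash bytes followed by the byte 255.
Bytes makePreimageKey(Hash32 const& hash)
{
    Bytes key(hash.begin(), hash.end());
    key.push_back(255);
    return key;
}

// Records that `preimage` (an address or a storage key) hashes to `hash`.
void storePreimage(KeyValueStore& db, Hash32 const& hash, Bytes const& preimage)
{
    // Sanity check specific to this sketch: it assumes 20-byte addresses
    // and 32-byte storage keys.
    assert(preimage.size() == 20 || preimage.size() == 32);
    db[makePreimageKey(hash)] = preimage;
}

// Looks up the preimage for `hash`; returns false if none is recorded.
bool lookupPreimage(KeyValueStore const& db, Hash32 const& hash, Bytes& preimageOut)
{
    auto const it = db.find(makePreimageKey(hash));
    if (it == db.end())
        return false;
    preimageOut = it->second;
    return true;
}
```

The extra byte makes a preimage key one byte longer than a plain hash key, so in this model a preimage record cannot be confused with a record keyed by the hash alone.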
For the code managing the state, see the State class (also note the free function commit there). The Merkle Patricia Trie implementation is in TrieDB.h. For lower-level code accessing the database itself, see OverlayDB and MemoryDB.