[reshardingV3] State ShardUIdMapping #12084

Draft
wants to merge 2 commits into
base: master
staffik
Contributor

@staffik staffik commented Sep 12, 2024

Tracking issue: #12050

I would like to collect early feedback on the first steps toward implementing a mapping strategy for State in ReshardingV3.
I went through all references to DbCol::State in the code, excluding tests for now.

Update: in-memory mapping

Digging more into the code, it turns out we construct TrieStorage very often (e.g. every time we apply a chunk):

Arc::new(TrieDBStorage::new(self.0.store.clone(), shard_uid))

It can be initialized with a shared reference to TrieCache, which is kept in ShardTries and protected by a mutex:

TrieCachingStorage::new(self.0.store.clone(), cache, shard_uid, is_view, prefetch_api)

I used a similar approach: a StateReader is kept in ShardTries and used to create each new instance of TrieStorage.
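
The sharing pattern described here can be sketched roughly as follows. This is only an illustration: the field layout, the `RwLock`, the tuple `ShardUId` alias, and the helper names `new_storage`, `set_parent`, and `parent_of` are assumptions made for this sketch, not the types or methods used in the PR.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a shard UID: (version, shard_id).
pub type ShardUId = (u32, u32);

/// Hypothetical reader shared by every storage instance.
#[derive(Default)]
pub struct StateReader {
    parent: RwLock<HashMap<ShardUId, ShardUId>>,
}

impl StateReader {
    pub fn set_parent(&self, child: ShardUId, parent: ShardUId) {
        self.parent.write().unwrap().insert(child, parent);
    }

    pub fn parent_of(&self, child: ShardUId) -> Option<ShardUId> {
        self.parent.read().unwrap().get(&child).copied()
    }
}

/// Cheap-to-construct storage handle; it only clones the `Arc`.
pub struct TrieStorage {
    pub shard_uid: ShardUId,
    pub reader: Arc<StateReader>,
}

/// Owner of the single shared `StateReader`.
pub struct ShardTries {
    reader: Arc<StateReader>,
}

impl ShardTries {
    pub fn new() -> Self {
        Self { reader: Arc::new(StateReader::default()) }
    }

    /// Called often (e.g. per chunk); no mapping data is copied here.
    pub fn new_storage(&self, shard_uid: ShardUId) -> TrieStorage {
        TrieStorage { shard_uid, reader: Arc::clone(&self.reader) }
    }
}
```

Because every storage holds the same `Arc`, an update made through one handle is visible through all the others, which is what makes frequent construction cheap.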

StateReader needs to know the resharding tree so that it can find the parent shard UID when it cannot find a node in the database.
It keeps the resharding tree as a hashmap. Alternatively, it could store the list of ancestors for each shard; that would use O(S^2) memory, which is also fine for the near future.
The problem is that the resharding history might not be easily accessible, see #12084 (comment).
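
A minimal sketch of the fallback lookup, assuming the resharding tree is a child-to-parent hashmap. The simplified `ShardUId` struct, the in-memory `nodes` map standing in for the State column, and the method names are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a shard UID (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ShardUId {
    pub version: u32,
    pub shard_id: u32,
}

/// Hypothetical reader: nodes keyed by (shard, node hash), plus the
/// resharding tree stored as a child -> parent map.
#[derive(Default)]
pub struct StateReader {
    nodes: HashMap<(ShardUId, [u8; 32]), Vec<u8>>,
    parent: HashMap<ShardUId, ShardUId>,
}

impl StateReader {
    pub fn insert_node(&mut self, shard: ShardUId, hash: [u8; 32], value: Vec<u8>) {
        self.nodes.insert((shard, hash), value);
    }

    pub fn record_split(&mut self, parent: ShardUId, child: ShardUId) {
        self.parent.insert(child, parent);
    }

    /// Look the node up under `shard` first, then under each ancestor in turn.
    /// Assumes the parent map is a tree (no cycles).
    pub fn get(&self, shard: ShardUId, hash: &[u8; 32]) -> Option<&Vec<u8>> {
        let mut current = Some(shard);
        while let Some(uid) = current {
            if let Some(value) = self.nodes.get(&(uid, *hash)) {
                return Some(value);
            }
            current = self.parent.get(&uid).copied();
        }
        None
    }
}
```

Storing only the direct parent keeps memory linear in the number of shards; the ancestor-list alternative would trade that for a lookup that needs no pointer chasing.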

What to do?

One idea is to add a db column that stores the resharding parent shard_uid and to update it on each resharding event.
It could be initially empty, since we do not need the resharding history from before ReshardingV3 lands.
There are many places in the code where we only have access to the store (not the epoch manager) and we need the resharding history, so I think this idea is the way to go.
It would also simplify the code, as StateReader could be initially empty and would not need to know the shard_uids and the resharding history upfront.
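
To make the idea concrete, here is a rough sketch of such a column, modelled as an in-memory `BTreeMap`. The `ParentColumn` type, the 8-byte little-endian key encoding, and the `on_resharding` and `ancestors` helpers are assumptions for illustration; they are not an existing nearcore API.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a shard UID (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ShardUId {
    pub version: u32,
    pub shard_id: u32,
}

impl ShardUId {
    /// Assumed encoding: version then shard id, both little-endian.
    fn to_bytes(self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.version.to_le_bytes());
        out[4..].copy_from_slice(&self.shard_id.to_le_bytes());
        out
    }

    fn from_bytes(bytes: [u8; 8]) -> Self {
        let [a, b, c, d, e, f, g, h] = bytes;
        Self {
            version: u32::from_le_bytes([a, b, c, d]),
            shard_id: u32::from_le_bytes([e, f, g, h]),
        }
    }
}

/// Hypothetical column: child shard uid bytes -> parent shard uid bytes.
/// It starts empty; entries appear only when a resharding happens.
#[derive(Default)]
pub struct ParentColumn(BTreeMap<[u8; 8], [u8; 8]>);

impl ParentColumn {
    /// Record the parent for every child created by a resharding event.
    pub fn on_resharding(&mut self, parent: ShardUId, children: &[ShardUId]) {
        for child in children {
            self.0.insert(child.to_bytes(), parent.to_bytes());
        }
    }

    /// Ancestors of `shard`, nearest first, using only the column contents.
    /// Assumes the stored links form a tree (no cycles).
    pub fn ancestors(&self, shard: ShardUId) -> Vec<ShardUId> {
        let mut result = Vec::new();
        let mut key = shard.to_bytes();
        while let Some(parent) = self.0.get(&key) {
            result.push(ShardUId::from_bytes(*parent));
            key = *parent;
        }
        result
    }
}
```

Since the history is recovered purely from the column, code that only has a store handle would not need the epoch manager to find a shard's ancestors.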

Old version: keeping the mapping in the database:

Summary:

  • Added persistently stored DbCol::ShardUIdMapping.
  • Used MappedShardUId wrapper in places where we expect the mapped value (where shard UID is used as a db key prefix for the State column).
  • Added map_shard_uid to read the mapping from db. For now, it never fails and it falls back to the input shard UID.
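
The summary above could look roughly like this in code. The PR names `MappedShardUId` and `map_shard_uid`; everything else here (the simplified `ShardUId`, the `HashMap` standing in for `DbCol::ShardUIdMapping`, the function signature, and the `state_key` helper) is an assumption for the sake of a self-contained sketch.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a shard UID (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ShardUId {
    pub version: u32,
    pub shard_id: u32,
}

/// Newtype marking a shard UID that has already been mapped and is
/// therefore safe to use as a State key prefix.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct MappedShardUId(pub ShardUId);

/// In-memory stand-in for the persistent mapping column.
pub type ShardUIdMapping = HashMap<ShardUId, ShardUId>;

/// Never fails: falls back to the input shard UID when no entry exists.
pub fn map_shard_uid(mapping: &ShardUIdMapping, shard_uid: ShardUId) -> MappedShardUId {
    MappedShardUId(mapping.get(&shard_uid).copied().unwrap_or(shard_uid))
}

/// Hypothetical key builder: 8-byte mapped prefix followed by the node hash.
/// Taking `MappedShardUId` makes it a compile error to pass an unmapped UID.
pub fn state_key(prefix: MappedShardUId, node_hash: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(40);
    key.extend_from_slice(&prefix.0.version.to_le_bytes());
    key.extend_from_slice(&prefix.0.shard_id.to_le_bytes());
    key.extend_from_slice(node_hash);
    key
}
```

The wrapper type carries no runtime cost; its only job is to let the compiler distinguish "raw" shard UIDs from ones that have gone through the mapping.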

Next steps:

  • Handle updating the mapping on a resharding or state sync event.
  • State clean up (e.g. gc parent state when it is no longer referenced by any child).
  • Tests.

@staffik staffik added the A-resharding Area: State resharding label Sep 12, 2024
@staffik staffik changed the title [reshardingV3] [reshardingV3] State ShardUIdMapping Sep 12, 2024
Comment on lines 216 to 221
- let (start, end) = subtree_to_load.to_iter_range(self.shard_uid);
+ let (start, end) = subtree_to_load.to_iter_range(self.shard_uid_db_prefix.0);

// Load all the keys in this range from the FlatState column.
Contributor Author

Not sure about that. It might read the entire parent shard while we only want the child shard.
It also reads FlatState, so we might want to synchronize on changes here.

Contributor

Yeah, agreed, this seems fragile.

Taking a step back, what is our plan for loading memtrie post-resharding? Perhaps we can rely on that and panic here if shard_uid != mapped_shard_uid.
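
The concern in this thread can be illustrated with a toy ordered key-value column. In this sketch, the `BTreeMap`, the 8-byte prefixes, and the helpers `prefix_range`, `scan_prefix`, and `checked_prefix` are all hypothetical. The guard also returns an error rather than panicking as suggested above, purely to keep the example easy to test.

```rust
use std::collections::BTreeMap;

/// Half-open key range `[start, end)` covering every key that starts with
/// `prefix`. `end` is `None` when the prefix is empty or all 0xFF bytes.
pub fn prefix_range(prefix: &[u8]) -> (Vec<u8>, Option<Vec<u8>>) {
    let mut end = prefix.to_vec();
    while let Some(last) = end.pop() {
        if last < u8::MAX {
            end.push(last + 1);
            return (prefix.to_vec(), Some(end));
        }
    }
    (prefix.to_vec(), None)
}

/// Collect all keys stored under `prefix`. If `prefix` is the parent's
/// (mapped) prefix, this returns keys belonging to every child shard.
pub fn scan_prefix(db: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Vec<Vec<u8>> {
    let (start, end) = prefix_range(prefix);
    match end {
        Some(end) => db.range(start..end).map(|(k, _)| k.clone()).collect(),
        None => db.range(start..).map(|(k, _)| k.clone()).collect(),
    }
}

/// Guard in the spirit of the review suggestion: refuse to scan when the
/// shard is mapped to a different prefix.
pub fn checked_prefix(shard_uid: [u8; 8], mapped: [u8; 8]) -> Result<[u8; 8], String> {
    if shard_uid == mapped {
        Ok(mapped)
    } else {
        Err(format!("shard {:?} is mapped to {:?}; refusing range scan", shard_uid, mapped))
    }
}
```

A prefix scan has no way to tell which child a key belongs to, so once two children share the parent's prefix the scan over-reads unless it is either guarded or post-filtered by account range.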

@staffik
Contributor Author

staffik commented Sep 12, 2024

There is

scan_db_column(
    col: &str,
    lower_bound: Option<&[u8]>,
    upper_bound: Option<&[u8]>,
    store: Store)

and it might not be possible to scan only the child shard with it.
It is only a debug tool, so we will probably have to live with that.


codecov bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 80.22388% with 53 lines in your changes missing coverage. Please review.

Project coverage is 71.54%. Comparing base (3d0fd26) to head (8a4ffea).

Files with missing lines                              Patch %   Lines
core/store/src/trie/state_reader.rs                   83.11%    10 Missing and 3 partials ⚠️
nearcore/src/entity_debug.rs                          0.00%     9 Missing ⚠️
tools/state-viewer/src/commands.rs                    0.00%     7 Missing ⚠️
core/store/src/trie/from_flat.rs                      0.00%     5 Missing ⚠️
tools/database/src/analyze_contract_sizes.rs          0.00%     5 Missing ⚠️
tools/database/src/state_perf.rs                      0.00%     5 Missing ⚠️
tools/database/src/analyze_delayed_receipt.rs         0.00%     2 Missing ⚠️
tools/fork-network/src/cli.rs                         0.00%     2 Missing ⚠️
tools/state-viewer/src/trie_iteration_benchmark.rs    0.00%     2 Missing ⚠️
core/store/src/flat/store_helper.rs                   66.66%    1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12084      +/-   ##
==========================================
+ Coverage   71.51%   71.54%   +0.02%     
==========================================
  Files         818      819       +1     
  Lines      164494   164637     +143     
  Branches   164494   164637     +143     
==========================================
+ Hits       117644   117782     +138     
+ Misses      41713    41712       -1     
- Partials     5137     5143       +6     
Flag                      Coverage Δ
backward-compatibility    0.17% <0.00%> (-0.01%) ⬇️
db-migration              0.17% <0.00%> (-0.01%) ⬇️
genesis-check             1.26% <0.00%> (-0.01%) ⬇️
integration-tests         38.68% <54.85%> (+0.02%) ⬆️
linux                     71.33% <80.22%> (+0.01%) ⬆️
linux-nightly             71.11% <80.22%> (+0.01%) ⬆️
macos                     54.14% <74.51%> (+0.27%) ⬆️
pytests                   1.52% <0.00%> (-0.01%) ⬇️
sanity-checks             1.33% <0.00%> (-0.01%) ⬇️
unittests                 65.26% <76.11%> (+0.01%) ⬆️
upgradability             0.21% <0.00%> (-0.01%) ⬇️


Contributor

@wacban wacban left a comment


Looks good, I left a few comments!

core/store/src/cold_storage.rs (outdated, resolved)
core/store/src/flat/store_helper.rs (resolved)
core/store/src/trie/shard_tries.rs (outdated, resolved)
core/store/src/trie/shard_tries.rs (outdated, resolved)
Comment on lines 254 to 258
let shard_uid_db_prefix = match self.0.shard_uid_to_db_prefix.get(&shard_uid) {
    Some(mapped_shard_uid) => *mapped_shard_uid,
    // TODO(reshardingV3) Think about how None should be handled here.
    None => shard_uid.into(),
};
Contributor

This is at a somewhat higher level than I anticipated. I thought the mapping would happen closer to the db itself, perhaps in Store or Database. I'm not saying this is wrong, but I'm curious how you think those two approaches compare.

core/store/src/trie/trie_storage.rs (outdated, resolved)
Comment on lines 570 to 571
// TODO(reshardingV3) Think about how to handle it the best way
_ => shard_uid,
Contributor

+1, it is best to think about what the invariants are and handle this accordingly.

If the invariant is that the mapping should always be populated for all shards, then returning an error seems quite reasonable.
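
If that invariant were adopted, the fallback arm could become an error instead. A minimal sketch, assuming a hypothetical `MappingError` type and a `map_shard_uid_strict` helper (neither is part of the PR):

```rust
use std::collections::HashMap;
use std::fmt;

/// Simplified stand-in for a shard UID (hypothetical layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ShardUId {
    pub version: u32,
    pub shard_id: u32,
}

/// Hypothetical error raised when the "always populated" invariant is violated.
#[derive(Debug, PartialEq, Eq)]
pub enum MappingError {
    MissingMapping(ShardUId),
}

impl fmt::Display for MappingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MappingError::MissingMapping(uid) => {
                write!(f, "no ShardUId mapping for s{}.v{}", uid.shard_id, uid.version)
            }
        }
    }
}

impl std::error::Error for MappingError {}

/// Strict variant: a missing entry is reported instead of silently
/// falling back to the input shard UID.
pub fn map_shard_uid_strict(
    mapping: &HashMap<ShardUId, ShardUId>,
    shard_uid: ShardUId,
) -> Result<ShardUId, MappingError> {
    mapping
        .get(&shard_uid)
        .copied()
        .ok_or(MappingError::MissingMapping(shard_uid))
}
```

Under this variant, shards that were never resharded would need an identity entry (a shard mapped to itself) so that lookups for them still succeed.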

core/store/src/trie/trie_storage.rs (outdated, resolved)
@@ -101,10 +102,14 @@ impl NightshadeRuntime {
let trie_viewer = TrieViewer::new(trie_viewer_state_size_limit, max_gas_burnt_view);
let flat_storage_manager = FlatStorageManager::new(store.clone());
let shard_uids: Vec<_> = genesis_config.shard_layout.shard_uids().collect();
// TODO(reshardingV3) Recursively calculate resharding parents for `shard_uids`.
Contributor Author

@staffik staffik Sep 19, 2024


That would require iterating through previous epochs, which is a bit cumbersome; see this comment in the code:

// and (2) it is not easy to walk backwards from the last epoch; there's no
// "give me the previous epoch" query. So instead, we use block header's
// `next_epoch_id` to establish an epoch chain.
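
As a rough illustration of why this is cumbersome, the sketch below builds the missing "previous epoch" index by inverting forward links taken from block headers. It is heavily simplified: the integer `EpochId`, the two-field `Header`, and both helper functions are assumptions for this example and do not reflect the real header or epoch manager types.

```rust
use std::collections::HashMap;

/// Hypothetical epoch identifier; real epoch ids are hashes.
pub type EpochId = u64;

/// Hypothetical header carrying only the two fields needed here.
pub struct Header {
    pub epoch_id: EpochId,
    pub next_epoch_id: EpochId,
}

/// Invert the forward links (epoch -> next epoch) found in headers
/// into a backward index (epoch -> previous epoch).
pub fn previous_epochs(headers: &[Header]) -> HashMap<EpochId, EpochId> {
    headers.iter().map(|h| (h.next_epoch_id, h.epoch_id)).collect()
}

/// Walk backwards from `epoch` using the inverted index, newest first.
pub fn epochs_back_from(prev: &HashMap<EpochId, EpochId>, mut epoch: EpochId) -> Vec<EpochId> {
    let mut chain = vec![epoch];
    while let Some(&p) = prev.get(&epoch) {
        // Defensive stop in case the input links contain a cycle.
        if chain.contains(&p) {
            break;
        }
        chain.push(p);
        epoch = p;
    }
    chain
}
```

Building this index requires reading headers across the whole history of interest, which is part of why a dedicated parent column, populated at resharding time, looks simpler.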

Labels
A-resharding Area: State resharding
4 participants