-
Notifications
You must be signed in to change notification settings - Fork 665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(resharding): generic trie update retain, including trie storage #12301
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #12301 +/- ##
==========================================
+ Coverage 71.24% 71.26% +0.01%
==========================================
Files 838 838
Lines 169020 169257 +237
Branches 169020 169257 +237
==========================================
+ Hits 120424 120624 +200
- Misses 43351 43384 +33
- Partials 5245 5249 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
let mut child_memory_usage = 0; | ||
for handle in path.into_iter().rev() { | ||
// First, recompute memory usage, emulating the recursive descent. | ||
let TrieNodeWithSize { node, mut memory_usage } = memory.destroy(handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not have much context, but is it ok that we do memory.destroy(handle)
both here and inside squash_node()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha! Yeah, looking at the code, that's what I thought as well, but seems like we are calling memory.store_at
right after calling destroy which undoes this.
memory.store_at(handle, TrieNodeWithSize { node, memory_usage }); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aah crap, this along with duplication of memory.destroy(handle)
in squash_node
and missing else
case in if key_deleted
completely confused me, but after staring at this for 10 min, I think it makes sense
} | ||
} | ||
|
||
pub type UpdatedMemTrieNodeWithSize = GenericUpdatedTrieNodeWithSize<MemTrieNodeId, FlatStateValue>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming.... It seems like we don't use the size
parameter for MemTrieNodes anywhere? Could you please confirm this? In which case we can remove this type in favor of GenericUpdatedTrieNode
without size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like, I mean, get rid of type GenericUpdatedTrieNodeWithSize
, only continue to use type GenericUpdatedTrieNode
and implement size
separately for UpdatedTrieStorageNode
i.e. GenericUpdatedTrieNode<TrieStorageNodePtr, ValueHandle>
.
That should make the code easier to reason about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how to do it without adding comparable overhead.
The GenericUpdatedTrieNode
serves as a generic enum. Adding size to it is the same as implementing GenericUpdatedTrieNodeWithSize
.
If I look from GenericTrieUpdate
perspective, both node and memory has to be taken simultaneously if generic_get_node
is called, and "node with size" struct seems to be useful here again.
Could you be more specific on how to get rid of GenericUpdatedTrieNodeWithSize
?
The reason why we need a change in memtries is described in comment to MemTrieUpdate::generic_take_node
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me explore a bit myself on this and then suggest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aah crap, you're right! I was under the impression that memory calculation happens independently at the end of update/insert function, but seems like it's built into the function call. I suppose what we have currently is best even though I believe it's irrelevant for memtries? Or have I got that detail wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spoke offline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope we can get to generic_insert_delete soon!
Please do take a look at the Size
interface and check if we really need it for memtrie side of things?
core/store/src/trie/mem/updating.rs
Outdated
/// An updated node with its memory usage. | ||
/// Needed to recompute subtree function (memory usage) on the fly. | ||
#[derive(Debug, Clone, PartialEq, Eq)] | ||
pub struct GenericUpdatedTrieNodeWithSize<GenericTrieNodePtr, FlatStateValue> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is there anything conceptually linking GenericUpdatedTrieNodeWithSize
with FlatStateValue
(the enum?). I was expecting something like GenericValueHandle
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, overall structure looks great; didn't dig into details too much but I trust testing as well as other people who have reviewed this :)
core/store/src/trie/mem/updating.rs
Outdated
|
||
/// An updated node - a node that will eventually become an in-memory trie node. | ||
/// Trait for trie values to get their length. | ||
pub trait HasLength { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: perhaps name this something a little more specific, like HasValueLength; HasLength is a very generic name when reading the trait bound in another file.
node: GenericNodeOrIndex<GenericTrieNodePtr>, | ||
) -> Result<GenericUpdatedNodeId, StorageError>; | ||
|
||
/// Takes a node from the set of updated nodes, setting it to None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: It might help readers to add a high level summary of the slot mechanism to the trait's documentation. This stuff itself may not be so confusing but when I was reading the slot mechanism for the first time before, I was very confused when I also had to understand a bunch of other things in the update logic.
/// | ||
/// Note that it has nothing to do with `TrieUpdate` used for runtime to store | ||
/// temporary state changes (TODO(#12324) - consider renaming it). | ||
pub(crate) trait GenericTrieUpdate<'a, GenericTrieNodePtr, GenericValueHandle> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice comment!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
Starting #12324.
Implement
retain_multi_range
for TrieStorage (partial or disk trie), by introducingGenericTrieUpdate
which works for both trie storage and memtrie.squash_node
GenericTrieUpdate
Some notes on the structure.
GenericTrieUpdate
abstracts the way the updated nodes are structured. It doesn't matter what is the node type - it is abstracted toGenericTrieNodePtr
.FlatStateValue
is the name for the value update. We can migrate everything to actualFlatStateValue
, but for now trie storage usesValueHandle
and I don't want to touch it yet.The methods of
GenericTrieUpdate
define how to take/put nodes to memory.GenericNodeOrIndex
stores a reference to old or updated node. For memtrie old node it isMemTrieNodeId
, for triestorage it isCryptoHash
.Finally, we are able to call
generic_retain_multi_range_recursive
which usesGenericTrieUpdate
and recursively descends to the trie, regardless on how it is stored.If we agree on this, I can proceed later with migrating
insert
anddelete
as well, which will eventually lead to removing copypaste code.