test(state-sync): shard swap in single shard tracking #12108
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #12108 +/- ##
==========================================
- Coverage 71.54% 71.54% -0.01%
==========================================
Files 819 819
Lines 164720 164720
Branches 164720 164720
==========================================
- Hits 117843 117841 -2
- Misses 41733 41738 +5
+ Partials 5144 5141 -3
def _prepare_cluster(self, with_rpc=False, shuffle_shard_assignment=False):
    (node_config_dump,
     node_config_sync) = state_sync_lib.get_state_sync_configs_pair(
This also enables state dumping to external storage (local filesystem), so we may need a different function that generates the state sync configs without enabling centralized state dump. I guess this is what you want to test, right?
We still need the headers to be dumped until we find a way to share them between nodes that track a subset of shards. Same for this question here.
        tracked_shards=None)

    # State snapshot is disabled for dumper. We only want to dump the headers.
    node_config_dump["store.state_snapshot_enabled"] = False
I do not get it, will the headers still be dumped to and retrieved from centralized storage?
I would expect that in decentralized state sync there is no specific dumper role?
In the 2.3 release we will decentralize just the state part distribution, which represents the vast majority of the upload/download traffic for state sync. Subsequently we will address the state headers. So for now we still need the dumper node putting the headers into the central storage.
LGTM, just added some comments to understand the config setup.
Thanks @VanBarbascu for getting this set up.
Currently the test is passing even if I break decentralized state sync requests because after some attempts the node falls back to the central storage and gets the part successfully from there. If I disable the download from central storage as well (so that all of state sync is broken) then the test does fail as expected.
If there's an easy way to make the dumper upload only the headers and not the parts, that would be nice in the short term. In the long run once we fully decentralize we could just get rid of the dumper.
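A minimal sketch of the fallback behaviour described above, to show why the test can pass while peer requests are broken. This is not nearcore's implementation: `fetch_part`, `peer_fetch`, `storage_fetch` and the attempt count are all hypothetical names and values chosen for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type: returns the part bytes, or None on failure.
Fetcher = Callable[[int], Optional[bytes]]


def fetch_part(part_id: int,
               peer_fetch: Fetcher,
               storage_fetch: Fetcher,
               max_peer_attempts: int = 3) -> Tuple[str, bytes]:
    """Try peers first, then fall back to central storage.

    Returns a (source, data) pair so a test can assert where the part
    actually came from. The attempt count is an assumption.
    """
    for _ in range(max_peer_attempts):
        data = peer_fetch(part_id)
        if data is not None:
            return "peer", data
    # All peer attempts failed: fall back to the central storage.
    data = storage_fetch(part_id)
    if data is not None:
        return "central_storage", data
    raise RuntimeError(f"part {part_id} unavailable from peers and storage")


def broken_peers(part_id: int) -> Optional[bytes]:
    # Simulates broken decentralized state sync requests.
    return None


# The dumper uploaded parts: the sync still succeeds, hiding the breakage.
storage_with_parts = {0: b"part-0"}
source, _ = fetch_part(0, broken_peers, storage_with_parts.get)
print("fetched from:", source)

# The dumper uploads headers only: broken peer requests now surface.
storage_headers_only = {}
try:
    fetch_part(0, broken_peers, storage_headers_only.get)
except RuntimeError as err:
    print("sync failed:", err)
```

Under these assumptions, a test that only checks "the node synced" cannot tell the two sources apart. Making the dumper skip the parts removes the fallback path, so a failure in peer-to-peer part requests fails the test.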
This is what I was aiming for when I set the dumper to skip snapshot:
Could there be a conflict between The default value for
b2d7a03 to e2ba7cb
Done! Now it should not dump the parts.
Pytest to check decentralised state sync of nodes tracking one shard.
Keep shard shuffling off until the implementation is done.
This test sets up a dumper node that dumps only the state sync headers.
Sets up 4 validator nodes, each tracking 1 shard.
Sets up an RPC node to handle the random traffic that changes the state.
Only validator nodes are allowed to share parts, by enabling the state snapshot on them.
Checks that the network can run for 6 epochs while shard shuffling is on.
Validator nodes are expected to download parts from each other.
Check http://127.0.0.1:3040/debug/pages/epoch_info for the validator assignment rotation.
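The node roles listed above can be sketched as a set of per-node config overrides. This is only an illustration under stated assumptions: `build_node_configs` is a hypothetical helper, and the `tracked_shard` key is a placeholder rather than a real nearcore config field. Only `store.state_snapshot_enabled` is taken from the diff in this PR.

```python
from typing import Any, Dict


def build_node_configs(num_validators: int = 4,
                       num_shards: int = 4) -> Dict[str, Dict[str, Any]]:
    """Return hypothetical per-node config overrides for the test cluster."""
    configs: Dict[str, Dict[str, Any]] = {}

    # Dumper: state snapshot disabled, so it dumps only the headers.
    configs["dumper"] = {"store.state_snapshot_enabled": False}

    # Validators: each tracks one shard and may share parts with peers.
    for i in range(num_validators):
        configs[f"validator{i}"] = {
            "store.state_snapshot_enabled": True,
            "tracked_shard": i % num_shards,  # placeholder key, assumption
        }

    # RPC node: handles the random traffic, does not share parts.
    configs["rpc"] = {"store.state_snapshot_enabled": False}
    return configs


configs = build_node_configs()
part_sharing_nodes = sorted(
    name for name, cfg in configs.items()
    if cfg["store.state_snapshot_enabled"])
print(part_sharing_nodes)
```

With the defaults, only the four validator entries have the state snapshot enabled, which matches the intent that parts come from validators and never from the dumper.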