
feat: wiring for bandwidth scheduler #12234

Merged
merged 46 commits into master (Oct 24, 2024)
Conversation

@jancionear (Contributor) commented Oct 16, 2024

Add the wiring needed for the bandwidth scheduler algorithm.

Changes:

  • Add a new ProtocolFeature - BandwidthScheduler; its protocol version is set to nightly.
  • Add a struct that holds the bandwidth requests generated by the shards (see the sketch after this list).
  • Propagate the bandwidth requests through the blockchain: put the generated bandwidth requests in the chunk headers and pass the previous bandwidth requests to the runtime.
  • Add a struct that represents the bandwidth scheduler state; it's stored in the trie and modified on every scheduler invocation.
  • Add a mock implementation of the bandwidth scheduler: it takes the previous bandwidth requests and the state and mocks the scheduler algorithm. It activates the request propagation logic and breaks some tests.
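For orientation, here is a rough sketch of the new request types. The field layout is illustrative only; the real definitions live in core/primitives/src/bandwidth_scheduler.rs and will differ:

```rust
// Illustrative only: simplified stand-ins for the structs added in
// core/primitives/src/bandwidth_scheduler.rs. At this point BandwidthRequest
// doesn't carry the requested values yet; those are added later.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct BandwidthRequest {
    /// Shard that the requesting shard wants to send receipts to.
    pub to_shard: u16,
}

/// All bandwidth requests generated by one shard during chunk application.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct BandwidthRequests {
    pub requests: Vec<BandwidthRequest>,
}
```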

Propagation of bandwidth requests

The flow of bandwidth requests looks as follows:

  • A chunk is applied and generates bandwidth requests. They are put in ApplyResult and ApplyChunkResult
  • The requests are taken from the apply result and put in ChunkExtra. ChunkExtra is persisted in the database
  • During chunk production, Client fetches ChunkExtra of the previous chunk and puts the bandwidth requests in chunk header
  • The produced chunks are included in the block
  • The new chunks are applied, their ApplyState contains bandwidth requests taken from all the chunk headers in the block that contains the applied chunks.
  • During the application, the bandwidth scheduler looks at the requests created at the previous height and grants bandwidth
  • Receipts are sent out
  • Then the chunk generates new bandwidth requests
  • etc

The flow is very similar to the one for congestion info.
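To make the one-height delay concrete, here is a toy model in plain Rust, with simple values standing in for the real ApplyResult/ChunkExtra/header types:

```rust
// Toy model of the request flow described above. The key point is the
// one-height delay: the scheduler at height h consumes the requests that
// were generated at height h - 1.
fn main() {
    // Requests "persisted in ChunkExtra" after applying the previous height.
    let mut pending_requests: Vec<u64> = Vec::new();
    for height in 1u64..=4 {
        // Chunk production at `height` copies last height's requests into the
        // chunk header; chunk application hands them to the scheduler.
        println!("height {height}: scheduler sees {pending_requests:?}");
        // Applying the chunk at `height` generates new requests, which are
        // stored and picked up at the next height.
        pending_requests = vec![height * 10];
    }
}
```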

Scheduler state

The bandwidth scheduler needs to keep some persistent state. In the future it'll track something like how much bandwidth each shard was granted recently, which will be used to maintain fairness. For now it's just mock data.
Scheduler state should always be the same on all shards: all shards start with the same scheduler state, apply the scheduler at the same heights with the same inputs, and always end up with the same scheduler state.
This means that the bandwidth scheduler also needs to be run for missing chunks. Luckily that can be easily achieved thanks to the existing apply_old_chunk infrastructure (all missing chunks are applied, which counts as "implicit state transitions").
The state_root will now change at every height, even when there are no receipts to be processed. This breaks some tests which assumed that the state root wouldn't change.
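A minimal sketch of that determinism requirement, with a made-up state type and a hash-based mock update (the real mock lives in runtime/runtime/src/bandwidth_scheduler/mod.rs and may differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Made-up stand-in for the persistent scheduler state stored in the trie.
#[derive(Clone, PartialEq, Debug, Hash)]
struct MockSchedulerState {
    mock_data: u64,
}

// Deterministic mock update: the new state depends only on the previous
// state and the bandwidth requests from the previous height, so every shard
// that runs it with the same inputs ends up with the same state.
fn run_mock_scheduler(state: &MockSchedulerState, requests: &[u64]) -> MockSchedulerState {
    let mut hasher = DefaultHasher::new();
    state.hash(&mut hasher);
    requests.hash(&mut hasher);
    MockSchedulerState { mock_data: hasher.finish() }
}

fn main() {
    let start = MockSchedulerState { mock_data: 0 };
    let requests = vec![1u64, 2, 3];
    // "Run on every shard" independently; all shards must end up agreeing.
    let per_shard: Vec<_> = (0..4).map(|_| run_mock_scheduler(&start, &requests)).collect();
    assert!(per_shard.windows(2).all(|w| w[0] == w[1]));
    println!("all shards agree: {:?}", per_shard[0]);
}
```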

The pull request is meant to be reviewed commit-by-commit; I tried to make the commit history nice.

Add the structs which will be used to represent bandwidth requests
generated by a shard. For now the BandwidthRequest doesn't have the
requested values; they will be added later.

Chunk headers will contain bandwidth requests generated during the
previous chunk application. We will collect bandwidth requests from all
the shards and use them in the bandwidth scheduler during chunk
application.

Bandwidth requests will be generated during chunk application and then
they'll be available in the ApplyResult. The result of chunk application
should keep the generated bandwidth requests.

ChunkExtra stores the results of chunk application in a persistent way.
Let's put the generated bandwidth requests there and then fetch them
when producing the next chunk.

Collect the bandwidth requests generated by all shards at the previous
height and expose them to the runtime. The runtime needs the requests to
run the bandwidth scheduler.

Add a struct that keeps the persistent state used by the bandwidth
scheduler.

Add a mock implementation of the bandwidth scheduler algorithm. The
bandwidth scheduler takes the current state and previous bandwidth
requests and generates bandwidth grants. The mock implementation takes
the inputs and generates deterministic state changes based on them, but
it doesn't generate the bandwidth grants yet. The mock implementation is
enough to activate the logic that propagates bandwidth requests
throughout the blockchain and break some tests.

This test assumed that the state root doesn't change when there are no
receipts, but this is no longer true. The bandwidth scheduler modifies
the state at every height, so now the state root changes every time.

state_viewer::apply_chunk has the ability to apply a chunk
when the block that contains the chunk isn't available.

Initially I passed empty bandwidth requests in the ApplyState,
as usually they're taken from the chunk headers in the block
that contains the applied chunk, and this block isn't available here.

But that breaks test_apply_chunk - a test which applies a chunk
in a normal way and compares the result with a chunk that was
applied without providing the block. It expects the state roots
to be the same, but that's not the case because the bandwidth
requests are different and bandwidth scheduler produces different state.

To deal with this we can try to fetch the original bandwidth requests
from the chunk extra of the previous chunks. It's an opportunistic
reconstruction - if the chunk extra is available it adds the requests
to the apply_state, if not it leaves them empty.
This is enough to fix the state root mismatch.
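Roughly, the opportunistic reconstruction could look like the following sketch; the helper and types are hypothetical, not the actual code in tools/state-viewer/src/apply_chunk.rs:

```rust
// Hypothetical sketch of the opportunistic reconstruction: if the previous
// chunk's ChunkExtra can be loaded, copy its bandwidth requests into the
// apply state; otherwise leave them empty and accept that the resulting
// state root may differ.
#[derive(Clone, Debug, Default)]
struct BandwidthRequestsLike(Vec<u64>);

struct ChunkExtraLike {
    bandwidth_requests: Option<BandwidthRequestsLike>,
}

fn reconstruct_requests(
    prev_chunk_extras: &[Option<ChunkExtraLike>], // indexed by shard, None if unavailable
) -> Vec<BandwidthRequestsLike> {
    prev_chunk_extras
        .iter()
        .map(|maybe_extra| {
            maybe_extra
                .as_ref()
                .and_then(|extra| extra.bandwidth_requests.clone())
                .unwrap_or_default()
        })
        .collect()
}
```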
This test creates a situation where the last few chunks
in an epoch are missing. It performs state sync, then
takes the state root of the first missing chunk on one node
and expects the state roots of all the missing chunks on the
other node to be the same as that first state root.
This breaks because bandwidth scheduler changes the state
at every height - even for missing chunks - so the state
root for later missing chunks is not the same as the state
root of the first missing chunk.
Fix the problem by comparing state roots of missing chunks
at the same heights.
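As an illustration only (a made-up helper, not the real test code), the fixed comparison keys the state roots by height on each node:

```rust
use std::collections::BTreeMap;

// Made-up helper illustrating the fixed comparison: instead of expecting all
// missing-chunk state roots to equal the first one, compare the state root
// recorded for each height on node A with the one at the same height on node B.
fn assert_same_roots_per_height(
    node_a: &BTreeMap<u64, [u8; 32]>,
    node_b: &BTreeMap<u64, [u8; 32]>,
) {
    for (height, root_a) in node_a {
        let root_b = node_b.get(height).expect("height missing on node B");
        assert_eq!(root_a, root_b, "state root mismatch at height {height}");
    }
}
```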
This test performs state sync and then does a function call on the synced node
to test that the sync worked.
The function call at the end of the test started failing with `MissingTrieValue`.
I suspect that the function call is done with the wrong state root - it worked previously,
when all the state roots were the same, as the chunks don't have any transactions,
but broke when bandwidth scheduler started changing the state at every height.
The `MissingTrieValue` error stops occurring when the state root is taken from
the previous block.
My understanding of state sync isn't very good, but I think this theory makes sense.

Add an extra check to ensure that the scheduler state stays the same on
all shards.
@jancionear jancionear requested a review from wacban October 16, 2024 13:49
@jancionear jancionear requested a review from a team as a code owner October 16, 2024 13:49
codecov bot commented Oct 16, 2024

Codecov Report

Attention: Patch coverage is 84.94624% with 70 lines in your changes missing coverage. Please review.

Project coverage is 71.56%. Comparing base (cd319ac) to head (ed32376).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
core/primitives/src/views.rs 73.50% 31 Missing ⚠️
chain/chain/src/validate.rs 54.83% 13 Missing and 1 partial ⚠️
runtime/runtime/src/bandwidth_scheduler/mod.rs 90.00% 5 Missing and 2 partials ⚠️
core/primitives/src/bandwidth_scheduler.rs 84.00% 4 Missing ⚠️
chain/rosetta-rpc/src/adapters/transactions.rs 0.00% 3 Missing ⚠️
tools/state-viewer/src/util.rs 40.00% 3 Missing ⚠️
core/primitives/src/types.rs 93.33% 2 Missing ⚠️
tools/state-viewer/src/apply_chunk.rs 86.66% 1 Missing and 1 partial ⚠️
chain/chain-primitives/src/error.rs 0.00% 1 Missing ⚠️
..._validation/chunk_validator/orphan_witness_pool.rs 50.00% 1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #12234      +/-   ##
==========================================
+ Coverage   71.55%   71.56%   +0.01%     
==========================================
  Files         836      838       +2     
  Lines      168170   168683     +513     
  Branches   168170   168683     +513     
==========================================
+ Hits       120335   120724     +389     
- Misses      42585    42705     +120     
- Partials     5250     5254       +4     
Flag Coverage Δ
backward-compatibility 0.16% <0.00%> (-0.01%) ⬇️
db-migration 0.16% <0.00%> (-0.01%) ⬇️
genesis-check 1.23% <0.00%> (-0.01%) ⬇️
integration-tests 38.93% <76.55%> (+0.09%) ⬆️
linux 71.10% <48.17%> (-0.10%) ⬇️
linux-nightly 71.13% <78.70%> (+<0.01%) ⬆️
macos 54.18% <45.29%> (-0.13%) ⬇️
pytests 1.55% <0.00%> (-0.01%) ⬇️
sanity-checks 1.35% <0.00%> (-0.01%) ⬇️
unittests 65.34% <82.97%> (+0.02%) ⬆️
upgradability 0.21% <0.00%> (-0.01%) ⬇️


… is enabled

All chunks produced in the protocol version where bandwidth scheduler is enabled
should use ShardChunkHeaderInner::V4; I missed this in the previous commit.

The test iterates over all items in the trie and creates a StateRecord for each of them.
The problem is that some types of trie entries don't have a corresponding StateRecord variant.
For example outgoing buffers, yield resume data, and bandwidth scheduler state can't be made
into a StateRecord.
The test started failing because it tries to unwrap the result of `StateRecord::from_raw_key_value`
for a trie entry that represents BandwidthSchedulerState. The function returns None and the
unwrap panics.
Fix the problem by removing the unwrap and instead looking for a `Some` value. The test only
looks for one type of StateRecord, so it doesn't matter if it skips over the scheduler state.
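A rough illustration of that fix, with hypothetical types standing in for the trie iterator and StateRecord (parse_record plays the role of StateRecord::from_raw_key_value):

```rust
// Hypothetical sketch of the fix: instead of unwrapping the parse result for
// every trie entry, skip entries (like BandwidthSchedulerState) that have no
// StateRecord variant.
#[derive(Debug)]
enum StateRecordLike {
    Account(String),
    // ...other variants omitted
}

fn parse_record(key: &[u8], _value: &[u8]) -> Option<StateRecordLike> {
    // Pretend only keys starting with 0x00 correspond to a StateRecord.
    if key.first() == Some(&0) {
        Some(StateRecordLike::Account(format!("{key:?}")))
    } else {
        None // e.g. bandwidth scheduler state, outgoing buffers, ...
    }
}

fn main() {
    let entries: Vec<(Vec<u8>, Vec<u8>)> =
        vec![(vec![0, 1], vec![]), (vec![15], vec![]), (vec![0, 2], vec![])];
    // filter_map drops the None entries instead of panicking on unwrap.
    let records: Vec<_> =
        entries.iter().filter_map(|(k, v)| parse_record(k, v)).collect();
    println!("{records:?}");
}
```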
@wacban (Contributor) left a comment

LGTM

Comment on lines 2137 to 2142
bandwidth_requests,
contract_accesses,
bandwidth_scheduler_state_hash: bandwidth_scheduler_output
.as_ref()
.map(|o| o.scheduler_state_hash)
.unwrap_or_default(),
Contributor

nit: fix the ordering

Contributor Author

I think it's fixed now, the master merge fiddled with it.

@@ -56,6 +56,7 @@ pub mod col {
/// backpressure on the receiving shard.
/// (`primitives::receipt::Receipt`).
pub const BUFFERED_RECEIPT: u8 = 14;
pub const BANDWIDTH_SCHEDULER_STATE: u8 = 15;
Contributor

Ah sorry, I mixed that up with nibbles. You're right, we should be just fine. That's awesome :)

@jancionear jancionear added this pull request to the merge queue Oct 23, 2024
ShardChunkHeader::V3(header) => header.inner.bandwidth_requests(),
}
}

/// Returns whether the header is valid for given `ProtocolVersion`.
pub fn valid_for(&self, version: ProtocolVersion) -> bool {
Contributor Author

Reminder to myself - make sure that we check this in witness validation. Didn't matter before, but might matter now.

@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 23, 2024
@jancionear jancionear enabled auto-merge October 24, 2024 15:12
@jancionear jancionear added this pull request to the merge queue Oct 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 24, 2024
@jancionear jancionear added this pull request to the merge queue Oct 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 24, 2024
@jancionear jancionear enabled auto-merge October 24, 2024 17:51
@jancionear jancionear added this pull request to the merge queue Oct 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 24, 2024
@Longarithm Longarithm added this pull request to the merge queue Oct 24, 2024
Merged via the queue into master with commit ab75966 Oct 24, 2024
29 checks passed
@Longarithm Longarithm deleted the bandsim-wires branch October 24, 2024 19:39
@jancionear (Contributor Author)

🎉

github-merge-queue bot pushed a commit that referenced this pull request Oct 28, 2024
…control (#12307)

During review of the bandwidth scheduler code, @wacban mentioned that he'd
prefer the header upgrade to be done the same way as it was done for
congestion control (ref: #12234 (comment)),
but I wasn't convinced that it's really cleaner.

In this PR I modified the header upgrade to work the same way as it does
in congestion control. We can compare the two approaches and choose the
better one.

The current approach looks like this:
* Before protocol upgrade to `BandwidthScheduler` version all chunks use
`InnerV3`, which doesn't have bandwidth requests in it.
* After the protocol version upgrade all newly produced chunks should
have `InnerV4`. Application of the last chunk of the previous protocol
version will produce a `ChunkExtra` which doesn't have bandwidth
requests (they are set to `None`), so the bandwidth requests in
`InnerV4` of the first chunk are set to the default value. Bandwidth
requests in `InnerV4` are not an `Option`, so we can't set them to
`None`.
* After the first chunk all produced `ChunkExtras` will have
`bandwidth_requests` set to `Some`, and they'll be put inside `InnerV4`
* `validate_chunk_with_chunk_extra_and_receipts_root` needs to be aware
of what happens at the first block and allow situations where the
bandwidth requests in `ChunkExtra` are `None`, but they're
`Some(Default::default())` in the chunk header (see the sketch after this list).
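A hedged sketch of that corner case, with a made-up BandwidthRequestsLike type standing in for the real struct (this is not the actual nearcore validation code):

```rust
// Made-up stand-in for the real bandwidth requests struct.
#[derive(Clone, Debug, Default, PartialEq)]
struct BandwidthRequestsLike {
    requests: Vec<u64>,
}

// Sketch of the comparison validate_chunk_with_chunk_extra_and_receipts_root
// has to allow in the current approach: right after the upgrade the
// ChunkExtra still has no bandwidth requests (None), while the freshly
// produced InnerV4 header reports Some(Default::default()).
fn bandwidth_requests_match(
    extra: Option<&BandwidthRequestsLike>,
    header: Option<&BandwidthRequestsLike>,
) -> bool {
    match (extra, header) {
        // Normal case: both sides present and equal.
        (Some(e), Some(h)) => e == h,
        // First chunk after the upgrade: extra is None, header holds the default.
        (None, Some(h)) => *h == BandwidthRequestsLike::default(),
        // Pre-upgrade headers (InnerV3) carry no requests at all.
        (None, None) => true,
        (Some(_), None) => false,
    }
}

fn main() {
    assert!(bandwidth_requests_match(None, Some(&BandwidthRequestsLike::default())));
    assert!(!bandwidth_requests_match(None, Some(&BandwidthRequestsLike { requests: vec![1] })));
}
```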

The congestion-control-like approach looks like this:
* Before protocol upgrade to `BandwidthScheduler` version all chunks use
`InnerV3`, which doesn't have bandwidth requests in it.
* The first chunk after the upgrade will still use `InnerV3` because the
`bandwidth_requests` in `ChunkExtra` are None.
* For future chunks the `bandwidth_requests` in `ChunkExtra` will be
`Some` and all chunks will use `InnerV4` with the requests.
* `validate_chunk_with_chunk_extra_and_receipts_root` can do a direct
comparison between the bandwidth requests in chunk extra and chunk
header.


In the current approach I like the exactness - all chunk headers
produced with the new protocol version have a new version of `Inner`. We
don't allow multiple header versions in one protocol version. The only
problem is that we need to have the special corner-case check in
`validate_chunk_with_chunk_extra_and_receipts_root`. I think we also
need to make sure that we validate the header versions for endorsed
chunks, using `is_valid_for` or something like that.

The congestion-control-like approach doesn't have the weird corner case,
which is nice. It also doesn't require such strict validation of the header
version - headers with the wrong version will get rejected by chunk extra
validation because of the None/Some difference. But it's much less exact. We
allow multiple inner versions for one protocol version, and I find that
much harder to reason about. I'm not sure what happens with genesis
chunks; it looks like we set the congestion infos to None, but that
means that genesis chunks would always have `InnerV2`, which would get
upgraded to `Inner<latest>` on the first chunk, which is weird. I changed
them to `Some(CongestionInfo::default())`; I think that makes things a bit
better, as now the chain starts with the current version of `Inner`.

Yet another approach would be to make bandwidth requests an `Option` in
`InnerV4`. They would be `None` on the first chunk and `Some` on the
next chunks. We could directly compare that with the requests in
`ChunkExtra`. But it's a bit sad that we'd have an `Option` for
something that's supposed to always be there :/