Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(config): add config for dynamic gas_limit adjustment #11863

Closed
wants to merge 14 commits into from

Conversation

mooori
Copy link
Contributor

@mooori mooori commented Jul 31, 2024

Adds the configuration for dynamic gas limit adjustments. See #11460 and this Zulip thread for reference.

Overview

  • No behavior changes. This PR just adds Option<GasLimitAdjustmentConfig> to RuntimeConfig. By default this is None, which is consistent with previous behavior.
  • Gas limit adjustments at runtime are intended for tests and benchmarks only. To enable it, construct a RuntimeConfig with Some(GasLimitAdjustmentConfig).
  • The logic which actually uses this config will be added in a separate PR. I hope splitting it up like this makes reviews easier.
  • This diff shows a draft of how gas limit adjustments will be implemented. The config parameters added in the current PR are constants in gas_limit_adjustment.rs in the diff.

Possible review approach

  • Look at GasLimitAdjustmentConfig and how it's integrated into RuntimeConfig.
  • Look at the test. It runs a test loop for a while and verifies gas limits are not changed.
  • Look at the rest, which is basically just plumbing to grab the environment's RuntimeConfig and pass its gas_limit_adjustment_config through.

Copy link

codecov bot commented Jul 31, 2024

Codecov Report

Attention: Patch coverage is 76.31579% with 9 lines in your changes missing coverage. Please review.

Project coverage is 71.48%. Comparing base (309cbb7) to head (52c4682).

Files Patch % Lines
tools/replay-archive/src/cli.rs 0.00% 3 Missing ⚠️
chain/chain/src/validate.rs 85.71% 2 Missing ⚠️
chain/chain/src/chain.rs 66.66% 0 Missing and 1 partial ⚠️
...chain/src/stateless_validation/chunk_validation.rs 66.66% 0 Missing and 1 partial ⚠️
...nt/src/stateless_validation/chunk_validator/mod.rs 75.00% 0 Missing and 1 partial ⚠️
...me-params-estimator/src/costs_to_runtime_config.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #11863      +/-   ##
==========================================
- Coverage   71.49%   71.48%   -0.02%     
==========================================
  Files         812      812              
  Lines      163652   163687      +35     
  Branches   163652   163687      +35     
==========================================
+ Hits       117010   117017       +7     
- Misses      41606    41625      +19     
- Partials     5036     5045       +9     
Flag Coverage Δ
backward-compatibility 0.23% <0.00%> (-0.01%) ⬇️
db-migration 0.23% <0.00%> (-0.01%) ⬇️
genesis-check 1.34% <0.00%> (-0.01%) ⬇️
integration-tests 38.38% <60.52%> (+<0.01%) ⬆️
linux 71.25% <76.31%> (+<0.01%) ⬆️
linux-nightly 71.04% <76.31%> (-0.02%) ⬇️
macos 54.36% <72.97%> (+0.01%) ⬆️
pytests 1.60% <0.00%> (-0.01%) ⬇️
sanity-checks 1.40% <0.00%> (-0.01%) ⬇️
unittests 65.76% <71.05%> (-0.01%) ⬇️
upgradability 0.28% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mooori mooori changed the title feat(benchmarknet): add config for dynamic gas_limit adjustment feat(config): add config for dynamic gas_limit adjustment Aug 14, 2024
@mooori mooori marked this pull request as ready for review August 14, 2024 14:08
@mooori mooori requested a review from a team as a code owner August 14, 2024 14:08
/// Override for `GAS_LIMIT_ADJUSTMENT_FACTOR`.
///
/// The default adjustment factor might not be effective enough.
pub adjustment_factor_override: u64,
Copy link
Contributor

@aborg-dev aborg-dev Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need an "_override" suffix here, it doesn't bring any more clarity and is inconsistent with other fields we use to override config values that have defaults. I think this name would make more sense if this field was of type Option and it was applied conditionally.

/// The default adjustment factor might not be effective enough.
pub adjustment_factor_override: u64,
/// The maximum chunk apply time considered sustainable, in milliseconds.
pub max_chunk_apply_time: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and 2 other time values below are unlikely to become the subject to consensus rules as they are non-deterministic and hard to enforce. So they could possibly be a part of the node config - e.g. it makes sense for them to be changed during operation without rebuilding the neard.
I do realize though that this could be tricky to wire these values from there to the place where they are used. So unless @Longarithm and @robin-near have any suggestions, we can leave them here and add a comment about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern about moving some parameters into node config would be that IMO with configuration split across RuntimeConfig and ClientConfig it becomes significantly harder to reason about and maintain dynamic gas limit adjustments. Anyway, this draft gives an idea of the wiring needed to put it into ClientConfig.

For now this feature is supposed to be used only in tests and benchmarks:

  • Testing usually involves compilation, so having everything in RuntimeConfig might be fine for this use case.
  • I agree that moving some of the parameters into node config could facilitate running benchmarks. Though for a given setup I would expect the time parameters to be relatively constant. Then we might be fine with having them in RuntimeConfig and ocassionally adjust them in the benchmark setup, e.g. here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime config is a great place for adjustment_factor_override; however, _time parameters indeed doesn't define consensus rules. I think there is a compromise. NightshadeRuntime should be stateless, so introducing set of new parameters is messy. But we can move these parameters to Chain. The sequence of passing parameters could look as follows:

RuntimeConfig will have just a single adjustment_factor_override: u64
GasLimitAdjustmentConfig will have time-based parameters

  • Client will take GasLimitAdjustmentConfig from ClientConfig
  • -> ChainConfig in Chain::new
  • -> Chain::gas_limit_adjustment_config field
  • -> process_shard_update with new argument gas_limit_adjustment_config. That's probably fine because only the job created there will know the apply chunk time, based on which adjustment will happen
  • -> apply_new_chunk with new argument
  • -> NewChunkResult

I don't see it as completely necessary, because this is benchmark-only. However, doing it in client config may allow to use it on real situation if it ever comes to that; every node could configure its own adjustment parameters.

Copy link
Contributor

@aborg-dev aborg-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, but we need a review from @Longarithm and @robin-near here.

Comment on lines +190 to +192
let adjustment_factor = gas_limit_adjustment_factor(gas_limit_adjustment_config);
if chunk_header.gas_limit() < gas_limit - gas_limit / adjustment_factor
|| chunk_header.gas_limit() > gas_limit + gas_limit / adjustment_factor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the right place to perform this check. The chunk header should contain info from the prev chunk extra copied byte by byte. The adjustment should be deterministically computed as part of chunk application (be it in runtime or chain). An unchecked advise from me would be for you to grep for usages of ChunkExtra::new and/or ApplyChunkResult. You'll need to make the new gas limit part of the Result and propagate it into the Extra.

  • In the PR title you say that you want the adjustment to be dynamic. This means that you want to allow the chain to automatically adjust the gas limit based on some live data like gas usage.
  • In the PR summary you say that you need it for testing and benchmarking. In this case it seems like it should be sufficient to just have a static config for this value. This would greatly simplify the logic and you wouldn't need to go into chain, validation, etc.

Which of the above are you aiming for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the right place to perform this check.

Can this be considered a preexisting issue? Since that's where the check currently is. This PR doesn't add or modify the check, it "just" adds configuration which allows overriding the hardcoded GAS_LIMIT_ADJUSTMENT_FACTOR.

Which of the above are you aiming for?

My understanding is that it's both. For now the main use-case is starting a node for a continuous benchmark and let it automatically reach the highest gas limit that it can handle. Since one of the aims of these benchmarks is to uncover what could be possible in terms of transactions per second. More details are given in #11460.

for testing and benchmarking. In this case it seems like it should be sufficient to just have a static config for this value.

Yes, this can be done by modifying gas_limit in genesis.json or setting another INITIAL_GAS_LIMIT. Though it's tedious to determine the right value. Also this value depends on the hardware that is used and how nearcore performance characteristics change. Hence the dynamic gas limit adjustments which aim to make the node determine the "ideal" gas limit automatically.

An unchecked advise from me would be for you to grep for usages of ChunkExtra::new and/or ApplyChunkResult. You'll need to make the new gas limit part of the Result and propagate it into the Extra.

Thanks! I'll look into it as I'm trying to find out if there is another way to modify the gas limit which will not make this test fail:

if prev_chunk_extra.gas_used() != chunk_header.prev_gas_used() {
return Err(Error::InvalidGasUsed);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wacban this looks like the right place to me. Unlike congestion control, this adjustment is meant to be subjective for every node, that's even its original intent. So we reach both goals here.

The reason is - if gas limit becomes not enough, nodes can nudge it to the more convenient value. If nodes do that, it's impossible to record in chunk extra for prev chunk, final gas limit can be known only when chunk is received. Exact gas limit changes were never coded before, though. Thus #11958 is correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex!

Yeah I stand corrected it may be ok to do it here. I suppose the idea is to leave it to the chunk producer to adjust the limit as they see fit. Given this is only meant for testing it seems fine.

My only concern with regard to this implementation is that it will introduce divergence between the info in ChunkExtra and the ChunkHeader of the next block. I would suggest going through the usages of the gas limit in the codebase and make sure it's always obtained from the header and not from the extra. Alternatively you can consider updating the extra after the header is accepted.

TBH I'm still on the edge about making this purely testing change a part of the protocol - even if disabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Longarithm, @wacban thanks for the suggestions! I can look into it when I'm back next week.

TBH I'm still on the edge about making this purely testing change a part of the protocol - even if disabled.

What are the next steps in making that decision? As I should avoid working on something that would be obsolete.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mooori Can you check if there are any sensible out of protocol alternatives? Perhaps the same goal can be achieved by the testing framework or by controlling the traffic?

@mooori
Copy link
Contributor Author

mooori commented Aug 26, 2024

Following discussion with @akhi3030, we are putting the project of dynamically adjusting the gas_limit on hold: The review has shown that landing this feature requires far more work/changes than anticipated. Given that (for now) it would be used only in benchmarks, other projects that remove bottlenecks and improve performance have currently higher priority.

Thanks to the reviewers for their feedback!

@mooori mooori closed this Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants