Refactoring interpreter and paranoid mode, introducing traits to allo… #15350

Merged

@ziaptos ziaptos commented Nov 21, 2024

Description

Refactors the interpreter and paranoid mode. Introduces traits that allow a generic interpreter, as well as a compile-time switch for the runtime type checks (formerly called paranoid mode).
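
To illustrate the idea of a trait acting as a compile-time switch, here is a minimal sketch. It is not the code from this PR: `Ty`, `Instr`, `RuntimeTypeCheck`, `FullRuntimeTypeCheck`, and `run` are hypothetical names for a toy stack machine, and only `NoRuntimeTypeCheck` echoes a name suggested in review. The interpreter loop is generic over the checker, so the no-op implementation can be compiled away entirely.

```rust
/// Hypothetical, heavily simplified type tags; the real VM types are far richer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    U64,
    Bool,
}

/// Hypothetical toy instruction set, for illustration only.
#[derive(Debug, Clone, Copy)]
pub enum Instr {
    PushU64(u64),
    PushBool(bool),
    Add,
}

/// The compile-time switch: each implementation decides whether checks run.
pub trait RuntimeTypeCheck {
    fn push_ty(ty_stack: &mut Vec<Ty>, ty: Ty);
    fn check_add(ty_stack: &mut Vec<Ty>) -> Result<(), String>;
}

/// Performs no checks; every method is an empty, inlinable no-op.
pub struct NoRuntimeTypeCheck;

/// Maintains a type stack and validates operands (the "paranoid" behavior).
pub struct FullRuntimeTypeCheck;

impl RuntimeTypeCheck for NoRuntimeTypeCheck {
    #[inline(always)]
    fn push_ty(_ty_stack: &mut Vec<Ty>, _ty: Ty) {}

    #[inline(always)]
    fn check_add(_ty_stack: &mut Vec<Ty>) -> Result<(), String> {
        Ok(())
    }
}

impl RuntimeTypeCheck for FullRuntimeTypeCheck {
    fn push_ty(ty_stack: &mut Vec<Ty>, ty: Ty) {
        ty_stack.push(ty);
    }

    fn check_add(ty_stack: &mut Vec<Ty>) -> Result<(), String> {
        let rhs = ty_stack.pop();
        let lhs = ty_stack.pop();
        match (lhs, rhs) {
            (Some(Ty::U64), Some(Ty::U64)) => {
                ty_stack.push(Ty::U64);
                Ok(())
            }
            other => Err(format!("Add expects two u64 operands, got {:?}", other)),
        }
    }
}

/// Interpreter loop generic over the checker; the high-level logic is shared.
pub fn run<C: RuntimeTypeCheck>(code: &[Instr]) -> Result<Vec<u64>, String> {
    let mut values: Vec<u64> = Vec::new();
    let mut tys: Vec<Ty> = Vec::new();
    for instr in code {
        match *instr {
            Instr::PushU64(v) => {
                values.push(v);
                C::push_ty(&mut tys, Ty::U64);
            }
            Instr::PushBool(b) => {
                values.push(b as u64);
                C::push_ty(&mut tys, Ty::Bool);
            }
            Instr::Add => {
                C::check_add(&mut tys)?;
                let rhs = values.pop().ok_or("value stack underflow")?;
                let lhs = values.pop().ok_or("value stack underflow")?;
                values.push(lhs.wrapping_add(rhs));
            }
        }
    }
    Ok(values)
}
```

In this sketch, `run::<FullRuntimeTypeCheck>(..)` rejects an `Add` over a `u64` and a `bool`, while `run::<NoRuntimeTypeCheck>(..)` executes the same instructions without maintaining a type stack. Choosing the checker through a type parameter, rather than a runtime flag, is one way to get such a switch with no per-instruction branching.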

How Has This Been Tested?

Existing tests

Key Areas to Review

Reviewers should focus on confirming that the high-level logic has not changed.

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

trunk-io bot commented Nov 21, 2024

⏱️ 1h 49m total CI duration on this PR
Job                       Cumulative Duration   Recent Runs
forge-e2e-test / forge    14m                   🟩
rust-move-tests           13m                   🟩
rust-cargo-deny           13m                   🟩🟩🟩🟩🟩 (+2 more)
rust-move-tests           13m                   🟩
rust-move-tests           13m                   🟩
rust-move-tests           12m                   🟩
rust-move-tests           10m
check-dynamic-deps        8m                    🟩🟩🟩🟩🟩 (+2 more)
general-lints             3m                    🟩🟩🟩🟩🟩 (+2 more)
semgrep/ci                3m                    🟩🟩🟩🟩🟩 (+2 more)
rust-move-tests           3m                    🟥
rust-move-tests           3m                    🟥
file_change_determinator  1m                    🟩🟩🟩🟩🟩 (+2 more)
permission-check          21s                   🟩🟩🟩🟩🟩 (+2 more)
permission-check          21s                   🟩🟩🟩🟩🟩 (+2 more)


third_party/move/move-vm/runtime/src/tracing.rs (outdated, resolved)
third_party/move/move-vm/runtime/src/frame_type_cache.rs (outdated, resolved)
/// ((Type of the field, size of the field type) and (Type of its defining struct, size of its defining struct)
field_instantiation:
BTreeMap<FieldInstantiationIndex, ((Type, NumTypeNodes), (Type, NumTypeNodes))>,
/// Same as above, bot for variant field instantiations

typo: bot --> but?

operand_stack.push_ty(output_ty)
}

pub(crate) struct NullRuntimeTypeCheck;

I see some C++ influence here, haha. Maybe `NoRuntimeTypeCheck`?


/// Paranoid type checks to perform after instruction execution.
///
/// This function and `pre_execution_type_stack_transition` should constitute the full type stack transition for the paranoid mode.

I think lines are > 100 chars again?

| Bytecode::Call(_)
| Bytecode::CallGeneric(_)
| Bytecode::Abort => {
// Invariants hold because all of the instructions above will force VM to break from the interpreter loop and thus not hit this code path.

Again, please split this comment across multiple lines.

) -> PartialVMResult<()>;
}

/// Paranoid type checks to perform before instruction execution.

If this is meant for the module, you can put `//! Your description here` at the top of the file. This also seems like good documentation to add to the trait definition.
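
As a small illustration of the two doc-comment forms mentioned here: in Rust, `//!` documents the enclosing item (a module or file) and `///` documents the item that follows. The sketch below uses an inline module so that it stands alone; the module name `runtime_type_checks`, the trait `TypeStackTransition`, its simplified signature, and `DepthCheck` are all hypothetical and are not taken from the PR.

```rust
pub mod runtime_type_checks {
    //! Hypothetical module-level documentation, written with `//!`.
    //!
    //! In a standalone file these lines would sit at the very top of the file
    //! and describe the module as a whole.

    /// Type checks around a single instruction (hypothetical, simplified).
    ///
    /// Written with `///`, so this text documents the trait itself.
    pub trait TypeStackTransition {
        /// Type checks to perform before instruction execution.
        fn pre_execution(stack_depth: usize, operands_needed: usize) -> Result<(), String>;
    }

    /// Toy implementation that only verifies the operand count.
    pub struct DepthCheck;

    impl TypeStackTransition for DepthCheck {
        fn pre_execution(stack_depth: usize, operands_needed: usize) -> Result<(), String> {
            if stack_depth >= operands_needed {
                Ok(())
            } else {
                Err(format!(
                    "need {} operands, but the stack holds {}",
                    operands_needed, stack_depth
                ))
            }
        }
    }
}
```

Both forms are rendered by `cargo doc`; the only difference is which item the text attaches to.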

.into_iter()
.zip(field_tys)
{
// Fields ability should be a subset of the struct ability because abilities can be weakened but not the other direction.

Lines too long

@georgemitenkov georgemitenkov left a comment

LGTM, thanks! Some minor comments from previous review are still pending, but those we can address as a follow-up if you prefer.

@vgao1996 vgao1996 left a comment

LGTM, a step in the right direction.

I'm aware that the PR mostly moves things around, but given the amount of critical code you've touched, I'd recommend that we be cautious and run replay before landing the PR.

@ziaptos ziaptos added the following labels on Nov 28, 2024:
  • CICD:test-replay (Trigger a testnet replay-verify job for this PR)
  • CICD:run-e2e-tests (when this label is present github actions will run all land-blocking e2e tests from the PR)

@ziaptos ziaptos enabled auto-merge (squash) November 28, 2024 22:27

✅ Forge suite realistic_env_max_load success on 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78

two traffics test: inner traffic : committed: 14165.81 txn/s, latency: 2806.10 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 5100 ms), latency samples: 5386160
two traffics test : committed: 99.96 txn/s, latency: 1980.40 ms, (p50: 1600 ms, p70: 2000, p90: 2200 ms, p99: 10500 ms), latency samples: 1780
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.192, avg: 1.248", "ConsensusProposalToOrdered: max: 0.318, avg: 0.291", "ConsensusOrderedToCommit: max: 0.378, avg: 0.369", "ConsensusProposalToCommit: max: 0.672, avg: 0.660"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.81s no progress at version 2255917 (avg 0.20s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 15.61s no progress at version 2254564 (avg 15.61s) [limit 16].
Test Ok

✅ Forge suite framework_upgrade success on 010570d3b7aa20889fb5ad0e5b23800aa33f5634 ==> 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78

Compatibility test results for 010570d3b7aa20889fb5ad0e5b23800aa33f5634 ==> 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78 (PR)
Upgrade the nodes to version: 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1431.33 txn/s, submitted: 1435.51 txn/s, failed submission: 3.97 txn/s, expired: 4.18 txn/s, latency: 2134.32 ms, (p50: 1800 ms, p70: 2100, p90: 3400 ms, p99: 4800 ms), latency samples: 129861
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1232.73 txn/s, submitted: 1236.25 txn/s, failed submission: 3.52 txn/s, expired: 3.52 txn/s, latency: 2290.87 ms, (p50: 2100 ms, p70: 2400, p90: 3600 ms, p99: 5600 ms), latency samples: 112020
5. check swarm health
Compatibility test for 010570d3b7aa20889fb5ad0e5b23800aa33f5634 ==> 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78 passed
Upgrade the remaining nodes to version: 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 1552.60 txn/s, submitted: 1556.16 txn/s, failed submission: 3.57 txn/s, expired: 3.57 txn/s, latency: 2046.93 ms, (p50: 2100 ms, p70: 2100, p90: 2700 ms, p99: 4200 ms), latency samples: 130640
Test Ok

✅ Forge suite compat success on 010570d3b7aa20889fb5ad0e5b23800aa33f5634 ==> 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78

Compatibility test results for 010570d3b7aa20889fb5ad0e5b23800aa33f5634 ==> 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78 (PR)
1. Check liveness of validators at old version: 010570d3b7aa20889fb5ad0e5b23800aa33f5634
compatibility::simple-validator-upgrade::liveness-check : committed: 13767.94 txn/s, latency: 2101.56 ms, (p50: 1900 ms, p70: 2100, p90: 2700 ms, p99: 5100 ms), latency samples: 529220
2. Upgrading first Validator to new version: 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78
compatibility::simple-validator-upgrade::single-validator-upgrading : committed: 7084.11 txn/s, latency: 3907.77 ms, (p50: 4500 ms, p70: 4600, p90: 5100 ms, p99: 5300 ms), latency samples: 127560
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 7391.83 txn/s, latency: 4368.20 ms, (p50: 4600 ms, p70: 4700, p90: 6200 ms, p99: 6700 ms), latency samples: 243220
3. Upgrading rest of first batch to new version: 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78
compatibility::simple-validator-upgrade::half-validator-upgrading : committed: 7354.41 txn/s, latency: 3884.66 ms, (p50: 4400 ms, p70: 4600, p90: 4700 ms, p99: 4800 ms), latency samples: 137900
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 7539.91 txn/s, latency: 4304.26 ms, (p50: 4700 ms, p70: 4800, p90: 4800 ms, p99: 4900 ms), latency samples: 247920
4. upgrading second batch to new version: 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78
compatibility::simple-validator-upgrade::rest-validator-upgrading : committed: 10278.14 txn/s, latency: 2671.24 ms, (p50: 2200 ms, p70: 3200, p90: 4300 ms, p99: 5800 ms), latency samples: 191280
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 10447.37 txn/s, latency: 2942.26 ms, (p50: 2100 ms, p70: 3100, p90: 5600 ms, p99: 8200 ms), latency samples: 341800
5. check swarm health
Compatibility test for 010570d3b7aa20889fb5ad0e5b23800aa33f5634 ==> 8b1d4bea01ca0c9a2b4ecff116e91ac271fdca78 passed
Test Ok

@ziaptos ziaptos merged commit 46bd0e9 into main Nov 28, 2024
191 checks passed
@ziaptos ziaptos deleted the zi/new-type/refactoring-paranoid-mode-to-generic-rtt-checks branch November 28, 2024 22:58
Labels
  • CICD:run-e2e-tests (when this label is present github actions will run all land-blocking e2e tests from the PR)
  • CICD:test-replay (Trigger a testnet replay-verify job for this PR)
3 participants