Benchmark and optimize analyze call #2033

Closed
2 tasks done
webmaster128 opened this issue Feb 27, 2024 · 3 comments · Fixed by #2051

Comments

webmaster128 (Member) commented Feb 27, 2024

During manual testing I realized that analyze can be a bit slow, depending on the contract. This might be due to a heavy ParsedWasm::parse or maybe even due to loading the whole Wasm from disk at once.

In CosmWasm/wasmd#1813 we start exploring first steps towards mass migrations. In order to ensure MsgMigrateContract is performant (dozens, maybe hundreds of migrations in a block), we also need to ensure that the Analyze call is fast.

  • Create benchmark for analyze for different contracts
  • Evaluate and implement performance optimizations
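
For illustration, a minimal Criterion sketch of what such a benchmark could look like; the `analyze_contract` helper and the testdata paths are placeholders for the real entry point (something like Cache::analyze in cosmwasm-vm), and the group/function names mirror the output reported later in this issue:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Placeholder for the real entry point (e.g. Cache::analyze in cosmwasm-vm).
fn analyze_contract(wasm: &[u8]) {
    black_box(wasm);
}

fn bench_analyze(c: &mut Criterion) {
    // Group/function names follow the "Cache/analyze_<contract>.wasm" pattern.
    let mut group = c.benchmark_group("Cache");
    for name in ["hackatom.wasm", "cyberpunk.wasm", "floaty.wasm"] {
        let wasm = std::fs::read(format!("testdata/{name}")).expect("read contract");
        group.bench_function(format!("analyze_{name}"), |b| {
            b.iter(|| analyze_contract(black_box(&wasm)))
        });
    }
    group.finish();
}

criterion_group!(benches, bench_analyze);
criterion_main!(benches);
```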
aumetra (Member) commented Mar 12, 2024

After adding some more of the testdata contracts to the benchmark harness and running it through cargo-flamegraph, it seems like it spends most of its time inside wasmparser::validator::func::FuncValidator.

In there it steps through the function body byte-by-byte and validates that it's well-formed. Makes sense that we have that step. Since that is part of Bytecode Alliance's wasmparser, I'm not sure how much we could actually optimize this on our side.
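
For context, full validation with recent wasmparser versions looks roughly like the sketch below (a sketch only, API details vary between wasmparser releases); the per-body validate call is the hot path the flamegraph points at:

```rust
use wasmparser::{Parser, ValidPayload, Validator};

// Sketch of full-module validation, roughly as wasmparser's own examples do it.
fn validate_module(wasm: &[u8]) -> wasmparser::Result<()> {
    let mut validator = Validator::new();
    for payload in Parser::new(0).parse_all(wasm) {
        // Section-level checks are cheap; the expensive part is below.
        if let ValidPayload::Func(func, body) = validator.payload(&payload?)? {
            // Hot path from the flamegraph: the function body is decoded and
            // type-checked operator by operator.
            let mut func_validator = func.into_validator(Default::default());
            func_validator.validate(&body)?;
        }
    }
    Ok(())
}
```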


Some bench examples when removing the validation of the function body:

Cache/analyze_cyberpunk.wasm
                        time:   [168.58 µs 168.66 µs 168.80 µs]
                        change: [-83.658% -83.618% -83.580%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 12 measurements (8.33%)
  1 (8.33%) high mild
Cache/analyze_cyberpunk_rust170.wasm
                        time:   [157.20 µs 157.61 µs 158.03 µs]
                        change: [-79.398% -79.357% -79.303%] (p = 0.00 < 0.05)
                        Performance has improved.
Cache/analyze_floaty.wasm
                        time:   [117.18 µs 117.26 µs 117.37 µs]
                        change: [-82.821% -82.791% -82.763%] (p = 0.00 < 0.05)
                        Performance has improved.
Cache/analyze_hackatom.wasm
                        time:   [124.91 µs 125.00 µs 125.11 µs]
                        change: [-83.251% -83.200% -83.159%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 12 measurements (8.33%)
  1 (8.33%) high mild

As apparent from the benches, we see an ~80% improvement in performance across the board and are comfortably entering the low three-digit microsecond territory.

webmaster128 (Member, Author) commented:

> In there it steps through the function body byte-by-byte and validates that it's well-formed. Makes sense that we have that step. Since that is part of Bytecode Alliance's wasmparser, I'm not sure how much we could actually optimize this on our side.

Something we can do to improve this is to pull the function validation out into a separate step (a rough sketch follows the list). Then:

  1. In fn check_wasm we stop execution early on error (if any) and only perform the function validation if everything else is ok
  2. In fn analyze we don't need to do the function validation at all, since the Wasm file was checked before and we don't use that extra data
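
A rough sketch of that split, assuming recent wasmparser APIs; the parse / validate_funcs names and the DeferredFuncs struct are illustrative, not the actual cosmwasm-vm code:

```rust
use wasmparser::{
    FuncToValidate, FuncValidatorAllocations, FunctionBody, Parser, ValidPayload, Validator,
    ValidatorResources,
};

// Illustrative only: module-level validation that merely collects the function
// bodies, plus a separate step that validates them. check_wasm would run both
// steps; analyze would run only the first.
pub struct DeferredFuncs<'a> {
    funcs: Vec<(FuncToValidate<ValidatorResources>, FunctionBody<'a>)>,
}

pub fn parse(wasm: &[u8]) -> wasmparser::Result<DeferredFuncs<'_>> {
    let mut validator = Validator::new();
    let mut funcs = Vec::new();
    for payload in Parser::new(0).parse_all(wasm) {
        if let ValidPayload::Func(func, body) = validator.payload(&payload?)? {
            // Defer the expensive per-operator validation.
            funcs.push((func, body));
        }
    }
    Ok(DeferredFuncs { funcs })
}

pub fn validate_funcs(deferred: DeferredFuncs<'_>) -> wasmparser::Result<()> {
    // Reuse the validator's scratch allocations across function bodies.
    let mut allocs = FuncValidatorAllocations::default();
    for (func, body) in deferred.funcs {
        let mut fv = func.into_validator(allocs);
        fv.validate(&body)?;
        allocs = fv.into_allocations();
    }
    Ok(())
}
```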

aumetra (Member) commented Mar 12, 2024

Makes sense, will look into that.

I also managed to get some performance improvements (averaging ~22%) by parallelising the function body validation (without reusing the allocations from the validator, since that would add locking overhead).
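
A minimal sketch of that kind of parallel validation, assuming rayon and the deferred-validation split sketched above; each task builds its own FuncValidatorAllocations instead of sharing a pool behind a lock:

```rust
use rayon::prelude::*;
use wasmparser::{FuncToValidate, FunctionBody, ValidatorResources};

// Validate the collected function bodies in parallel. Every task creates its
// own FuncValidatorAllocations (via Default) rather than sharing one, trading
// some extra allocations for lock-free parallelism.
pub fn validate_funcs_parallel(
    funcs: Vec<(FuncToValidate<ValidatorResources>, FunctionBody<'_>)>,
) -> wasmparser::Result<()> {
    funcs.into_par_iter().try_for_each(|(func, body)| {
        let mut fv = func.into_validator(Default::default());
        fv.validate(&body)
    })
}
```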
