
KCL: New simulation test pipeline #4351

Merged: 1 commit merged into main from achalmers/detsim on Oct 30, 2024

Conversation

@adamchalmers (Collaborator) commented Oct 29, 2024

The idea behind this is to test all the various stages of executing KCL separately, i.e.

  • Start with a program
  • Tokenize it
  • Parse those tokens into an AST
  • Recast the AST
  • Execute the AST, outputting
    • a PNG of the rendered model
    • serialized program memory

Each of these steps reads some input and writes some output to disk. The output of one step becomes the input to the next step. These intermediate artifacts are also snapshotted (in the style of expectorate or 2020) to ensure we're aware of any changes to how KCL works. A change could be a bug, or it could be harmless or deliberate, but keeping the snapshots checked into the repo means we can easily track changes.

If any step fails, we snapshot the error, and we don't run the subsequent steps in the pipeline. Each of these steps can be run in parallel (we still limit the number of engine connections that unit tests will run concurrently, but at least all the parser/tokenizer/recaster tests can be parallelized).
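Here's a minimal, self-contained sketch of that shape. The stage type, helper name, and file names are illustrative placeholders, not the actual harness code:

```rust
use std::{fs, io, path::Path};

// Each stage takes the previous stage's output as text and either produces
// new output or fails with an error message.
type Stage = fn(&str) -> Result<String, String>;

fn run_pipeline(dir: &Path, stages: &[(&str, Stage)]) -> io::Result<()> {
    // The first input is the KCL program itself.
    let mut input = fs::read_to_string(dir.join("input.kcl"))?;
    for (name, stage) in stages {
        match stage(&input) {
            Ok(output) => {
                // Snapshot this stage's output; it becomes the next stage's input.
                fs::write(dir.join(format!("{name}.snap")), &output)?;
                input = output;
            }
            Err(err) => {
                // Snapshot the error and skip every subsequent stage.
                fs::write(dir.join(format!("{name}.error.snap")), err)?;
                break;
            }
        }
    }
    Ok(())
}
```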

Note: UUIDs sent back by the engine are currently nondeterministic, so they would break all the snapshot tests. To work around this, the snapshots use a regex filter that replaces anything that looks like a UUID with [uuid] when writing program memory to a snapshot.
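For reference, here's roughly how that redaction can be expressed with `insta`'s regex filters (this assumes the crate's `filters` and `json` features; the helper name and regex are illustrative):

```rust
// Hypothetical helper: snapshot serialized program memory with every
// UUID-shaped string replaced by "[uuid]".
fn snapshot_program_memory(memory: &serde_json::Value) {
    let mut settings = insta::Settings::clone_current();
    settings.add_filter(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}",
        "[uuid]",
    );
    settings.bind(|| {
        insta::assert_json_snapshot!(memory);
    });
}
```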

We run this pipeline on many different KCL programs. Each test gets its own directory, which keeps its inputs (the KCL program), outputs (PNG, program memory snapshot), and intermediate artifacts (AST, token lists, etc.).

I also added a new `just` command to easily generate these tests. You can run `just new-sim-test gear $(cat gear.kcl)` to set up a new gear test directory and generate all the intermediate artifacts for the first time. This doesn't need any macros; it just appends some new lines of normal Rust source code to `tests.rs`, so it's easy to see exactly what the code is doing.
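For a sense of the shape, here's a hypothetical example of the kind of block such a generator could append for a `gear` test. It assumes per-stage helper functions (`tokenize`, `parse`, `unparse`, `execute`) defined elsewhere in the test module; the real generated code may differ:

```rust
mod gear {
    const TEST_NAME: &str = "gear";

    #[test]
    fn tokenize() {
        super::tokenize(TEST_NAME)
    }

    #[test]
    fn parse() {
        super::parse(TEST_NAME)
    }

    #[test]
    fn unparse() {
        super::unparse(TEST_NAME)
    }

    // Execution talks to the engine, so it runs as an async test.
    #[tokio::test(flavor = "multi_thread")]
    async fn execute() {
        super::execute(TEST_NAME).await
    }
}
```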

This uses `insta` for convenient snapshot testing of artifacts as JSON, and our very own 2020 for snapshotting PNGs.

This was heavily inspired by @obi1kenobi's talk at EuroRust 2024 about deterministic simulation testing and how it can reduce both bugs and testing/CI time. Very grateful to him for chatting with me about this over the last couple of weeks.




codecov bot commented Oct 29, 2024

Codecov Report

Attention: Patch coverage is 90.76923% with 12 lines in your changes missing coverage. Please review.

Project coverage is 85.92%. Comparing base (0d52851) to head (e418add).
Report is 1 commit behind head on main.

Files with missing lines                   Patch %   Missing lines
src/wasm-lib/kcl/src/simulation_tests.rs   91.66%    8 ⚠️
src/wasm-lib/kcl/src/executor.rs           66.66%    4 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4351      +/-   ##
==========================================
+ Coverage   85.86%   85.92%   +0.05%     
==========================================
  Files          77       78       +1     
  Lines       27434    27552     +118     
==========================================
+ Hits        23557    23674     +117     
- Misses       3877     3878       +1     
Flag       Coverage Δ
wasm-lib   85.92% <90.76%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown.


@jtran (Collaborator) commented Oct 30, 2024

> UUIDs sent back by the engine are currently nondeterministic, so they would break all the snapshot tests.

There's actually a way to control generated UUIDs since #4101, but I like your way better for this. I guess technically there could be a bug in swapping one UUID for another, but I don't think we should worry about it right now.

@jtran (Collaborator) left a comment


This looks great. I like the generator instead of macros. In the future, I'll probably want to replace program memory with the artifact graph. But this is good for where we are now.

@adamchalmers (Collaborator, Author) commented

> > UUIDs sent back by the engine are currently nondeterministic, so they would break all the snapshot tests.
>
> There's actually a way to control generated UUIDs since #4101, but I like your way better for this. I guess technically there could be a bug in swapping one UUID for another, but I don't think we should worry about it right now.

Yeah but right now the engine is also returning random IDs, so we'd still need the [uuid] filter for engine responses. The goal is to get IDs in both requests and responses to be deterministic, then I can remove the filter.
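As an aside, one common way to make client-side IDs seedable is to derive version-4 UUIDs from a deterministically seeded RNG. A sketch, not necessarily what #4101 does:

```rust
use rand::{rngs::StdRng, RngCore, SeedableRng};
use uuid::Uuid;

// Generates a reproducible sequence of v4 UUIDs from a fixed seed.
struct SeededIds {
    rng: StdRng,
}

impl SeededIds {
    fn new(seed: u64) -> Self {
        Self { rng: StdRng::seed_from_u64(seed) }
    }

    fn next_id(&mut self) -> Uuid {
        let mut bytes = [0u8; 16];
        self.rng.fill_bytes(&mut bytes);
        // Builder sets the version/variant bits so the result is a valid v4 UUID.
        uuid::Builder::from_random_bytes(bytes).into_uuid()
    }
}
```

With the same seed, every run produces the same ID sequence, so request-side IDs would stop churning the snapshots.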

@adamchalmers force-pushed the achalmers/detsim branch 2 times, most recently from 89e0647 to 7a2db76 on October 30, 2024 at 15:43

Commit message:
The idea behind this is to test all the various stages of executing KCL
separately, i.e.

 - Start with a program
 - Tokenize it
 - Parse those tokens into an AST
 - Recast the AST
 - Execute the AST, outputting
   - a PNG of the rendered model
   - serialized program memory

Each of these steps reads some input and writes some output to disk.
The output of one step becomes the input to the next step. These
intermediate artifacts are also snapshotted (like expectorate or 2020)
to ensure we're aware of any changes to how KCL works. A change could
be a bug, or it could be harmless, or deliberate, but keeping it checked
into the repo means we can easily track changes.

Note: UUIDs sent back by the engine are currently nondeterministic, so
they would break all the snapshot tests. So, the snapshots use a regex
filter and replace anything that looks like a uuid with [uuid] when
writing program memory to a snapshot. In the future I hope our UUIDs will
be seedable and easy to make deterministic. At that point, we can stop
filtering the UUIDs.

We run this pipeline on many different KCL programs. Each keeps its
inputs (KCL programs), outputs (PNG, program memory snapshot) and
intermediate artifacts (AST, token lists, etc) in that directory.

I also added a new `just` command to easily generate these tests.
You can run `just new-sim-test gear $(cat gear.kcl)` to set up a new
gear test directory and generate all the intermediate artifacts for the
first time. This doesn't need any macros, it just appends some new lines
of normal Rust source code to `tests.rs`, so it's easy to see exactly
what the code is doing.

This uses `cargo insta` for convenient snapshot testing of artifacts
as JSON, and 20/20 for snapshotting PNGs.

This was heavily inspired by Predrag Gruevski's talk at EuroRust 2024
about deterministic simulation testing, and how it can both reduce bugs
and also reduce testing/CI time. Very grateful to him for chatting with
me about this over the last couple of weeks.
@adamchalmers merged commit 0c6c646 into main on Oct 30, 2024
32 checks passed
@adamchalmers deleted the achalmers/detsim branch on October 30, 2024 at 17:14
@franknoirot mentioned this pull request on Nov 7, 2024