KCL: New simulation test pipeline #4351
Force-pushed from 6fefa15 to c88d318
Force-pushed from e65cf35 to c88d318
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4351 +/- ##
==========================================
+ Coverage 85.86% 85.92% +0.05%
==========================================
Files 77 78 +1
Lines 27434 27552 +118
==========================================
+ Hits 23557 23674 +117
- Misses 3877 3878 +1
There's actually a way to control generated UUIDs since #4101, but I like your way better for this. I guess technically there could be a bug in swapping one UUID for another, but I don't think we should worry about it right now.
This looks great. I like the generator instead of macros. In the future, I'll probably want to replace program memory with the artifact graph. But this is good for where we are now.
Yeah, but right now the engine is also returning random IDs, so we'd still need the [uuid] filter for engine responses. The goal is to get IDs in both requests and responses to be deterministic, then I can remove the filter.
Force-pushed from 89e0647 to 7a2db76
Force-pushed from 7a2db76 to e418add
The idea behind this is to test all the various stages of executing KCL separately, i.e.

- Start with a program
- Tokenize it
- Parse those tokens into an AST
- Recast the AST
- Execute the AST, outputting
  - a PNG of the rendered model
  - serialized program memory
Each of these steps reads some input and writes some output to disk. The output of one step becomes the input to the next step. These intermediate artifacts are also snapshotted (like expectorate or 2020) to ensure we're aware of any changes to how KCL works. A change could be a bug, or it could be harmless, or deliberate, but keeping it checked into the repo means we can easily track changes.
If any step fails, we snapshot the error, and we don't run the subsequent steps in the pipeline. Each of these steps can be run in parallel (we still limit the number of engine connections that unit tests will run concurrently, but at least all the parser/tokenizer/recaster tests can be parallelized).
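To make the chaining concrete, here is a minimal self-contained sketch of the idea; it is not the actual harness, and the stage list, file names, and error handling are all illustrative:

```rust
use std::{fs, path::Path};

// Illustrative sketch only, not the real KCL test harness: each stage consumes
// the previous stage's artifact, its output is written to the test directory
// (and checked into the repo), and the first failure is snapshotted as an
// error while the remaining stages are skipped.
fn run_pipeline(
    test_dir: &Path,
    stages: &[(&str, fn(&str) -> Result<String, String>)],
) -> std::io::Result<()> {
    let mut artifact = fs::read_to_string(test_dir.join("input.kcl"))?;
    for (name, stage) in stages {
        match stage(&artifact) {
            Ok(output) => {
                // Snapshot this stage's output; it becomes the next stage's input.
                fs::write(test_dir.join(format!("{name}.snap")), &output)?;
                artifact = output;
            }
            Err(err) => {
                // Snapshot the error instead, and don't run the later stages.
                fs::write(test_dir.join(format!("{name}.error.snap")), err)?;
                break;
            }
        }
    }
    Ok(())
}
```

A real stage list would pair names like "tokens", "ast", or "recast" with the corresponding tokenizer, parser, and recaster functions.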
Note: UUIDs sent back by the engine are currently nondeterministic, so they would break all the snapshot tests. So, the snapshots use a regex filter and replace anything that looks like a UUID with [uuid] when writing program memory to a snapshot. In the future I hope our UUIDs will be seedable and easy to make deterministic. At that point, we can stop filtering the UUIDs.
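For illustration, a sketch of how that kind of filter can be attached with insta (this assumes insta's `filters` and `json` cargo features plus serde_json; the real tests may wire it up differently):

```rust
// Sketch of the [uuid] filter idea; the exact regex and snapshot setup in the
// real tests may differ. Requires insta with the `filters` and `json` features.
fn snapshot_program_memory(memory: &serde_json::Value) {
    let mut settings = insta::Settings::clone_current();
    // Replace anything that looks like a UUID with a stable placeholder so the
    // engine's nondeterministic IDs don't churn the snapshot.
    settings.add_filter(
        r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}",
        "[uuid]",
    );
    settings.bind(|| {
        insta::assert_json_snapshot!("program_memory", memory);
    });
}
```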
We run this pipeline on many different KCL programs. Each test keeps its inputs (KCL programs), outputs (PNG, program memory snapshot), and intermediate artifacts (AST, token lists, etc.) in its own directory.
I also added a new `just` command to easily generate these tests. You can run `just new-sim-test gear $(cat gear.kcl)` to set up a new gear test directory and generate all the intermediate artifacts for the first time. This doesn't need any macros, it just appends some new lines of normal Rust source code to `tests.rs`, so it's easy to see exactly what the code is doing.

This uses insta for convenient snapshot testing of artifacts as JSON, and our very own 2020 for snapshotting PNGs.
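To make the `just new-sim-test` flow concrete, here is a rough, self-contained sketch of what such a generator could do; the paths and the `run_simulation_test` helper name are assumptions, not the actual recipe:

```rust
use std::{fs, fs::OpenOptions, io::Write, path::Path};

// Hypothetical sketch of what the `just new-sim-test` recipe does under the
// hood; the real paths, helper names, and generated lines will differ. It
// creates the test directory, saves the input program, and appends a plain
// Rust test function to tests.rs, so there's no macro expansion involved.
fn new_sim_test(name: &str, kcl_src: &str) -> std::io::Result<()> {
    let dir = Path::new("tests").join(name);
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("input.kcl"), kcl_src)?;

    let mut tests_rs = OpenOptions::new().append(true).open("src/tests.rs")?;
    // `run_simulation_test` is a hypothetical shared helper that runs the whole
    // tokenize -> parse -> recast -> execute pipeline for this directory.
    writeln!(
        tests_rs,
        "\n#[test]\nfn {name}_pipeline() {{\n    run_simulation_test(\"{name}\");\n}}"
    )?;
    Ok(())
}
```

Running the new test once after generation produces the initial snapshots, which are then reviewed and checked in like any other change.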
This was heavily inspired by @obi1kenobi's talk at EuroRust 2024 about deterministic simulation testing, and how it can both reduce bugs and also reduce testing/CI time. Very grateful to him for chatting with me about this over the last couple of weeks.