-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: introduce experimental packaging format #278
Conversation
5e56ab7
to
7114e7b
Compare
89e9169
to
c21bf07
Compare
@greenhat This PR ended up turning into its own thing, so rather than using it for fixing the get_inputs test, I've renamed it and will start a new branch for get_inputs instead. You can ignore the original PR description, the title and commit messages cover the gist of it instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great!
@@ -14,16 +15,31 @@ use crate::{ | |||
Session, | |||
}; | |||
|
|||
static STDLIB: LazyLock<StdLibrary> = LazyLock::new(StdLibrary::default); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa! Good catch! We loaded the libs on every compilation. 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the test suite and it was suddenly taking ages to run, and I could not figure out what I had changed to make it happen. Turned out it was basically adding the stdlib to the link libraries set of every single module being compiled 😅. There are some changes in miden-assembly
/miden-core
that should make this way less of an issue, but it turns out that the overhead of hashing is really expensive!
Previous to this commit, we were expecting the rodata segments to be encoded in a specific order, and placed on the advice stack before anything else. With this commit, we now simply expect to the advice map to contain each segment keyed by its commitment hash, which we then move to the advice stack on demand, and immediately pipe to memory. This means the order of the segments no longer matters, and the advice stack is not sensitive to codegen changes or other influences which might perturb the advice stack or otherwise disrupt our assumptions. It also sets the stage for us to be able to initialize rodata after a context switch, as at that point the advice stack will be in an unknown condition, and using the advice map gives us certainty that we can arrange to have exactly what we need on the advice stack, when we need it. Additionally, I've updated the `midenc debug` input config file, as well as the usage documentation to reflect this. The last related change to this, will be emitting the rodata segments to disk in a convenient form, so that when the compiler emits the program, it also emits the segments alongside it, making it convenient to run the debugger against that program (or via the VM directly).
This commit implements a basic package format for Miden programs and libraries, with the metadata needed to simplify using them together. The format is experimental, and can be changed at any time, however it does use a header that lets us identify whether a package was produced with the same specification or not. This also modifies the compiler and debugger to emit and consume, respectively, the package format as their primary artifact. The debugger uses this, for example, to initialize the VM with the necessary libraries and rodata segments required by the program. The package format is described using the structures in `codegen/masm/src/packaging/package.rs`, and is encoded using a tiny, but efficient binary format from the `bitcode` crate, using `serde` to facilitate the lowering of our high-level types to that format. This made it relatively straightforward to implement the format without having to mess around with the low-level byte representation. The `bitcode` crate is designed with stability in mind, though because we are using it via `serde`, the stability is subject to being violated if `serde` changes something that causes `bitcode` to serialize a type differently. For now, this is a non-issue, longer term we might want to actually implement `bitcode::{Encode, Decode}` for our types if we want to gain more stability.
Some recent changes caused the `midenc compile` command to emit no outputs when `--emit` was not provided. Also fixed: * Moved some unnecessary debug logs to trace level * Added more debug logging to the compiler to aid in troubleshooting * Fixed serialization of packages involving more complex inputs which were hitting an issue with fields marked as skippable via serde; this is not supported via bitcode, so those attributes were removed
It turns out we were assuming that Rust was laying out its memory using the default LLVM layout, but rustc actually specifies a larger shadow stack. We were not properly handling the defined memory in Wasm modules parsed by the frontend, and so we ended up incorrectly placing global variables and other data in the middle of memory that rust was using for either the shadow stack or static data. Because of the size of the shadow stack region reserved by Rust, this was not causing any corruption in our tests, but it was causing other assumptions to be invalid, and miscompilations were the result. This commit treats the Wasm memory for a parsed module as reserved for use by the code in that module, which causes midenc to place any global variables it defines after that region, and computes the start of the heap as immediately following any such data. It does so by propagating the page size and reserved number of pages through the IR and to the backend, taking that information into account as needed.
4467c16
to
ce99ba4
Compare
This PR is my work-in-progress branch while I work through fixing the last remaining blocker for the alpha release, an unknown bug related to linear memory.
In the process of diagnosing the bug, I'm finding various things which I'm addressing in the form of independent commits. For anything that is significant enough that we want it on
next
ASAP, I'll open those as separate PRs, but for now, what is in this branch is mostly modifications/improvements to various parts of the compiler and debugger.@greenhat I'll ping you when this is ready for review, but feel free to scan through it for anything of interest.