feat: introduce experimental packaging format #278

bitwalker · 2024-08-16T07:07:36Z

This PR is my work-in-progress branch while I work through fixing the last remaining blocker for the alpha release, an unknown bug related to linear memory.

In the process of diagnosing the bug, I'm finding various things which I'm addressing in the form of independent commits. For anything that is significant enough that we want it on next ASAP, I'll open those as separate PRs, but for now, what is in this branch is mostly modifications/improvements to various parts of the compiler and debugger.

@greenhat I'll ping you when this is ready for review, but feel free to scan through it for anything of interest.

bitwalker · 2024-08-19T04:42:03Z

@greenhat This PR ended up turning into its own thing, so rather than using it for fixing the get_inputs test, I've renamed it and will start a new branch for get_inputs instead. You can ignore the original PR description; the title and commit messages cover the gist of it.

greenhat

Looking great!

greenhat · 2024-08-26T09:25:16Z

midenc-session/src/libs.rs

@@ -14,16 +15,31 @@ use crate::{
    Session,
 };

+static STDLIB: LazyLock<StdLibrary> = LazyLock::new(StdLibrary::default);


Whoa! Good catch! We loaded the libs on every compilation. 🤦

I ran the test suite and it was suddenly taking ages to run, and I could not figure out what I had changed to make it happen. Turned out it was basically adding the stdlib to the link libraries set of every single module being compiled 😅. There are some changes in miden-assembly/miden-core that should make this way less of an issue, but it turns out that the overhead of hashing is really expensive!
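For anyone skimming, here is a minimal sketch of why the `LazyLock` static in the diff above removes the repeated work. `FakeStdLibrary` and the load counter are hypothetical stand-ins, not the real `StdLibrary`; the only thing taken from the diff is the `LazyLock::new(T::default)` pattern.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Counts how many times the (pretend) expensive load has run.
static LOADS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the real standard library type.
#[derive(Debug)]
pub struct FakeStdLibrary {
    pub module_count: usize,
}

impl Default for FakeStdLibrary {
    fn default() -> Self {
        // Pretend this is the costly deserialization and hashing step.
        LOADS.fetch_add(1, Ordering::SeqCst);
        FakeStdLibrary { module_count: 42 }
    }
}

/// Initialized on first access, then shared by every later caller.
static STDLIB: LazyLock<FakeStdLibrary> = LazyLock::new(FakeStdLibrary::default);

/// Returns the shared instance; the initializer runs at most once.
pub fn stdlib() -> &'static FakeStdLibrary {
    &*STDLIB
}

/// Number of times the initializer has actually executed.
pub fn load_count() -> usize {
    LOADS.load(Ordering::SeqCst)
}
```

Every call to `stdlib()` returns the same reference, so the initializer cost is paid once per process rather than once per compiled module.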

Prior to this commit, we expected the rodata segments to be encoded in a specific order and placed on the advice stack before anything else. With this commit, we now simply expect the advice map to contain each segment keyed by its commitment hash, which we then move to the advice stack on demand and immediately pipe to memory. This means the order of the segments no longer matters, and the advice stack is not sensitive to codegen changes or other influences which might perturb it or otherwise disrupt our assumptions. It also sets the stage for us to be able to initialize rodata after a context switch: at that point the advice stack will be in an unknown condition, and using the advice map gives us certainty that we can arrange to have exactly what we need on the advice stack, when we need it.

Additionally, I've updated the `midenc debug` input config file, as well as the usage documentation, to reflect this. The last related change will be emitting the rodata segments to disk in a convenient form, so that when the compiler emits the program, it also emits the segments alongside it, making it convenient to run the debugger against that program (or via the VM directly).
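A toy model of the flow described above, under loud assumptions: `Advice`, `Memory`, and `Commitment` are illustrative types invented for this sketch, not the VM's actual advice provider API, and a real commitment would be a hash digest rather than a hand-picked array. It only shows why keyed lookup makes segment order and prior stack contents irrelevant.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a segment commitment hash.
pub type Commitment = [u64; 4];

/// Toy advice provider: a keyed map plus a stack.
#[derive(Default)]
pub struct Advice {
    pub map: HashMap<Commitment, Vec<u64>>,
    pub stack: Vec<u64>,
}

/// Toy linear memory, addressed by element index.
pub struct Memory(pub Vec<u64>);

impl Advice {
    /// Moves the segment stored under `key` onto the advice stack, then
    /// immediately drains exactly that many elements into memory at `addr`.
    /// Returns the element count, or `None` if the key is missing or the
    /// segment does not fit in memory.
    pub fn load_segment(
        &mut self,
        key: &Commitment,
        mem: &mut Memory,
        addr: usize,
    ) -> Option<usize> {
        let data = self.map.get(key)?.clone();
        let n = data.len();
        if addr + n > mem.0.len() {
            return None;
        }
        // Push in reverse so the first element ends up on top of the stack.
        self.stack.extend(data.into_iter().rev());
        for i in 0..n {
            let value = self.stack.pop().expect("segment was just pushed");
            mem.0[addr + i] = value;
        }
        Some(n)
    }
}
```

Because each load pushes and pops exactly one segment, whatever was already on the stack is left untouched, and segments can be loaded in any order.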

This commit implements a basic package format for Miden programs and libraries, with the metadata needed to simplify using them together. The format is experimental and can change at any time; however, it does use a header that lets us identify whether a package was produced with the same specification or not.

This also modifies the compiler and debugger to emit and consume, respectively, the package format as their primary artifact. The debugger uses this, for example, to initialize the VM with the necessary libraries and rodata segments required by the program.

The package format is described using the structures in `codegen/masm/src/packaging/package.rs`, and is encoded using a tiny but efficient binary format from the `bitcode` crate, using `serde` to facilitate the lowering of our high-level types to that format. This made it relatively straightforward to implement the format without having to mess around with the low-level byte representation.

The `bitcode` crate is designed with stability in mind, though because we are using it via `serde`, that stability can be violated if `serde` changes something that causes `bitcode` to serialize a type differently. For now, this is a non-issue; longer term, we might want to implement `bitcode::{Encode, Decode}` for our types directly if we want to gain more stability.
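To illustrate the header idea only, here is a std-only sketch. The magic bytes, the one-byte version field, and the `encode`/`decode` names are all assumptions made up for this example; the real layout lives in `codegen/masm/src/packaging/package.rs` and the real payload is produced by `bitcode`, neither of which is reproduced here.

```rust
/// Hypothetical magic bytes and version; not the real header layout.
pub const MAGIC: [u8; 4] = *b"PKG\0";
pub const FORMAT_VERSION: u8 = 1;

#[derive(Debug, PartialEq)]
pub enum HeaderError {
    TooShort,
    BadMagic,
    VersionMismatch { found: u8 },
}

/// Prefixes an already-serialized payload with the magic bytes and version.
pub fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(MAGIC.len() + 1 + payload.len());
    out.extend_from_slice(&MAGIC);
    out.push(FORMAT_VERSION);
    out.extend_from_slice(payload);
    out
}

/// Validates the header and returns the payload that follows it.
pub fn decode(bytes: &[u8]) -> Result<&[u8], HeaderError> {
    if bytes.len() < MAGIC.len() + 1 {
        return Err(HeaderError::TooShort);
    }
    let (magic, rest) = bytes.split_at(MAGIC.len());
    if magic != MAGIC.as_slice() {
        return Err(HeaderError::BadMagic);
    }
    let (version, payload) = (rest[0], &rest[1..]);
    if version != FORMAT_VERSION {
        return Err(HeaderError::VersionMismatch { found: version });
    }
    Ok(payload)
}
```

Checking the version before touching the payload means a consumer can reject a package written against a different specification instead of misreading its bytes.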

Some recent changes caused the `midenc compile` command to emit no outputs when `--emit` was not provided.

Also fixed:

* Moved some unnecessary debug logs to trace level
* Added more debug logging to the compiler to aid in troubleshooting
* Fixed serialization of packages involving more complex inputs which were hitting an issue with fields marked as skippable via serde; this is not supported via bitcode, so those attributes were removed

It turns out we were assuming that Rust was laying out its memory using the default LLVM layout, but rustc actually specifies a larger shadow stack. We were not properly handling the defined memory in Wasm modules parsed by the frontend, and so we ended up incorrectly placing global variables and other data in the middle of memory that Rust was using for either the shadow stack or static data. Because of the size of the shadow stack region reserved by Rust, this was not causing any corruption in our tests, but it was invalidating other assumptions, and miscompilations were the result.

This commit treats the Wasm memory for a parsed module as reserved for use by the code in that module, which causes midenc to place any global variables it defines after that region, and computes the start of the heap as immediately following any such data. It does so by propagating the page size and reserved number of pages through the IR and to the backend, taking that information into account as needed.
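A rough sketch of the layout arithmetic described above. `compute_layout`, `Layout`, and the alignment handling are assumptions for illustration, not midenc's actual code; the only parts taken from the commit message are that the reserved region is `page_size * reserved_pages`, that compiler-defined globals go after it, and that the heap starts after those globals.

```rust
/// Page size used in the example; passed in explicitly by callers.
pub const WASM_PAGE_SIZE: u32 = 64 * 1024;

#[derive(Debug, PartialEq)]
pub struct Layout {
    /// First address past the memory reserved for the parsed module.
    pub reserved_end: u32,
    /// Address assigned to each compiler-defined global, in input order.
    pub global_offsets: Vec<u32>,
    /// First address available to the heap.
    pub heap_start: u32,
}

/// Rounds `addr` up to the next multiple of `align` (which must be non-zero).
fn align_up(addr: u32, align: u32) -> u32 {
    (addr + align - 1) / align * align
}

/// Places globals, given as (size, align) pairs, after the reserved pages,
/// then starts the heap after the last global. Alignment is an assumption
/// of this sketch.
pub fn compute_layout(
    page_size: u32,
    reserved_pages: u32,
    globals: &[(u32, u32)],
    heap_align: u32,
) -> Layout {
    let reserved_end = page_size * reserved_pages;
    let mut cursor = reserved_end;
    let mut global_offsets = Vec::with_capacity(globals.len());
    for &(size, align) in globals {
        cursor = align_up(cursor, align);
        global_offsets.push(cursor);
        cursor += size;
    }
    Layout {
        reserved_end,
        global_offsets,
        heap_start: align_up(cursor, heap_align),
    }
}
```

With 17 reserved pages (an arbitrary figure for the example), nothing the compiler defines lands below address 1,114,112, so it cannot overlap the shadow stack or static data that Rust placed in those pages.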

bitwalker added the blocker label ("This issue is one of our top priorities") Aug 16, 2024

bitwalker added this to the Alpha milestone Aug 16, 2024

bitwalker requested a review from greenhat August 16, 2024 07:07

bitwalker self-assigned this Aug 16, 2024

bitwalker force-pushed the bitwalker/memory-fixes branch from 5e56ab7 to 7114e7b Compare August 16, 2024 07:21

bitwalker changed the title ~~fix: blockers for alpha release~~ feat: introduce experimental packaging format Aug 17, 2024

bitwalker force-pushed the bitwalker/memory-fixes branch 2 times, most recently from 89e9169 to c21bf07 Compare August 19, 2024 02:34

bitwalker marked this pull request as ready for review August 19, 2024 04:38

greenhat approved these changes Aug 26, 2024

bitwalker added 18 commits August 27, 2024 02:28

feat(debugger): implement support for program inputs

c5a9869

chore: rename ProgramInputs to DebuggerConfig

5395e9e

fix: address overhead of deserializing the stdlib

4debaf6

feat(debugger): implement 'finish' functionality from lldb

86cf309

ci: don't run cargo-miden integration tests twice

f2f4045

fix: expected output of cargo-miden integration test

a1d7585

chore: remove miden-diagnostics, start making midenc-session no-std-c…

c2b2f3b

…ompatible

fix: incorrect handling of -C/-Z and --emit=hir options

ae525ea

fix: support memory intrinsics from Rust

6d4353c

fix: inoperative --print-ir-after-*, add --print-cfg-after-*

7bd7c72

chore: add additional tracing to treeify pass

5215164

fix(sdk): improper handling of get_inputs vec after return into rust

23c4d2d

fix(sdk): be more explicit about alignment of felt/word types

50ff2c9

feat(sdk): introduce miden-sdk-alloc

ce99ba4

bitwalker force-pushed the bitwalker/memory-fixes branch from 4467c16 to ce99ba4 Compare August 27, 2024 06:34

bitwalker mentioned this pull request Aug 27, 2024

Loading global variable returns incorrect (zero) value #248

Closed

bitwalker merged commit 7ddd719 into next Aug 27, 2024
4 checks passed

bitwalker deleted the bitwalker/memory-fixes branch August 27, 2024 06:58
