Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement support for wasm data segments #23

Merged
merged 3 commits into from
Sep 20, 2023

Conversation

bitwalker
Copy link
Contributor

Wasm has the concept of "data segments", essentially initializers for specific regions of memory, identified by offset from the start of linear memory, and a vector of bytes to be written starting at that offset. This is used to implement things like the rodata section you'd find in a typical ELF executable/library. Globals can then be exported with an address pointing into that segment, which can be useful for sharing the same data across many read-only globals, and for packing global data more efficiently.

Our implementation of this is very similar, except we also allow data segments to have a size larger than the initializer (byte vector), which has the same semantics as padding the initializer with zeroes out to that size.

--

This PR also implements formatting for segments, as well as globals, since formatting of the latter was missed in the PR which introduced them.

/cc @greenhat

@bitwalker bitwalker added codegen wasm WebAssembly frontend labels Sep 16, 2023
@bitwalker bitwalker self-assigned this Sep 16, 2023
Comment on lines 240 to 248
// If the two segments have the same size and offset, then
// if they match in all other respects, we're done. If they
// don't match, then we raise a mismatch error.
if segment.size == size {
let has_same_data = self.globals.contains_constant(&init);
if has_same_data && segment.readonly == readonly {
return Ok(());
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When can this situation occur? Does it correspond to two globals referring to the same data or something?

It seems strange that it would be necessary to handle this corner case here. I would expect the caller to keep track of whether a particular piece of data has already been assigned a segment, and only call declare_data_segment when a new segment is actually needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworked this a bit - where the code for tracking data segments lives anyway, not this specific case. Rather than having this code live as part of Module, I've moved it to a separate DataSegmentTable data structure similar to the GlobalVariableTable. I'll push those changes in the morning.

But to answer your question, we want to handle that data segments might be declared in multiple modules at once, by making sure those segments don't overlap, or if they do, that they perfectly overlap (i.e. they are duplicate declarations of the same segment). This makes it trivial to take all of the segments from multiple modules and insert them into the same segment-tracking data structure without having to account for duplicates separately.

Copy link
Contributor

@jjcnn jjcnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a single comment about a bit of code that surprised me, but I'll approve the PR under the assumption that the code is either fine or will be fixed before merging.

Wasm has the concept of "data segments", essentially initializers for
specific regions of memory, identified by offset from the start of
linear memory, and a vector of bytes to be written starting at that
offset. This is used to implement things like the `rodata` section you'd
find in a typical ELF executable/library. Globals can then be exported
with an address pointing into that segment, which can be useful for
sharing the same data across many read-only globals, and for packing
global data more efficiently.

Our implementation of this is very similar, except we also allow data
segments to have a size larger than the initializer (byte vector), which
has the same semantics as padding the initializer with zeroes out to
that size.
@bitwalker bitwalker merged commit 05367ef into main Sep 20, 2023
2 checks passed
@bitwalker bitwalker deleted the bitwalker/data-segments branch September 20, 2023 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
codegen wasm WebAssembly frontend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants