Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow merging modules and skipping DWARF reading on load #257

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 21 additions & 0 deletions src/module/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ pub struct ModuleConfig {
pub(crate) on_parse:
Option<Box<dyn Fn(&mut Module, &IndicesToIds) -> Result<()> + Sync + Send + 'static>>,
pub(crate) on_instr_loc: Option<Box<dyn Fn(&usize) -> InstrLocId + Sync + Send + 'static>>,
pub(crate) skip_debug_sections: bool,
}

impl Clone for ModuleConfig {
Expand All @@ -32,6 +33,7 @@ impl Clone for ModuleConfig {
skip_producers_section: self.skip_producers_section,
skip_name_section: self.skip_name_section,
preserve_code_transform: self.preserve_code_transform,
skip_debug_sections: self.skip_debug_sections,

// ... and this is left empty.
on_parse: None,
Expand All @@ -54,6 +56,7 @@ impl fmt::Debug for ModuleConfig {
ref preserve_code_transform,
ref on_parse,
ref on_instr_loc,
ref skip_debug_sections,
} = self;

f.debug_struct("ModuleConfig")
Expand All @@ -69,6 +72,7 @@ impl fmt::Debug for ModuleConfig {
.field("preserve_code_transform", preserve_code_transform)
.field("on_parse", &on_parse.as_ref().map(|_| ".."))
.field("on_instr_loc", &on_instr_loc.as_ref().map(|_| ".."))
.field("skip_debug_sections", skip_debug_sections)
.finish()
}
}
Expand All @@ -79,6 +83,17 @@ impl ModuleConfig {
ModuleConfig::default()
}

/// Sets a flag to whether DWARF debug sections are read for this module.
pub fn read_dwarf(&mut self, dw: bool) -> &mut ModuleConfig {
self.skip_debug_sections = !dw;
return self;
}

/// Sets a flag to whether DWARF debug sections are skipped for this module.
pub fn skip_dwarf(&mut self, dw: bool) -> &mut ModuleConfig {
self.skip_debug_sections = dw;
return self;
}
/// Sets a flag to whether DWARF debug sections are generated for this
/// module.
///
Expand Down Expand Up @@ -208,6 +223,12 @@ impl ModuleConfig {
Module::parse(wasm, self)
}

/// Parses an in-memory WebAssembly file into an existing `Module` using this
/// configuration.
pub fn parse_into(&self, wasm: &[u8], pre: &mut Module) -> Result<()> {
pre.parse_in(wasm, self)
}

/// Parses a WebAssembly file into a `Module` using this configuration.
pub fn parse_file<P>(&self, path: P) -> Result<Module>
where
Expand Down
2 changes: 1 addition & 1 deletion src/module/functions/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -350,7 +350,7 @@ impl Module {
// necessary and it's a bottleneck!
let mut bodies = Vec::with_capacity(functions.len());
for (i, (body, mut validator)) in functions.into_iter().enumerate() {
let index = (num_imports + i) as u32;
let index = (indices.num_fun_imports + i) as u32;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be curious to know the reason this is necessary - which is incorrect above - funcs.arena.len() or functions.len()? I'm also not particularly confident this covers all index changes necessary to support the feature, if you do want to land this it would help to provide a little more analysis / feedback on why you think this is enough to make this PR complete.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this because functions.len() is the length of the current module's functions, which includes the total amount of imports being merged and functions previously in the module. Previously, functions.len() worked because functions started out empty, but with merging, functions can contain values before imports.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. Are there any other cases where imports being added later might affect indexing, or do you feel pretty confident this is the only one?

let id = indices.get_func(index)?;
let ty = match self.funcs.arena[id].kind {
FunctionKind::Uninitialized(ty) => ty,
Expand Down
4 changes: 4 additions & 0 deletions src/module/imports.rs
Original file line number Diff line number Diff line change
Expand Up @@ -160,6 +160,7 @@ impl Module {
entry.field.expect("module linking not supported"),
ty,
);
ids.num_fun_imports += 1;
ids.push_func(id.0);
}
wasmparser::ImportSectionEntryType::Table(t) => {
Expand All @@ -171,6 +172,7 @@ impl Module {
t.maximum,
ty,
);
ids.num_table_imports += 1;
ids.push_table(id.0);
}
wasmparser::ImportSectionEntryType::Memory(m) => {
Expand All @@ -184,6 +186,7 @@ impl Module {
m.initial as u32,
m.maximum.map(|m| m as u32),
);
ids.num_mem_imports += 1;
ids.push_memory(id.0);
}
wasmparser::ImportSectionEntryType::Global(g) => {
Expand All @@ -193,6 +196,7 @@ impl Module {
ValType::parse(&g.content_type)?,
g.mutable,
);
ids.num_global_imports += 1;
ids.push_global(id.0);
}
wasmparser::ImportSectionEntryType::Module(_)
Expand Down
71 changes: 48 additions & 23 deletions src/module/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,6 +38,7 @@ pub use crate::module::producers::ModuleProducers;
pub use crate::module::tables::{ModuleTables, Table, TableId};
pub use crate::module::types::ModuleTypes;
use crate::parse::IndicesToIds;
use crate::FunctionBuilder;
use anyhow::{bail, Context};
use id_arena::Id;
use log::warn;
Expand Down Expand Up @@ -128,9 +129,21 @@ impl Module {
ModuleConfig::new().parse(wasm)
}

/// Merge a module from the in-memory wasm buffer with the default configuration
pub fn merge_buffer(&mut self, wasm: &[u8]) -> Result<()> {
return self.parse_in(wasm, &ModuleConfig::default());
}

fn parse(wasm: &[u8], config: &ModuleConfig) -> Result<Module> {
let mut ret = Module::default();
ret.config = config.clone();
ret.parse_in(wasm, config)?;
return Ok(ret);
}
fn parse_in(&mut self, wasm: &[u8], config: &ModuleConfig) -> Result<()> {
// let mut ret = Module::default();
// ret.config = config.clone();
let old_start = self.start.take();
let mut indices = IndicesToIds::default();
let mut validator = Validator::new();
validator.wasm_features(WasmFeatures {
Expand All @@ -155,67 +168,67 @@ impl Module {
validator
.data_section(&s)
.context("failed to parse data section")?;
ret.parse_data(s, &mut indices)?;
self.parse_data(s, &mut indices)?;
}
Payload::TypeSection(s) => {
validator
.type_section(&s)
.context("failed to parse type section")?;
ret.parse_types(s, &mut indices)?;
self.parse_types(s, &mut indices)?;
}
Payload::ImportSection(s) => {
validator
.import_section(&s)
.context("failed to parse import section")?;
ret.parse_imports(s, &mut indices)?;
self.parse_imports(s, &mut indices)?;
}
Payload::TableSection(s) => {
validator
.table_section(&s)
.context("failed to parse table section")?;
ret.parse_tables(s, &mut indices)?;
self.parse_tables(s, &mut indices)?;
}
Payload::MemorySection(s) => {
validator
.memory_section(&s)
.context("failed to parse memory section")?;
ret.parse_memories(s, &mut indices)?;
self.parse_memories(s, &mut indices)?;
}
Payload::GlobalSection(s) => {
validator
.global_section(&s)
.context("failed to parse global section")?;
ret.parse_globals(s, &mut indices)?;
self.parse_globals(s, &mut indices)?;
}
Payload::ExportSection(s) => {
validator
.export_section(&s)
.context("failed to parse export section")?;
ret.parse_exports(s, &mut indices)?;
self.parse_exports(s, &mut indices)?;
}
Payload::ElementSection(s) => {
validator
.element_section(&s)
.context("failed to parse element section")?;
ret.parse_elements(s, &mut indices)?;
self.parse_elements(s, &mut indices)?;
}
Payload::StartSection { func, range, .. } => {
validator.start_section(func, &range)?;
ret.start = Some(indices.get_func(func)?);
self.start = Some(indices.get_func(func)?);
}
Payload::FunctionSection(s) => {
validator
.function_section(&s)
.context("failed to parse function section")?;
ret.declare_local_functions(s, &mut indices)?;
self.declare_local_functions(s, &mut indices)?;
}
Payload::DataCountSection { count, range } => {
validator.data_count_section(count, &range)?;
ret.reserve_data(count, &mut indices);
self.reserve_data(count, &mut indices);
}
Payload::CodeSectionStart { count, range, .. } => {
validator.code_section_start(count, &range)?;
ret.funcs.code_section_offset = range.start;
self.funcs.code_section_offset = range.start;
}
Payload::CodeSectionEntry(body) => {
let validator = validator.code_section_entry()?;
Expand All @@ -230,10 +243,10 @@ impl Module {
let result = match name {
"producers" => wasmparser::ProducersSectionReader::new(data, data_offset)
.map_err(anyhow::Error::from)
.and_then(|s| ret.parse_producers_section(s)),
.and_then(|s| self.parse_producers_section(s)),
"name" => wasmparser::NameSectionReader::new(data, data_offset)
.map_err(anyhow::Error::from)
.and_then(|r| ret.parse_name_section(r, &indices)),
.and_then(|r| self.parse_name_section(r, &indices)),
_ => {
log::debug!("parsing custom section `{}`", name);
if name.starts_with(".debug") {
Expand All @@ -242,7 +255,7 @@ impl Module {
data: data.to_vec(),
});
} else {
ret.customs.add(RawCustomSection {
self.customs.add(RawCustomSection {
name: name.to_string(),
data: data.to_vec(),
});
Expand Down Expand Up @@ -293,25 +306,37 @@ impl Module {
}
}

ret.parse_local_functions(
self.parse_local_functions(
local_functions,
&mut indices,
config.on_instr_loc.as_ref().map(|f| f.as_ref()),
)
.context("failed to parse code section")?;

ret.parse_debug_sections(debug_sections)
.context("failed to parse debug data section")?;

ret.producers
if !config.skip_debug_sections {
self.parse_debug_sections(debug_sections)
.context("failed to parse debug data section")?;
}
self.producers
.add_processed_by("walrus", env!("CARGO_PKG_VERSION"));

match self.start {
Some(a) => {
if let Some(r) = old_start {
let mut new_builder = FunctionBuilder::new(&mut self.types, &[], &[]);
new_builder.func_body().call(r).call(a).return_();
let new = new_builder.local_func(vec![]);
let new = self.funcs.add_local(new);
self.start = Some(new);
}
}
None => self.start = old_start,
}
if let Some(on_parse) = &config.on_parse {
on_parse(&mut ret, &indices)?;
on_parse(self, &indices)?;
}

log::debug!("parse complete");
Ok(ret)
Ok(())
}

/// Emit this module into a `.wasm` file at the given path.
Expand Down
4 changes: 4 additions & 0 deletions src/parse.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,10 @@ pub struct IndicesToIds {
elements: Vec<ElementId>,
data: Vec<DataId>,
locals: IdHashMap<Function, Vec<LocalId>>,
pub(crate) num_fun_imports: usize,
pub(crate) num_mem_imports: usize,
pub(crate) num_global_imports: usize,
pub(crate) num_table_imports: usize,
}

macro_rules! define_push_get {
Expand Down
Loading