wasm linker: aggressive rewrite towards Data-Oriented Design #22220

Open
wants to merge 84 commits into
base: master
5680edc
wasm linker: aggressive DODification
andrewrk Nov 5, 2024
32c024f
macho linker: conform to explicit error sets
andrewrk Dec 3, 2024
78a49c5
remove "FIXME" from codebase
andrewrk Dec 3, 2024
7cfeefe
macho linker conforms to explicit error sets, again
andrewrk Dec 4, 2024
18d7cbd
elf linker: conform to explicit error sets
andrewrk Dec 4, 2024
3806d87
rework error handling in the backends
andrewrk Dec 4, 2024
6075e92
compiler: add type safety for export indices
andrewrk Dec 5, 2024
04b068b
std.array_list: tiny refactor for pleasure
andrewrk Dec 6, 2024
c671e3f
rewrite wasm/Emit.zig
andrewrk Dec 6, 2024
cd0ca50
wasm codegen: fix some compilation errors
andrewrk Dec 7, 2024
3014317
wasm: implement errors_len as a MIR opcode with no linker involvement
andrewrk Dec 7, 2024
413ba6e
wasm codegen: switch on bool instead of int
andrewrk Dec 7, 2024
2ef21f1
wasm codegen: rename func: CodeGen to cg: CodeGen
andrewrk Dec 7, 2024
32b1115
wasm: move error_name lowering to Emit phase
andrewrk Dec 7, 2024
fc13a31
wasm: use call_intrinsic MIR instruction
andrewrk Dec 8, 2024
716a8fe
switch to ArrayListUnmanaged for machine code
andrewrk Dec 8, 2024
26864f6
wasm: fix many compilation errors
andrewrk Dec 9, 2024
508d04e
wasm linker: support export section as implicit symbols
andrewrk Dec 12, 2024
2496569
frontend: add const to more Zcu pointers
andrewrk Dec 12, 2024
d8130ed
wasm linker: implement name, module name, and type for function imports
andrewrk Dec 12, 2024
a0293ff
wasm linker: flush implemented up to the export section
andrewrk Dec 12, 2024
c484b4e
wasm linker: flush export section
andrewrk Dec 12, 2024
fc27061
wasm linker: finish the flush function
andrewrk Dec 13, 2024
73335fc
fix compilation when enabling llvm
andrewrk Dec 13, 2024
3f977a9
cmake: remove deleted file
andrewrk Dec 13, 2024
c085f6d
add dev env for wasm
andrewrk Dec 14, 2024
cc011ae
remove bad deinit
andrewrk Dec 15, 2024
502e822
wasm codegen: fix lowering of 32/64 float rt calls
andrewrk Dec 16, 2024
49cdda2
wasm codegen: remove dependency on PerThread where possible
andrewrk Dec 16, 2024
ff80e3e
wasm linker fixes
andrewrk Dec 16, 2024
ce35880
wasm linker: implement name subsection
andrewrk Dec 16, 2024
58a3683
fix replaceVecSectionHeader
andrewrk Dec 16, 2024
63c2f6f
std.Thread: don't export wasi_thread_start in single-threaded mode
andrewrk Dec 16, 2024
c6667c3
wasm linker: implement type index method
andrewrk Dec 16, 2024
0ff77f4
wasm linker: implement missing logic
andrewrk Dec 18, 2024
6a334da
complete wasm.Emit implementation
andrewrk Dec 18, 2024
b0e24b2
fix calculation of nav alignment
andrewrk Dec 18, 2024
d3be71c
wasm codegen: fix wrong union field for locals
andrewrk Dec 18, 2024
08cbc1f
add safety for calling functions that get virtual addrs
andrewrk Dec 18, 2024
519f466
wasm linker: add __zig_error_name_table data when needed
andrewrk Dec 18, 2024
999a13b
wasm codegen: fix extra index not relative
andrewrk Dec 18, 2024
153a5c8
wasm linker: fix calling imported functions
andrewrk Dec 19, 2024
3f8e874
std.ArrayHashMap: allow passing empty values array
andrewrk Dec 19, 2024
9e8cbf7
wasm linker: fix data segments memory flow
andrewrk Dec 19, 2024
c994b70
wasm linker: handle extern functions in updateNav
andrewrk Dec 19, 2024
c412dec
wasm linker: allow undefined imports when lib name is provided
andrewrk Dec 19, 2024
4bf9f79
wasm codegen: fix call_indirect
andrewrk Dec 19, 2024
f6a38e8
wasm linker: fix eliding empty data segments
andrewrk Dec 19, 2024
1f362ff
wasm linker: implement data fixups
andrewrk Dec 19, 2024
b487403
wasm linker: avoid recursion in lowerZcuData
andrewrk Dec 19, 2024
eeb1e27
wasm linker: also call lowerZcuData in updateFunc
andrewrk Dec 20, 2024
4c0648b
wasm linker: initialize the data segments table in flush
andrewrk Dec 20, 2024
bcdfb92
wasm linker: zcu data fixups are already applied
andrewrk Dec 20, 2024
ed52860
implement error table and error names data segments
andrewrk Dec 20, 2024
9f429b1
wasm linker: fix data section in flush
andrewrk Dec 21, 2024
147dd1c
implement the prelink phase in the frontend
andrewrk Dec 21, 2024
3b495a7
wasm linker: implement stack pointer global
andrewrk Dec 21, 2024
bdf9031
std.io: remove the "temporary workaround" for stage2_aarch64
andrewrk Dec 21, 2024
8bfdfb5
wasm linker: implement indirect function calls
andrewrk Dec 21, 2024
2a09f97
fix stack pointer initialized to wrong vaddr
andrewrk Dec 21, 2024
bb9fabc
use fixed writer in more places
andrewrk Dec 21, 2024
33330fd
wasm linker: fix missing function type entry for import
andrewrk Dec 22, 2024
d415d5b
wasm linker: fix active data segment offset value
andrewrk Dec 22, 2024
1a607de
Compilation: account for C objects and resources in prelink
andrewrk Dec 22, 2024
73868e6
wasm linker: fix relocation parsing
andrewrk Dec 23, 2024
f7218a1
wasm linker: fix crashes when parsing compiler_rt
andrewrk Dec 24, 2024
1cbdd7b
fix missing missing entry symbol error when no zcu
andrewrk Dec 24, 2024
25df769
resolve merge conflicts
andrewrk Dec 27, 2024
b2174ef
wasm linker: fix global imports in objects
andrewrk Dec 28, 2024
41af2f5
can't use source location until return from this function
andrewrk Dec 29, 2024
6a513c8
wasm linker: fix table imports in objects
andrewrk Dec 29, 2024
3613e29
fix bad archive name calculation
andrewrk Dec 29, 2024
12498dc
wasm linker: chase relocations for references
andrewrk Dec 30, 2024
36071a3
wasm linker: improve error messages by making source locations more lazy
andrewrk Dec 30, 2024
98c3c9d
wasm object parsing: fix handling of weak functions and globals
andrewrk Dec 30, 2024
6cbd271
type checking for synthetic functions
andrewrk Dec 30, 2024
a3b8771
implement function relocations
andrewrk Dec 31, 2024
ba00ec7
wasm linker: implement __wasm_call_ctors
andrewrk Dec 31, 2024
f453fdc
wasm linker: implement data symbols
andrewrk Jan 4, 2025
fdd7313
wasm linker: implement data relocs
andrewrk Jan 5, 2025
03800dd
wasm linker: implement __wasm_init_memory
andrewrk Jan 5, 2025
857c88a
fix merge conflicts with updating line numbers
andrewrk Jan 6, 2025
527a992
wasm linker: emit __heap_base and __heap_end globals and datas
andrewrk Jan 6, 2025
6f27a37
wasm linker: apply object relocations to data segments
andrewrk Jan 6, 2025
3 changes: 1 addition & 2 deletions CMakeLists.txt
@@ -643,9 +643,8 @@ set(ZIG_STAGE2_SOURCES
src/link/StringTable.zig
src/link/Wasm.zig
src/link/Wasm/Archive.zig
src/link/Wasm/Flush.zig
src/link/Wasm/Object.zig
src/link/Wasm/Symbol.zig
src/link/Wasm/ZigObject.zig
src/link/aarch64.zig
src/link/riscv.zig
src/link/table_section.zig
2 changes: 1 addition & 1 deletion lib/std/Build/Step/CheckObject.zig
@@ -2682,7 +2682,7 @@ const WasmDumper = struct {
else => unreachable,
}
const end_opcode = try std.leb.readUleb128(u8, reader);
if (end_opcode != std.wasm.opcode(.end)) {
if (end_opcode != @intFromEnum(std.wasm.Opcode.end)) {
return step.fail("expected 'end' opcode in init expression", .{});
}
}
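
The hunk above replaces the `std.wasm.opcode` helper with a direct `@intFromEnum` call. A minimal sketch of the equivalence, assuming a Zig version where `std.wasm.Opcode` and `std.wasm.Valtype` are exposed as in this diff; the expected byte values are the ones checked by the tests this PR deletes from lib/std/wasm.zig:

```zig
const std = @import("std");

test "enum-to-integer without the helper functions" {
    // Previously written as std.wasm.opcode(.end)
    const end: u8 = @intFromEnum(std.wasm.Opcode.end);
    try std.testing.expectEqual(@as(u8, 0x0B), end);

    // Previously written as std.wasm.valtype(.i32)
    const i32_byte: u8 = @intFromEnum(std.wasm.Valtype.i32);
    try std.testing.expectEqual(@as(u8, 0x7F), i32_byte);
}
```

The builtin already returns the enum's tag type (`u8` here), so the one-line wrappers added no type safety of their own.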
6 changes: 6 additions & 0 deletions lib/std/Target.zig
@@ -1219,6 +1219,12 @@ pub const Cpu = struct {
} else true;
}

pub fn count(set: Set) std.math.IntFittingRange(0, needed_bit_count) {
var sum: usize = 0;
for (set.ints) |x| sum += @popCount(x);
return @intCast(sum);
}

pub fn isEnabled(set: Set, arch_feature_index: Index) bool {
const usize_index = arch_feature_index / @bitSizeOf(usize);
const bit_index: ShiftInt = @intCast(arch_feature_index % @bitSizeOf(usize));
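
The new `count` function sums the popcount of each backing word and narrows the result to the smallest integer type that can hold every possible count. A rough standalone sketch of the same pattern follows; `TinySet` and `bit_count` are hypothetical stand-ins for illustration and are not part of `std.Target`:

```zig
const std = @import("std");

/// Hypothetical stand-in for `Cpu.Feature.Set`: a bit set stored as machine words.
const TinySet = struct {
    ints: [2]usize,

    /// Total number of bits the set can hold.
    const bit_count = 2 * @bitSizeOf(usize);

    /// Same shape as the `count` in the diff: add up the per-word popcounts,
    /// then narrow to an integer type that fits the range 0..bit_count.
    fn count(set: TinySet) std.math.IntFittingRange(0, bit_count) {
        var sum: usize = 0;
        for (set.ints) |x| sum += @popCount(x);
        return @intCast(sum);
    }
};

test "count sums the set bits of every word" {
    const set: TinySet = .{ .ints = .{ 0b1011, 0b1 } };
    try std.testing.expect(set.count() == 4);
}
```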
13 changes: 8 additions & 5 deletions lib/std/Thread.zig
@@ -1018,12 +1018,15 @@ const WasiThreadImpl = struct {
return .{ .thread = &instance.thread };
}

/// Bootstrap procedure, called by the host environment after thread creation.
export fn wasi_thread_start(tid: i32, arg: *Instance) void {
if (builtin.single_threaded) {
// ensure function is not analyzed in single-threaded mode
return;
comptime {
if (!builtin.single_threaded) {
@export(wasi_thread_start, .{ .name = "wasi_thread_start" });
}
}

/// Called by the host environment after thread creation.
fn wasi_thread_start(tid: i32, arg: *Instance) callconv(.c) void {
comptime assert(!builtin.single_threaded);
__set_stack_pointer(arg.thread.memory.ptr + arg.stack_offset);
__wasm_init_tls(arg.thread.memory.ptr + arg.tls_offset);
@atomicStore(u32, &WasiThreadImpl.tls_thread_id, @intCast(tid), .seq_cst);
5 changes: 4 additions & 1 deletion lib/std/array_hash_map.zig
@@ -641,10 +641,13 @@ pub fn ArrayHashMapUnmanaged(
return self;
}

/// An empty `value_list` may be passed, in which case the values array becomes `undefined`.
pub fn reinit(self: *Self, gpa: Allocator, key_list: []const K, value_list: []const V) Oom!void {
try self.entries.resize(gpa, key_list.len);
@memcpy(self.keys(), key_list);
if (@sizeOf(V) != 0) {
if (value_list.len == 0) {
@memset(self.values(), undefined);
} else {
assert(key_list.len == value_list.len);
@memcpy(self.values(), value_list);
}
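
With this change, `reinit` accepts an empty `value_list` and leaves the values array undefined for the caller to fill in. A hedged usage sketch, assuming a standard library that already includes this hunk and the `.empty` declaration on `AutoArrayHashMapUnmanaged`; on an older standard library the length assertion would fire instead:

```zig
const std = @import("std");

test "reinit with keys only, values filled in afterwards" {
    const gpa = std.testing.allocator;

    var map: std.AutoArrayHashMapUnmanaged(u32, u32) = .empty;
    defer map.deinit(gpa);

    // Empty value list: the values array is sized to match the keys
    // but its contents are undefined.
    try map.reinit(gpa, &.{ 10, 20, 30 }, &.{});

    // Every value must be written before it is read.
    for (map.keys(), map.values()) |key, *value| value.* = key * 2;

    try std.testing.expectEqual(@as(?u32, 40), map.get(20));
}
```

This avoids building a throwaway values slice when the values are computed from the keys after the map has been populated.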
6 changes: 2 additions & 4 deletions lib/std/array_list.zig
@@ -267,8 +267,7 @@ pub fn ArrayListAligned(comptime T: type, comptime alignment: ?u29) type {
/// Never invalidates element pointers.
/// Asserts that the list can hold one additional item.
pub fn appendAssumeCapacity(self: *Self, item: T) void {
const new_item_ptr = self.addOneAssumeCapacity();
new_item_ptr.* = item;
self.addOneAssumeCapacity().* = item;
}

/// Remove the element at index `i`, shift elements after index
@@ -879,8 +878,7 @@ pub fn ArrayListAlignedUnmanaged(comptime T: type, comptime alignment: ?u29) typ
/// Never invalidates element pointers.
/// Asserts that the list can hold one additional item.
pub fn appendAssumeCapacity(self: *Self, item: T) void {
const new_item_ptr = self.addOneAssumeCapacity();
new_item_ptr.* = item;
self.addOneAssumeCapacity().* = item;
}

/// Remove the element at index `i` from the list and return its value.
12 changes: 0 additions & 12 deletions lib/std/io.zig
@@ -16,10 +16,6 @@ const Allocator = std.mem.Allocator;

fn getStdOutHandle() posix.fd_t {
if (is_windows) {
if (builtin.zig_backend == .stage2_aarch64) {
// TODO: this is just a temporary workaround until we advance aarch64 backend further along.
return windows.GetStdHandle(windows.STD_OUTPUT_HANDLE) catch windows.INVALID_HANDLE_VALUE;
}
return windows.peb().ProcessParameters.hStdOutput;
}

@@ -36,10 +32,6 @@ pub fn getStdOut() File {

fn getStdErrHandle() posix.fd_t {
if (is_windows) {
if (builtin.zig_backend == .stage2_aarch64) {
// TODO: this is just a temporary workaround until we advance aarch64 backend further along.
return windows.GetStdHandle(windows.STD_ERROR_HANDLE) catch windows.INVALID_HANDLE_VALUE;
}
return windows.peb().ProcessParameters.hStdError;
}

@@ -56,10 +48,6 @@ pub fn getStdErr() File {

fn getStdInHandle() posix.fd_t {
if (is_windows) {
if (builtin.zig_backend == .stage2_aarch64) {
// TODO: this is just a temporary workaround until we advance aarch64 backend further along.
return windows.GetStdHandle(windows.STD_INPUT_HANDLE) catch windows.INVALID_HANDLE_VALUE;
}
return windows.peb().ProcessParameters.hStdInput;
}

184 changes: 5 additions & 179 deletions lib/std/wasm.zig
@@ -4,8 +4,6 @@
const std = @import("std.zig");
const testing = std.testing;

// TODO: Add support for multi-byte ops (e.g. table operations)

/// Wasm instruction opcodes
///
/// All instructions are defined as per spec:
@@ -195,27 +193,6 @@ pub const Opcode = enum(u8) {
_,
};

/// Returns the integer value of an `Opcode`. Used by the Zig compiler
/// to write instructions to the wasm binary file
pub fn opcode(op: Opcode) u8 {
return @intFromEnum(op);
}

test "opcodes" {
// Ensure our opcodes values remain intact as certain values are skipped due to them being reserved
const i32_const = opcode(.i32_const);
const end = opcode(.end);
const drop = opcode(.drop);
const local_get = opcode(.local_get);
const i64_extend32_s = opcode(.i64_extend32_s);

try testing.expectEqual(@as(u16, 0x41), i32_const);
try testing.expectEqual(@as(u16, 0x0B), end);
try testing.expectEqual(@as(u16, 0x1A), drop);
try testing.expectEqual(@as(u16, 0x20), local_get);
try testing.expectEqual(@as(u16, 0xC4), i64_extend32_s);
}

/// Opcodes that require a prefix `0xFC`.
/// Each opcode represents a varuint32, meaning
/// they are encoded as leb128 in binary.
@@ -241,12 +218,6 @@ pub const MiscOpcode = enum(u32) {
_,
};

/// Returns the integer value of an `MiscOpcode`. Used by the Zig compiler
/// to write instructions to the wasm binary file
pub fn miscOpcode(op: MiscOpcode) u32 {
return @intFromEnum(op);
}

/// Simd opcodes that require a prefix `0xFD`.
/// Each opcode represents a varuint32, meaning
/// they are encoded as leb128 in binary.
@@ -512,12 +483,6 @@ pub const SimdOpcode = enum(u32) {
f32x4_relaxed_dot_bf16x8_add_f32x4 = 0x114,
};

/// Returns the integer value of an `SimdOpcode`. Used by the Zig compiler
/// to write instructions to the wasm binary file
pub fn simdOpcode(op: SimdOpcode) u32 {
return @intFromEnum(op);
}

/// Atomic opcodes that require a prefix `0xFE`.
/// Each opcode represents a varuint32, meaning
/// they are encoded as leb128 in binary.
@@ -592,12 +557,6 @@ pub const AtomicsOpcode = enum(u32) {
i64_atomic_rmw32_cmpxchg_u = 0x4E,
};

/// Returns the integer value of an `AtomicsOpcode`. Used by the Zig compiler
/// to write instructions to the wasm binary file
pub fn atomicsOpcode(op: AtomicsOpcode) u32 {
return @intFromEnum(op);
}

/// Enum representing all Wasm value types as per spec:
/// https://webassembly.github.io/spec/core/binary/types.html
pub const Valtype = enum(u8) {
@@ -608,53 +567,24 @@ pub const Valtype = enum(u8) {
v128 = 0x7B,
};

/// Returns the integer value of a `Valtype`
pub fn valtype(value: Valtype) u8 {
return @intFromEnum(value);
}

/// Reference types, where the funcref references to a function regardless of its type
/// and ref references an object from the embedder.
pub const RefType = enum(u8) {
funcref = 0x70,
externref = 0x6F,
};

/// Returns the integer value of a `Reftype`
pub fn reftype(value: RefType) u8 {
return @intFromEnum(value);
}

test "valtypes" {
const _i32 = valtype(.i32);
const _i64 = valtype(.i64);
const _f32 = valtype(.f32);
const _f64 = valtype(.f64);

try testing.expectEqual(@as(u8, 0x7F), _i32);
try testing.expectEqual(@as(u8, 0x7E), _i64);
try testing.expectEqual(@as(u8, 0x7D), _f32);
try testing.expectEqual(@as(u8, 0x7C), _f64);
}

/// Limits classify the size range of resizeable storage associated with memory types and table types.
pub const Limits = struct {
flags: u8,
flags: Flags,
min: u32,
max: u32,

pub const Flags = enum(u8) {
WASM_LIMITS_FLAG_HAS_MAX = 0x1,
WASM_LIMITS_FLAG_IS_SHARED = 0x2,
pub const Flags = packed struct(u8) {
has_max: bool,
is_shared: bool,
reserved: u6 = 0,
};

pub fn hasFlag(limits: Limits, flag: Flags) bool {
return limits.flags & @intFromEnum(flag) != 0;
}

pub fn setFlag(limits: *Limits, flag: Flags) void {
limits.flags |= @intFromEnum(flag);
}
};

/// Initialization expressions are used to set the initial value on an object
@@ -667,18 +597,6 @@ pub const InitExpression = union(enum) {
global_get: u32,
};

/// Represents a function entry, holding the index to its type
pub const Func = struct {
type_index: u32,
};

/// Tables are used to hold pointers to opaque objects.
/// This can either by any function, or an object from the host.
pub const Table = struct {
limits: Limits,
reftype: RefType,
};

/// Describes the layout of the memory where `min` represents
/// the minimal amount of pages, and the optional `max` represents
/// the max pages. When `null` will allow the host to determine the
@@ -687,88 +605,6 @@ pub const Memory = struct {
limits: Limits,
};

/// Represents the type of a `Global` or an imported global.
pub const GlobalType = struct {
valtype: Valtype,
mutable: bool,
};

pub const Global = struct {
global_type: GlobalType,
init: InitExpression,
};

/// Notates an object to be exported from wasm
/// to the host.
pub const Export = struct {
name: []const u8,
kind: ExternalKind,
index: u32,
};

/// Element describes the layout of the table that can
/// be found at `table_index`
pub const Element = struct {
table_index: u32,
offset: InitExpression,
func_indexes: []const u32,
};

/// Imports are used to import objects from the host
pub const Import = struct {
module_name: []const u8,
name: []const u8,
kind: Kind,

pub const Kind = union(ExternalKind) {
function: u32,
table: Table,
memory: Limits,
global: GlobalType,
};
};

/// `Type` represents a function signature type containing both
/// a slice of parameters as well as a slice of return values.
pub const Type = struct {
params: []const Valtype,
returns: []const Valtype,

pub fn format(self: Type, comptime fmt: []const u8, opt: std.fmt.FormatOptions, writer: anytype) !void {
if (fmt.len != 0) std.fmt.invalidFmtError(fmt, self);
_ = opt;
try writer.writeByte('(');
for (self.params, 0..) |param, i| {
try writer.print("{s}", .{@tagName(param)});
if (i + 1 != self.params.len) {
try writer.writeAll(", ");
}
}
try writer.writeAll(") -> ");
if (self.returns.len == 0) {
try writer.writeAll("nil");
} else {
for (self.returns, 0..) |return_ty, i| {
try writer.print("{s}", .{@tagName(return_ty)});
if (i + 1 != self.returns.len) {
try writer.writeAll(", ");
}
}
}
}

pub fn eql(self: Type, other: Type) bool {
return std.mem.eql(Valtype, self.params, other.params) and
std.mem.eql(Valtype, self.returns, other.returns);
}

pub fn deinit(self: *Type, gpa: std.mem.Allocator) void {
gpa.free(self.params);
gpa.free(self.returns);
self.* = undefined;
}
};

/// Wasm module sections as per spec:
/// https://webassembly.github.io/spec/core/binary/modules.html
pub const Section = enum(u8) {
@@ -788,11 +624,6 @@ pub const Section = enum(u8) {
_,
};

/// Returns the integer value of a given `Section`
pub fn section(val: Section) u8 {
return @intFromEnum(val);
}

/// The kind of the type when importing or exporting to/from the host environment.
/// https://webassembly.github.io/spec/core/syntax/modules.html
pub const ExternalKind = enum(u8) {
@@ -802,11 +633,6 @@ pub const ExternalKind = enum(u8) {
global,
};

/// Returns the integer value of a given `ExternalKind`
pub fn externalKind(val: ExternalKind) u8 {
return @intFromEnum(val);
}

/// Defines the enum values for each subsection id for the "Names" custom section
/// as described by:
/// https://webassembly.github.io/spec/core/appendix/custom.html?highlight=name#name-section
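
In the lib/std/wasm.zig diff above, `Limits.Flags` changes from an `enum(u8)` of bit masks with `hasFlag`/`setFlag` helpers to a `packed struct(u8)` with named boolean fields. A minimal sketch of how the new layout maps onto the old mask values (0x1 and 0x2); the `Flags` type below is a local copy made for illustration, not an import from `std.wasm`:

```zig
const std = @import("std");

/// Local copy of the new `Limits.Flags` layout shown in the diff.
const Flags = packed struct(u8) {
    has_max: bool,
    is_shared: bool,
    reserved: u6 = 0,
};

test "packed struct fields line up with the old bit masks" {
    // Fields are packed starting at the least significant bit, so
    // has_max is bit 0 (0x1) and is_shared is bit 1 (0x2).
    const only_max: Flags = .{ .has_max = true, .is_shared = false };
    try std.testing.expectEqual(@as(u8, 0x1), @as(u8, @bitCast(only_max)));

    // Decoding a raw byte and reading a field replaces the removed hasFlag helper.
    const both: Flags = @bitCast(@as(u8, 0x3));
    try std.testing.expect(both.has_max and both.is_shared);

    // Assigning a field replaces the removed setFlag helper.
    var flags: Flags = .{ .has_max = false, .is_shared = false };
    flags.is_shared = true;
    try std.testing.expectEqual(@as(u8, 0x2), @as(u8, @bitCast(flags)));
}
```

Because the struct is backed by a `u8`, it can still be written to the binary as a single byte via `@bitCast`, while call sites read and write flags by name.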