add shared library support (#429)
* add shared library support

This adds support for building WASI shared libraries per
https://github.com/WebAssembly/tool-conventions/blob/main/DynamicLinking.md.

For the time being, the goal is to allow "pseudo-dynamic" linking using the
Component Model per
https://github.com/WebAssembly/component-model/blob/main/design/mvp/examples/SharedEverythingDynamicLinking.md.
This requires all libraries to be available when the component is created, but
still allows runtime symbol resolution via `dlopen`/`dlsym` backed by a static
lookup table.  This is sufficient to support Python native extensions, for
example.  A complete demo using `wit-component` is available at
https://github.com/dicej/component-linking-demo.

This commit adds support for building `libc.so`, `libc++.so`, and `libc++abi.so`
alongside their static counterparts.

Notes:

- I had to refactor `errno` support a bit to avoid a spurious `_ZTH5errno` (AKA "thread-local initialization routine for errno") import in `libc++.so`.
- Long double print and scan are included by default in `libc.so` rather than in a separate library.
- `__main_argc_argv` is now a weak symbol since it's not relevant for reactors.
- `dlopen`/`dlsym` rely on a lookup table provided by the "dynamic" linker via `__wasm_set_libraries`.  Not all flags are supported yet, and unrecognized flags will result in an error.
- This requires https://reviews.llvm.org/D153293, which we will need to backport to LLVM 16 until 17 is released.  I'll open a `wasi-sdk` PR with that change and various Makefile tweaks to support shared libraries.
- `libc.so` is temporarily disabled for the `wasi-threads` build until someone can make `wasi_thread_start.s` position-independent.

Signed-off-by: Joel Dice <[email protected]>
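
As a rough illustration of the `dlopen`/`dlsym` note above, the following C sketch resolves symbols at run time from a table that is fixed when the program is linked. It is not the implementation from this PR (`dl.c` is removed again in a later commit of this series), and every name in it (`demo_set_libraries`, `demo_dlopen`, `demo_dlsym`, the struct layouts, and the sample `libfoo.so` table) is hypothetical; the real `__wasm_set_libraries` interface may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table layout; NOT the ABI used by the real "dynamic" linker. */
struct demo_symbol {
    const char *name;
    void *address;
};

struct demo_library {
    const char *name;
    const struct demo_symbol *symbols;
    size_t symbol_count;
};

/* Table installed once at startup by whoever assembled the program. */
static const struct demo_library *demo_libraries;
static size_t demo_library_count;

void demo_set_libraries(const struct demo_library *libraries, size_t count) {
    assert(libraries != NULL || count == 0);
    demo_libraries = libraries;
    demo_library_count = count;
}

/* "Opening" a library is only a name lookup: nothing is loaded at run time. */
const struct demo_library *demo_dlopen(const char *name) {
    for (size_t i = 0; i < demo_library_count; i++) {
        if (strcmp(demo_libraries[i].name, name) == 0)
            return &demo_libraries[i];
    }
    return NULL;
}

/* Linear search of the library's symbol list; returns NULL if absent. */
void *demo_dlsym(const struct demo_library *library, const char *symbol) {
    for (size_t i = 0; i < library->symbol_count; i++) {
        if (strcmp(library->symbols[i].name, symbol) == 0)
            return library->symbols[i].address;
    }
    return NULL;
}

/* Sample data standing in for what a linker would generate. */
static int demo_answer = 42;
static const struct demo_symbol demo_libfoo_symbols[] = {
    { "answer", &demo_answer },
};
const struct demo_library demo_sample_libraries[] = {
    { "libfoo.so", demo_libfoo_symbols, 1 },
};
```

After `demo_set_libraries(demo_sample_libraries, 1)`, a call to `demo_dlopen("libfoo.so")` returns a handle that `demo_dlsym` can search, while a library missing from the table yields `NULL`. That is the sense in which all libraries must be known when the component is created.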

build `-fPIC` .o files separately from non-`-fPIC` ones

This allows us to build both libc.so and libc.a without incurring indirection
penalties in the latter.

Signed-off-by: Joel Dice <[email protected]>

only build libc.so when explicitly requested

Shared library support in LLVM for non-Emscripten Wasm targets will be added in
version 17, which has not yet been released, so we should not attempt to build
libc.so by default (at least not yet).

Signed-off-by: Joel Dice <[email protected]>

remove dl.c

I'll open a separate PR for this later.

Signed-off-by: Joel Dice <[email protected]>

update `check-symbols` files

Signed-off-by: Joel Dice <[email protected]>

* generate separate .so files for emulated features

Signed-off-by: Joel Dice <[email protected]>

* revert errno changes in favor of a smaller change

@yamt pointed out there's an easier way to address the `_ZTH5errno` issue I
described in an earlier commit: use `_Thread_local` for both C and C++.  This
gives us a simpler ABI and avoids needing to import a thread-local initializer
for `errno` in libc++.so.

Signed-off-by: Joel Dice <[email protected]>
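
To make the `_Thread_local` point concrete, here is a small C sketch of an `errno`-like variable. The names `demo_errno`, `demo_fail`, and `demo_last_error` are invented for illustration and are not part of wasi-libc. The relevant property is that a `_Thread_local` object must have a constant initializer, so users of the variable never need a separate thread-local initialization routine.

```c
#include <assert.h>

/* Hypothetical stand-in for errno: one independent copy per thread,
 * always constant-initialized (here to zero). */
_Thread_local int demo_errno = 0;

/* Record an error code for the calling thread and report failure. */
int demo_fail(int code) {
    assert(code != 0); /* zero is reserved for "no error" in this sketch */
    demo_errno = code;
    return -1;
}

/* Read back the calling thread's most recent error code. */
int demo_last_error(void) {
    return demo_errno;
}
```

Calling `demo_fail(5)` and then `demo_last_error()` on the same thread yields `5`; another thread would still see its own value.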

* remove redundant `$(OBJDIR)/%.long-double.pic.o` rule in Makefile

Signed-off-by: Joel Dice <[email protected]>

* consolidate libwasi-emulated-*.so into a single library

Signed-off-by: Joel Dice <[email protected]>

* add comment explaining use of `--whole-archive`

Signed-off-by: Joel Dice <[email protected]>

* Revert "remove redundant `$(OBJDIR)/%.long-double.pic.o` rule in Makefile"

This reverts commit dbe2cb1.

* move `__main_void` from __main_void.c to crt1-command.c

This and `__main_argc_argv` are only relevant for commands (not reactors), so it
makes sense to scope them accordingly.  In addition, the latter was being
imported from libc.so, forcing applications to provide it even if it wasn't
relevant.

Signed-off-by: Joel Dice <[email protected]>

* Revert "consolidate libwasi-emulated-*.so into a single library"

This reverts commit c651822.

* build crt1-*.o with `-fPIC`

This ensures they can be used in a PIE or PIC context.

Signed-off-by: Joel Dice <[email protected]>

* ignore `__memory_base` when checking undefined symbols

Whether this symbol appears varies between LLVM versions.

Signed-off-by: Joel Dice <[email protected]>

* Revert "move `__main_void` from __main_void.c to crt1-command.c"

This reverts commit f303835.

* add explanatory comments to __main_void.c

Signed-off-by: Joel Dice <[email protected]>

* add `__wasilibc_unmodified_upstream` and comment to `__lctrans_cur`

Signed-off-by: Joel Dice <[email protected]>

---------

Signed-off-by: Joel Dice <[email protected]>
dicej authored Sep 28, 2023
1 parent 7b4705f commit d4dae89
Showing 6 changed files with 98 additions and 16 deletions.
88 changes: 79 additions & 9 deletions Makefile
@@ -472,6 +472,52 @@ endif

default: finish

LIBC_SO_OBJS = $(patsubst %.o,%.pic.o,$(filter-out $(MUSL_PRINTSCAN_OBJS),$(LIBC_OBJS)))
MUSL_PRINTSCAN_LONG_DOUBLE_SO_OBJS = $(patsubst %.o,%.pic.o,$(MUSL_PRINTSCAN_LONG_DOUBLE_OBJS))
LIBWASI_EMULATED_MMAN_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBWASI_EMULATED_MMAN_OBJS))
LIBWASI_EMULATED_PROCESS_CLOCKS_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBWASI_EMULATED_PROCESS_CLOCKS_OBJS))
LIBWASI_EMULATED_GETPID_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBWASI_EMULATED_GETPID_OBJS))
LIBWASI_EMULATED_SIGNAL_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBWASI_EMULATED_SIGNAL_OBJS))
LIBWASI_EMULATED_SIGNAL_MUSL_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBWASI_EMULATED_SIGNAL_MUSL_OBJS))
BULK_MEMORY_SO_OBJS = $(patsubst %.o,%.pic.o,$(BULK_MEMORY_OBJS))
DLMALLOC_SO_OBJS = $(patsubst %.o,%.pic.o,$(DLMALLOC_OBJS))
LIBC_BOTTOM_HALF_ALL_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBC_BOTTOM_HALF_ALL_OBJS))
LIBC_TOP_HALF_ALL_SO_OBJS = $(patsubst %.o,%.pic.o,$(LIBC_TOP_HALF_ALL_OBJS))

PIC_OBJS = \
$(LIBC_SO_OBJS) \
$(MUSL_PRINTSCAN_LONG_DOUBLE_SO_OBJS) \
$(LIBWASI_EMULATED_MMAN_SO_OBJS) \
$(LIBWASI_EMULATED_PROCESS_CLOCKS_SO_OBJS) \
$(LIBWASI_EMULATED_GETPID_SO_OBJS) \
$(LIBWASI_EMULATED_SIGNAL_SO_OBJS) \
$(LIBWASI_EMULATED_SIGNAL_MUSL_SO_OBJS) \
$(BULK_MEMORY_SO_OBJS) \
$(DLMALLOC_SO_OBJS) \
$(LIBC_BOTTOM_HALF_ALL_SO_OBJS) \
$(LIBC_TOP_HALF_ALL_SO_OBJS) \
$(LIBC_BOTTOM_HALF_CRT_OBJS)

# TODO: Specify SDK version, e.g. libc.so.wasi-sdk-21, as SO_NAME once `wasm-ld`
# supports it.
#
# Note that we collect the object files for each shared library into a .a and
# link that using `--whole-archive` rather than pass the object files directly
# to CC. This is a workaround for a Windows command line size limitation. See
# the `%.a` rule below for details.
$(SYSROOT_LIB)/%.so: $(OBJDIR)/%.so.a $(BUILTINS_LIB)
$(CC) -nostdlib -shared -o $@ -Wl,--whole-archive $< -Wl,--no-whole-archive $(BUILTINS_LIB)

$(OBJDIR)/libc.so.a: $(LIBC_SO_OBJS) $(MUSL_PRINTSCAN_LONG_DOUBLE_SO_OBJS)

$(OBJDIR)/libwasi-emulated-mman.so.a: $(LIBWASI_EMULATED_MMAN_SO_OBJS)

$(OBJDIR)/libwasi-emulated-process-clocks.so.a: $(LIBWASI_EMULATED_PROCESS_CLOCKS_SO_OBJS)

$(OBJDIR)/libwasi-emulated-getpid.so.a: $(LIBWASI_EMULATED_GETPID_SO_OBJS)

$(OBJDIR)/libwasi-emulated-signal.so.a: $(LIBWASI_EMULATED_SIGNAL_SO_OBJS) $(LIBWASI_EMULATED_SIGNAL_MUSL_SO_OBJS)

$(SYSROOT_LIB)/libc.a: $(LIBC_OBJS)

$(SYSROOT_LIB)/libc-printscan-long-double.a: $(MUSL_PRINTSCAN_LONG_DOUBLE_OBJS)
@@ -497,6 +543,8 @@ $(SYSROOT_LIB)/libwasi-emulated-signal.a: $(LIBWASI_EMULATED_SIGNAL_OBJS) $(LIBW
# silently dropping the tail.
$(AR) crs $@ $(wordlist 800, 100000, $(sort $^))

$(PIC_OBJS): CFLAGS += -fPIC -fvisibility=default

$(MUSL_PRINTSCAN_OBJS): CFLAGS += \
-D__wasilibc_printscan_no_long_double \
-D__wasilibc_printscan_full_support_option="\"add -lc-printscan-long-double to the link command\""
@@ -507,15 +555,23 @@ $(MUSL_PRINTSCAN_NO_FLOATING_POINT_OBJS): CFLAGS += \

# TODO: apply -mbulk-memory globally, once
# https://github.com/llvm/llvm-project/issues/52618 is resolved
-$(BULK_MEMORY_OBJS): CFLAGS += \
+$(BULK_MEMORY_OBJS) $(BULK_MEMORY_SO_OBJS): CFLAGS += \
-mbulk-memory

-$(BULK_MEMORY_OBJS): CFLAGS += \
+$(BULK_MEMORY_OBJS) $(BULK_MEMORY_SO_OBJS): CFLAGS += \
-DBULK_MEMORY_THRESHOLD=$(BULK_MEMORY_THRESHOLD)

-$(LIBWASI_EMULATED_SIGNAL_MUSL_OBJS): CFLAGS += \
+$(LIBWASI_EMULATED_SIGNAL_MUSL_OBJS) $(LIBWASI_EMULATED_SIGNAL_MUSL_SO_OBJS): CFLAGS += \
-D_WASI_EMULATED_SIGNAL

$(OBJDIR)/%.long-double.pic.o: %.c include_dirs
@mkdir -p "$(@D)"
$(CC) $(CFLAGS) -MD -MP -o $@ -c $<

$(OBJDIR)/%.pic.o: %.c include_dirs
@mkdir -p "$(@D)"
$(CC) $(CFLAGS) -MD -MP -o $@ -c $<

$(OBJDIR)/%.long-double.o: %.c include_dirs
@mkdir -p "$(@D)"
$(CC) $(CFLAGS) -MD -MP -o $@ -c $<
@@ -534,17 +590,17 @@ $(OBJDIR)/%.o: %.s include_dirs

-include $(shell find $(OBJDIR) -name \*.d)

-$(DLMALLOC_OBJS): CFLAGS += \
+$(DLMALLOC_OBJS) $(DLMALLOC_SO_OBJS): CFLAGS += \
-I$(DLMALLOC_INC)

-startup_files $(LIBC_BOTTOM_HALF_ALL_OBJS): CFLAGS += \
+startup_files $(LIBC_BOTTOM_HALF_ALL_OBJS) $(LIBC_BOTTOM_HALF_ALL_SO_OBJS): CFLAGS += \
-I$(LIBC_BOTTOM_HALF_HEADERS_PRIVATE) \
-I$(LIBC_BOTTOM_HALF_CLOUDLIBC_SRC_INC) \
-I$(LIBC_BOTTOM_HALF_CLOUDLIBC_SRC) \
-I$(LIBC_TOP_HALF_MUSL_SRC_DIR)/include \
-I$(LIBC_TOP_HALF_MUSL_SRC_DIR)/internal

-$(LIBC_TOP_HALF_ALL_OBJS) $(MUSL_PRINTSCAN_LONG_DOUBLE_OBJS) $(MUSL_PRINTSCAN_NO_FLOATING_POINT_OBJS) $(LIBWASI_EMULATED_SIGNAL_MUSL_OBJS): CFLAGS += \
+$(LIBC_TOP_HALF_ALL_OBJS) $(LIBC_TOP_HALF_ALL_SO_OBJS) $(MUSL_PRINTSCAN_LONG_DOUBLE_OBJS) $(MUSL_PRINTSCAN_LONG_DOUBLE_SO_OBJS) $(MUSL_PRINTSCAN_NO_FLOATING_POINT_OBJS) $(LIBWASI_EMULATED_SIGNAL_MUSL_OBJS) $(LIBWASI_EMULATED_SIGNAL_MUSL_SO_OBJS): CFLAGS += \
-I$(LIBC_TOP_HALF_MUSL_SRC_DIR)/include \
-I$(LIBC_TOP_HALF_MUSL_SRC_DIR)/internal \
-I$(LIBC_TOP_HALF_MUSL_DIR)/arch/wasm32 \
@@ -558,7 +614,7 @@ $(LIBC_TOP_HALF_ALL_OBJS) $(MUSL_PRINTSCAN_LONG_DOUBLE_OBJS) $(MUSL_PRINTSCAN_NO
-Wno-dangling-else \
-Wno-unknown-pragmas

-$(LIBWASI_EMULATED_PROCESS_CLOCKS_OBJS): CFLAGS += \
+$(LIBWASI_EMULATED_PROCESS_CLOCKS_OBJS) $(LIBWASI_EMULATED_PROCESS_CLOCKS_SO_OBJS): CFLAGS += \
-I$(LIBC_BOTTOM_HALF_CLOUDLIBC_SRC)

# emmalloc uses a lot of pointer type-punning, which is UB under strict aliasing,
@@ -596,6 +652,20 @@ startup_files: include_dirs $(LIBC_BOTTOM_HALF_CRT_OBJS)
mkdir -p "$(SYSROOT_LIB)" && \
cp $(LIBC_BOTTOM_HALF_CRT_OBJS) "$(SYSROOT_LIB)"

# TODO: As of this writing, wasi_thread_start.s uses non-position-independent
# code, and I'm not sure how to make it position-independent. Once we've done
# that, we can enable libc.so for the wasi-threads build.
ifneq ($(THREAD_MODEL), posix)
LIBC_SO = \
$(SYSROOT_LIB)/libc.so \
$(SYSROOT_LIB)/libwasi-emulated-mman.so \
$(SYSROOT_LIB)/libwasi-emulated-process-clocks.so \
$(SYSROOT_LIB)/libwasi-emulated-getpid.so \
$(SYSROOT_LIB)/libwasi-emulated-signal.so
endif

libc_so: include_dirs $(LIBC_SO)

libc: include_dirs \
$(SYSROOT_LIB)/libc.a \
$(SYSROOT_LIB)/libc-printscan-long-double.a \
@@ -645,7 +715,7 @@ check-symbols: startup_files libc
for undef_sym in $$("$(NM)" --undefined-only "$(SYSROOT_LIB)"/libc.a "$(SYSROOT_LIB)"/libc-*.a "$(SYSROOT_LIB)"/*.o \
|grep ' U ' |sed 's/.* U //' |LC_ALL=C sort |uniq); do \
grep -q '\<'$$undef_sym'\>' "$(DEFINED_SYMBOLS)" || echo $$undef_sym; \
-done | grep -v "^__mul" > "$(UNDEFINED_SYMBOLS)"
+done | grep -E -v "^__mul|__memory_base" > "$(UNDEFINED_SYMBOLS)"
grep '^_*imported_wasi_' "$(UNDEFINED_SYMBOLS)" \
> "$(SYSROOT_LIB)/libc.imports"

@@ -728,4 +798,4 @@ clean:
$(RM) -r "$(OBJDIR)"
$(RM) -r "$(SYSROOT)"

-.PHONY: default startup_files libc finish install include_dirs clean
+.PHONY: default startup_files libc libc_so finish install include_dirs clean check-symbols
1 change: 0 additions & 1 deletion expected/wasm32-wasi-threads/undefined-symbols.txt
@@ -62,7 +62,6 @@ __imported_wasi_snapshot_preview1_sock_shutdown
__imported_wasi_thread_spawn
__letf2
__lttf2
-__main_argc_argv
__netf2
__stack_pointer
__subtf3
1 change: 0 additions & 1 deletion expected/wasm32-wasi/undefined-symbols.txt
@@ -59,7 +59,6 @@ __imported_wasi_snapshot_preview1_sock_send
__imported_wasi_snapshot_preview1_sock_shutdown
__letf2
__lttf2
-__main_argc_argv
__netf2
__stack_pointer
__subtf3
5 changes: 0 additions & 5 deletions libc-bottom-half/headers/public/__errno.h
@@ -5,16 +5,11 @@
extern "C" {
#endif

-#ifdef __cplusplus
-extern thread_local int errno;
-#else
extern _Thread_local int errno;
-#endif

#define errno errno

#ifdef __cplusplus
}
#endif

#endif
11 changes: 11 additions & 0 deletions libc-bottom-half/sources/__main_void.c
@@ -3,11 +3,22 @@
#include <sysexits.h>

// The user's `main` function, expecting arguments.
//
// Note that we make this a weak symbol so that it will have a
// `WASM_SYM_BINDING_WEAK` flag in libc.so, which tells the dynamic linker that
// it need not be defined (e.g. in reactor-style apps with no main function).
// See also the TODO comment on `__main_void` below.
__attribute__((__weak__))
int __main_argc_argv(int argc, char *argv[]);

// If the user's `main` function expects arguments, the compiler will rename
// it to `__main_argc_argv`, and this version will get linked in, which
// initializes the argument data and calls `__main_argc_argv`.
//
// TODO: Ideally this function would be defined in a crt*.o file and linked in
// as necessary by the Clang driver. However, moving it to crt1-command.c
// breaks `--no-gc-sections`, so we'll probably need to create a new file
// (e.g. crt0.o or crtend.o) and teach Clang to use it when needed.
__attribute__((__weak__, nodebug))
int __main_void(void) {
__wasi_errno_t err;
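
The weak declaration of `__main_argc_argv` in the hunk above relies on a general property of weak references: with GCC or Clang, a function that is declared weak and never defined does not cause a link error, and its address compares equal to a null pointer. The sketch below shows that pattern in isolation. `demo_optional_main` and `demo_start` are hypothetical names, and the null check reflects how this behaves on typical ELF toolchains; the exact handling by a Wasm dynamic linker is assumed to be analogous rather than verified here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical optional entry point: declared weak and intentionally left
 * undefined, so the program still links when nobody provides it. */
__attribute__((__weak__)) int demo_optional_main(int argc, char *argv[]);

/* Call the optional entry point if one was linked in; otherwise return the
 * caller-supplied fallback value. */
int demo_start(int argc, char *argv[], int fallback) {
    assert(argc >= 0);
    if (demo_optional_main != NULL)
        return demo_optional_main(argc, argv);
    return fallback;
}
```

With no definition of `demo_optional_main` in the program, `demo_start(0, NULL, 7)` takes the fallback branch, much as a reactor-style module has no `main` to call.
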
8 changes: 8 additions & 0 deletions libc-top-half/musl/src/internal/locale_impl.h
@@ -28,7 +28,15 @@ extern hidden const struct __locale_struct __c_dot_utf8_locale;
hidden const struct __locale_map *__get_locale(int, const char *);
hidden const char *__mo_lookup(const void *, size_t, const char *);
hidden const char *__lctrans(const char *, const struct __locale_map *);
#ifdef __wasilibc_unmodified_upstream
hidden const char *__lctrans_cur(const char *);
#else
// We make this visible in the wasi-libc build because
// libwasi-emulated-signal.so needs to import it from libc.so. If we ever
// decide to merge libwasi-emulated-signal.so into libc.so, this will no longer
// be necessary.
const char *__lctrans_cur(const char *);
#endif
hidden const char *__lctrans_impl(const char *, const struct __locale_map *);
hidden int __loc_is_allocated(locale_t);
hidden char *__gettextdomain(void);
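
The `hidden` change above concerns symbol visibility, which only matters once code is split across shared libraries. The sketch below assumes `hidden` expands to the GCC/Clang attribute `__visibility__("hidden")`, as it does in upstream musl's internal headers. The function names are made up for illustration.

```c
#include <assert.h>

/* Assumed to match musl's internal definition of `hidden`. */
#define hidden __attribute__((__visibility__("hidden")))

/* Callable from anywhere inside the same library, but not exported from it,
 * so a second shared library could not import this symbol. */
hidden int demo_internal_scale(int x) {
    return x * 2;
}

/* Default visibility: exported, and therefore importable by other libraries. */
int demo_public_scale_plus_one(int x) {
    assert(x >= 0); /* the sketch only handles non-negative inputs */
    return demo_internal_scale(x) + 1;
}
```

Within a single library both functions behave identically. The difference only appears at the library boundary, which is why `__lctrans_cur` has to drop `hidden` for `libwasi-emulated-signal.so` to import it from `libc.so`.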