Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft of WASI Preview 2 support, featuring wasi-sockets #449

Closed
wants to merge 68 commits into from

Conversation

dicej
Copy link
Contributor

@dicej dicej commented Dec 12, 2023

Per #447, this is an (in-progress) implementation of wasi-sockets, including a new wasm32-wasi-preview2 target.

Currently, TCP, UDP, and name resolution are mostly implemented (minus a few sockopt options, e.g. SO_RCVTIMEO).

Note that I've changed SYSROOT_INC from $(SYSROOT)/include to $(SYSROOT)/include/$(TARGET_TRIPLE) since the capabilities of Preview 2 are much wider than Preview 1, and the header files reflect that. Also, wasm32-wasi and wasm32-wasi-threads sharing the same include directory was already awkward (e.g. the Makefile was ambivalent about whether pthread.h should be present or not).

See https://github.com/dicej/wasi-sockets-tests for examples of using wasi-sockets in Rust and Python via their respective standard libraries, both of which are built on top of wasi-libc.

Big thanks to @badeend for doing most of the work here!

dicej and others added 30 commits November 27, 2023 10:24
…re, now that sockets are fully based on preview2
Signed-off-by: Joel Dice <[email protected]>
…tion of wit-bindgen to generate shorter names. The full command used to generate the bindings was:

wit-bindgen c --world imports \
    --rename wasi:clocks/[email protected]=monotonic_clock \
    --rename wasi:clocks/[email protected]=wall_clock \
    --rename wasi:filesystem/[email protected]=filesystem_preopens \
    --rename wasi:filesystem/[email protected]=filesystem \
    --rename wasi:io/[email protected]=io_error \
    --rename wasi:io/[email protected]=poll \
    --rename wasi:io/[email protected]=streams \
    --rename wasi:random/[email protected]=random_insecure_seed \
    --rename wasi:random/[email protected]=random_insecure \
    --rename wasi:random/[email protected]=random \
    --rename wasi:sockets/[email protected]=instance_network \
    --rename wasi:sockets/[email protected]=ip_name_lookup \
    --rename wasi:sockets/[email protected]=network \
    --rename wasi:sockets/[email protected]=tcp_create_socket \
    --rename wasi:sockets/[email protected]=tcp \
    --rename wasi:sockets/[email protected]=udp_create_socket \
    --rename wasi:sockets/[email protected]=udp \
    --rename wasi:cli/[email protected]=environment \
    --rename wasi:cli/[email protected]=exit \
    --rename wasi:cli/[email protected]=stdin \
    --rename wasi:cli/[email protected]=stdout \
    --rename wasi:cli/[email protected]=stderr \
    ./wit
The tag and union always need to be updated in tandem. Having them close together helps to clarify the relationship between the two types.
- On input addresses: check against the expected address family and return EAFNOSUPPORT when mismatch
- On output addresses: perform validations _before_ making the actual syscall.
Signed-off-by: Joel Dice <[email protected]>
@dicej
Copy link
Contributor Author

dicej commented Dec 18, 2023

Is there some way we can make the including of pthread.h dependent on the current target perhaps?

It looks like libc++ has _LIBCPP_HAS_THREAD_API_PTHREAD .. can that depend on the target triple?

Yeah, that was my initial thought when starting this project, but when I ran e.g. clang --target=wasm32-wasi-preview2 -dM -E -x c /dev/null I couldn't find any preprocessor definitions that told me which triple I was building for, so I abandoned that approach. Did I miss something? Or are you thinking we should add something to clang?

@sbc100
Copy link
Member

sbc100 commented Dec 18, 2023

Is there some way we can make the including of pthread.h dependent on the current target perhaps?
It looks like libc++ has _LIBCPP_HAS_THREAD_API_PTHREAD .. can that depend on the target triple?

Yeah, that was my initial thought when starting this project, but when I ran e.g. clang --target=wasm32-wasi-preview2 -dM -E -x c /dev/null I couldn't find any preprocessor definitions that told me which triple I was building for, so I abandoned that approach. Did I miss something? Or are you thinking we should add something to clang?

I though we were talking about pthread.h? Does clang --target=wasm32-wasi-threads also not have a way to detect it?

@dicej
Copy link
Contributor Author

dicej commented Dec 18, 2023

I though we were talking about pthread.h? Does clang --target=wasm32-wasi-threads also not have a way to detect it?

Same story with clang --target=wasm32-wasi-threads -dM -E -x c /dev/null. Again, I could be missing something.

And, yes, we're talking about pthread.h, but I'm also thinking of headers like netdb.h which the Preview 1 targets should not have but the Preview 2 target(s) should. Not that libc++ uses netdb.h, but I'm thinking of what would be required to switch back to a completely shared include dir.

Also, stub out other netdb functions so they return errors instead of aborting.

Signed-off-by: Joel Dice <[email protected]>
Signed-off-by: Joel Dice <[email protected]>
Signed-off-by: Joel Dice <[email protected]>
dicej added a commit to dicej/wasi-libc that referenced this pull request Dec 20, 2023
Currently, this is identical to the `wasm32-wasi` in all but name.  See WebAssembly#449 for
the next step, which is to incrementally add Preview 2 features,
e.g. `wasi-sockets`.  Per the discussion in that PR, I've split the
`wasi-sysroot/include` directory into per-target directories.  Eventually, we'll
want to build a separate sysroot for each target, but there's currently
uncertainty about how to configure the default sysroot for e.g. clang, so we're
not tackling that yet.

See also WebAssembly#447 for further details.

Signed-off-by: Joel Dice <[email protected]>
dicej added a commit to dicej/wasi-libc that referenced this pull request Dec 20, 2023
Currently, this is identical to the `wasm32-wasi` in all but name.  See WebAssembly#449 for
the next step, which is to incrementally add Preview 2 features,
e.g. `wasi-sockets`.  Per the discussion in that PR, I've split the
`wasi-sysroot/include` directory into per-target directories.  Eventually, we'll
want to build a separate sysroot for each target, but there's currently
uncertainty about how to configure the default sysroot for e.g. clang, so we're
not tackling that yet.

See also WebAssembly#447 for further details.

Signed-off-by: Joel Dice <[email protected]>
sunfishcode pushed a commit that referenced this pull request Dec 22, 2023
Currently, this is identical to the `wasm32-wasi` in all but name.  See #449 for
the next step, which is to incrementally add Preview 2 features,
e.g. `wasi-sockets`.  Per the discussion in that PR, I've split the
`wasi-sysroot/include` directory into per-target directories.  Eventually, we'll
want to build a separate sysroot for each target, but there's currently
uncertainty about how to configure the default sysroot for e.g. clang, so we're
not tackling that yet.

See also #447 for further details.

Signed-off-by: Joel Dice <[email protected]>
@yamt
Copy link
Contributor

yamt commented Jan 23, 2024

do you have any idea how these stuff would work when we add multi thread support later?
adapting code with bold single thread assumptions is a heavy task. it's better to do it "properly" from the start.

@@ -1,2 +1,3 @@
sysroot
build
.vscode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put editor-specific paths in your own global git ignore file, rather than in repository .gitignore files.

@@ -0,0 +1,173 @@
// SPDX-License-Identifier: BSD-2-Clause
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are files like this one derived from cloudlibc, or are they new? We should ideally put new code somewhere outside of the cloudlibc directory, to keep it clear when we're using and modifying upstream code which may have upstream assumptions, and when we're writing our own code.

typedef struct {} tcp_socket_state_unbound_t;
typedef struct {} tcp_socket_state_bound_t;
typedef struct {} tcp_socket_state_connecting_t;
typedef struct {} tcp_socket_state_listening_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this context, at this time: if clang is ok with it, it's fine.

@@ -0,0 +1,2481 @@
// Generated by `wit-bindgen` 0.16.0. DO NOT EDIT!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, #include <wasi/api.h> is public interface for preview1; we'd break users if we renamed it.

@badeend
Copy link
Contributor

badeend commented Jan 29, 2024

do you have any idea how these stuff would work when we add multi thread support later? adapting code with bold single thread assumptions is a heavy task. it's better to do it "properly" from the start.

The component model does not support multi-threading across component boundaries, and probably won't be doing so for a while. I agree it wouldn't hurt to have at least a rough idea before leaning too much into it, but I don't think it's worth blocking this PR on.

As for such a rough idea: add a mutex per file descriptor and perform everything behind that lock. For sockets specifically; state changes happen only during socket initialization. They spend the bulk of the time in their end states (TCP_SOCKET_STATE_CONNECTED or TCP_SOCKET_STATE_LISTENING). So there might be optimizations available where we don't need to lock at all in those states.

@ghost
Copy link

ghost commented Feb 2, 2024

The Alpine Linux wasi-sdk package literally just reexports a few packages and adds a .cfg file that Clang uses in order for --target=... to work properly (without the need to specify --sysroot=... explicitly). Could a similar approach be used here too? I.e. having the Clang in wasi-sdk tarball read this a .cfg relative to the extracted directory. (It seems to use the target triple to resolve this .cfg file, so it should be possible.)

More concretely, I think having something like this should work:

  • wasi-sdk/etc/clangXX/wasm32-unknown-wasi.cfg
  • wasi-sdk/etc/clangXX/wasm32-unknown-wasi-preview2.cfg

I guess in Alpine, they can specify the an absolute --sysroot=... file name, which would not be possible in this case. 🤔 Hmm.

--rename wasi:cli/[email protected]=terminal_stderr \
./wit
mv imports.h libc-bottom-half/headers/private/preview2.h
sed 's/#include "imports.h"/#include "preview2.h"/' < imports.c > libc-bottom-half/cloudlibc/src/libc/sys/wasi_preview2/preview2.c
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very small point: The . in the regex for sed matches any character (rather than a single dot). It might be sensible to escape it:

- sed 's/#include "imports.h"/#include "preview2.h"/' < ...
+ sed 's/#include "imports\.h"/#include "preview2.h"/' < ...

$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^
$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $^
ifeq ($(WASI_SNAPSHOT), preview2)
$(WASM_TOOLS) component new --adapt $(ADAPTER) $@ -o $@
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not very good practice. If wasm-tools fails here (for whatever reason), the Wasm file will not be changed, but it will be considered up to date for make, so this rule will not run on the next invocation to make. Ideally, you’d have something like a temporary intermediate file, so that the input and output to wasm-tools is a different file.

Something like this:

$(CC) $(CFLAGS) $(LDFLAGS) -o tmp.wasm $^
$(WASM_TOOLS) component new --adapt $(ADAPTER) tmp.wasm -o $@
$(RM) tmp.wasm

@dicej
Copy link
Contributor Author

dicej commented Feb 12, 2024

Thanks to everyone who has provided feedback on this PR.

FYI, I am planning to address that feedback, but not necessarily in this PR, since @sbc100 has requested that I break it up into smaller PRs, which I agree is a good idea, and I've been addressing any relevant feedback in each of those PRs. The quicker we can get those reviewed, the quicker we'll get everything merged.

Speaking of which: this one could use some attention: #464

I'm planning to open another small one today.

@dicej
Copy link
Contributor Author

dicej commented Apr 3, 2024

Everything that was in this PR has now been merged into main via a series of smaller PRs, so we can close this.

@dicej dicej closed this Apr 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants