Implement open/stat/mkdir/symlink for standalone WASI mode #24246

Open
wants to merge 22 commits into
base: main
Commits
c606386
Implement `open`/`stat`/`mkdir`/`symlink` for standalone WASI mode
jiixyj May 1, 2025
46b7fa4
remove unneeded list comprehension
jiixyj May 2, 2025
16b1925
skip standalone WASM tests when no engines are configured
jiixyj May 2, 2025
e12df0f
update wasmtime version in CI
jiixyj May 2, 2025
65bec88
try to fix 'test_time' and 'test_console_out' tests
jiixyj May 2, 2025
ca3f795
add stubbed out path_filestat_get to libwasi.js
jiixyj May 3, 2025
52a4e03
do preopen handling only in PURE_WASI mode
jiixyj May 3, 2025
53c309b
add path_filestat_get__nothrow: true
jiixyj May 3, 2025
65e8c96
run standalone WASM tests in a subdirectory of 'out/test' for each en…
jiixyj May 3, 2025
5303f12
try to fix CI
jiixyj May 3, 2025
95eab07
refactor to avoid 'allow_wasm_engines' parameter
jiixyj May 3, 2025
bd4a07c
remove hack by adding a stubbed out path_symlink to libwasi.js
jiixyj May 3, 2025
73e982e
Merge remote-tracking branch 'origin/main' into fcntl-open-test
jiixyj May 4, 2025
36e66d8
add attribution to wasi-libc
jiixyj May 4, 2025
cc87f5d
should be wasm_engines instead of self.wasm_engines here
jiixyj May 4, 2025
67905b1
add license headers and use pragma once for consistency
jiixyj May 9, 2025
88650d8
Merge remote-tracking branch 'origin/main' into fcntl-open-test
jiixyj May 9, 2025
f3a80bb
use double slash comments and braces for if statements consistently
jiixyj May 9, 2025
534223f
wrap new libwasi.js stubs in ALLOW_UNIMPLEMENTED_SYSCALLS
jiixyj May 9, 2025
16d2ba7
default 'exclude_engines' in argument list since we don't mutate it
jiixyj May 9, 2025
eb35b6b
simplify calculation of final 'wasm_engines'
jiixyj May 9, 2025
1ec092a
add comment explaining 'exclude_engines' parameter
jiixyj May 9, 2025
2 changes: 1 addition & 1 deletion .circleci/config.yml
@@ -494,7 +494,7 @@ jobs:
name: get wasmtime
command: |
# use a pinned version due to https://github.com/bytecodealliance/wasmtime/issues/714
-export VERSION=v0.33.0
+export VERSION=v32.0.0
wget https://github.com/bytecodealliance/wasmtime/releases/download/$VERSION/wasmtime-$VERSION-x86_64-linux.tar.xz
tar -xf wasmtime-$VERSION-x86_64-linux.tar.xz
cp wasmtime-$VERSION-x86_64-linux/wasmtime ~/vms
28 changes: 28 additions & 0 deletions LICENSE
@@ -100,3 +100,31 @@ The third_party/ subdirectory contains code with other licenses. None of it is
used by default, but certain options use it (e.g., the optional closure compiler
flag will run closure compiler from third_party/).

Files in system/lib/standalone/ contain code derived from wasi-libc, in
accordance with the terms of the MIT license. wasi-libc's license follows:

"""
Permission is hereby granted, free of charge, to any
person obtaining a copy of this software and associated
documentation files (the "Software"), to deal in the
Software without restriction, including without
limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software
is furnished to do so, subject to the following
conditions:

The above copyright notice and this permission notice
shall be included in all copies or substantial portions
of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.
"""
3 changes: 3 additions & 0 deletions src/lib/libsigs.js
@@ -1536,6 +1536,9 @@ sigs = {
lineColor__sig: 'ipiiiii',
lineRGBA__sig: 'ipiiiiiiii',
llvm_eh_typeid_for__sig: 'vp',
path_create_directory__sig: 'iipp',
path_filestat_get__sig: 'iiippp',
path_symlink__sig: 'ippipp',
pixelRGBA__sig: 'ipiiiiii',
proc_exit__sig: 'vi',
random_get__sig: 'ipp',
17 changes: 17 additions & 0 deletions src/lib/libwasi.js
@@ -608,6 +608,23 @@ var WasiLibrary = {
randomFill(HEAPU8.subarray(buffer, buffer + size));
return 0;
},

#if ALLOW_UNIMPLEMENTED_SYSCALLS
path_filestat_get__nothrow: true,
path_filestat_get: (fd, flags, path, path_len, buf) => {
return {{{ cDefs.ENOSYS }}};
},

path_create_directory__nothrow: true,
path_create_directory: (fd, path, path_len) => {
return {{{ cDefs.ENOSYS }}};
},

path_symlink__nothrow: true,
path_symlink: (old_path, old_path_len, fd, new_path, new_path_len) => {
return {{{ cDefs.ENOSYS }}};
},
#endif
};

for (var x in WasiLibrary) {
254 changes: 254 additions & 0 deletions system/lib/standalone/paths.c
@@ -0,0 +1,254 @@
/*
* Copyright 2025 The Emscripten Authors. All rights reserved.
* Emscripten is available under two separate licenses, the MIT license and the
* University of Illinois/NCSA Open Source License. Both these licenses can be
* found in the LICENSE file.
*
* The preopen code is based on wasi-libc's `preopens.c` which is licensed
* under a MIT style license. This license can also be found in the LICENSE
* file.

Review comment (Member): Please keep the copyright line from the original file, in addition to this text (which is good).

*/

#define _GNU_SOURCE
#include "paths.h"

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sysexits.h>

// A name and file descriptor pair.
typedef struct preopen {
// The path prefix associated with the file descriptor.
const char* prefix;

// The file descriptor.
__wasi_fd_t fd;
} preopen;

// A simple growable array of `preopen`.
static preopen* preopens;
static size_t num_preopens;

// Are the `prefix_len` bytes pointed to by `prefix` a prefix of `path`?
static bool
prefix_matches(const char* prefix, size_t prefix_len, const char* path) {
// Allow an empty string as a prefix of any relative path.
if (path[0] != '/' && prefix_len == 0) {
return true;
}

// Check whether any bytes of the prefix differ.
if (memcmp(path, prefix, prefix_len) != 0) {
return false;
}

// Ignore trailing slashes in directory names.
size_t i = prefix_len;
while (i > 0 && prefix[i - 1] == '/') {
--i;
}

// Match only complete path components.
char last = path[i];
return last == '/' || last == '\0';
}

bool __paths_resolve_path(int* resolved_dirfd, const char** path_ptr) {
const char* path = *path_ptr;

if (*resolved_dirfd != AT_FDCWD && path[0] != '/') {
return true;
}

// Strip leading `/` characters; the prefixes we're matching won't have
// them.
while (*path == '/') {
path++;
}
// Search through the preopens table. Iterate in reverse so that more
// recently added preopens take precedence over less recently added ones.
size_t match_len = 0;
int fd = -1;
for (size_t i = num_preopens; i > 0; --i) {
const preopen* pre = &preopens[i - 1];
const char* prefix = pre->prefix;
size_t len = strlen(prefix);

// If we haven't had a match yet, or the candidate path is longer than
// our current best match's path, and the candidate path is a prefix of
// the requested path, take that as the new best path.
if ((fd == -1 || len > match_len) && prefix_matches(prefix, len, path)) {
fd = pre->fd;
match_len = len;
}
}

if (fd == -1) {
return false;
}

// The relative path is the substring after the portion that was matched.
const char* computed = path + match_len;

// Omit leading slashes in the relative path.
while (*computed == '/') {
++computed;
}

// *at syscalls don't accept empty relative paths, so use "." instead.
if (*computed == '\0') {
computed = ".";
}

*resolved_dirfd = fd;
*path_ptr = computed;
return true;
}

#if defined(EMSCRIPTEN_PURE_WASI)

static size_t preopen_capacity;

#ifdef NDEBUG
#define assert_invariants() // assertions disabled
#else
static void assert_invariants(void) {
assert(num_preopens <= preopen_capacity);
assert(preopen_capacity == 0 || preopens != NULL);
assert(preopen_capacity == 0 ||
preopen_capacity * sizeof(preopen) > preopen_capacity);

for (size_t i = 0; i < num_preopens; ++i) {
const preopen* pre = &preopens[i];
assert(pre->prefix != NULL);
assert(pre->fd != (__wasi_fd_t)-1);
#ifdef __wasm__

Review comment (Collaborator): Why ifdef __wasm__ here? IIUC this code only compiles and runs on wasm.

assert((uintptr_t)pre->prefix <
(__uint128_t)__builtin_wasm_memory_size(0) * PAGESIZE);
#endif
}
}
#endif

// Allocate space for more preopens. Returns `true` on success and `false` on
// failure.
static bool resize_preopens(void) {
size_t start_capacity = 4;
size_t old_capacity = preopen_capacity;
size_t new_capacity = old_capacity == 0 ? start_capacity : old_capacity * 2;

preopen* old_preopens = preopens;
preopen* new_preopens = calloc(new_capacity, sizeof(preopen));
if (new_preopens == NULL) {
return false;
}

memcpy(new_preopens, old_preopens, num_preopens * sizeof(preopen));
preopens = new_preopens;
preopen_capacity = new_capacity;
free(old_preopens);

assert_invariants();

Review comment (Collaborator): Why not just preopens = realloc(preopen, new_capacity) here?

return true;
}

// Normalize an absolute path. Removes leading `/` and leading `./`, so the
// first character is the start of a directory name. This works because our
// process always starts with a working directory of `/`. Additionally translate
// `.` to the empty string.
static const char* strip_prefixes(const char* path) {
while (1) {
if (path[0] == '/') {
path++;
} else if (path[0] == '.' && path[1] == '/') {
path += 2;
} else if (path[0] == '.' && path[1] == 0) {

Review comment (Collaborator): Use '\0' here for the terminator? This case looks like it's trying to match `.` on its own and then return the empty string. Is that right?

path++;
} else {
break;
}
}

return path;
}

// Register the given preopened file descriptor under the given path.
//
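// This function stores its own copy of `relprefix` (with leading `/` and `./`
// stripped); the caller keeps ownership of the string it passes in.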
static bool register_preopened_fd(__wasi_fd_t fd, const char* relprefix) {
// Check preconditions.
assert_invariants();
assert(fd != AT_FDCWD);
assert(fd != -1);
assert(relprefix != NULL);

if (num_preopens == preopen_capacity && !resize_preopens()) {
return false;
}

char* prefix = strdup(strip_prefixes(relprefix));
if (prefix == NULL) {
return false;
}
preopens[num_preopens++] = (preopen){
prefix,
fd,
};

assert_invariants();
return true;
}

// Populate WASI preopens.
__attribute__((constructor(100))) // construct this before user code
static void _standalone_populate_preopens(void) {
// Skip stdin, stdout, and stderr, and count up until we reach an invalid
// file descriptor.
for (__wasi_fd_t fd = 3; fd != 0; ++fd) {
__wasi_prestat_t prestat;
__wasi_errno_t ret = __wasi_fd_prestat_get(fd, &prestat);
if (ret == __WASI_ERRNO_BADF) {
break;
}
if (ret != __WASI_ERRNO_SUCCESS) {
goto oserr;
}
switch (prestat.pr_type) {
case __WASI_PREOPENTYPE_DIR: {
char* prefix = malloc(prestat.u.dir.pr_name_len + 1);
if (prefix == NULL) {
goto software;
}

// TODO: Remove the cast on `prefix` once the witx is updated with
// char8 support.
ret = __wasi_fd_prestat_dir_name(
fd, (uint8_t*)prefix, prestat.u.dir.pr_name_len);
if (ret != __WASI_ERRNO_SUCCESS) {
goto oserr;
}
prefix[prestat.u.dir.pr_name_len] = '\0';

if (!register_preopened_fd(fd, prefix)) {
goto software;
}
free(prefix);

Review comment (Collaborator): Given that register_preopened_fd is going to store and use the prefix, why not pass ownership? I.e. don't call free() here or strdup in register_preopened_fd?

break;
}
default:
break;
}
}

return;
oserr:
_Exit(EX_OSERR);
software:
_Exit(EX_SOFTWARE);

Review comment (Collaborator): Why not just abort() above instead of this goto + exit stuff?

Reply (Author): Good question -- this seems to be a wasi-libc thing. The code in __main_void (also derived from wasi-libc) uses EX_OSERR as well:

__wasi_proc_exit(EX_OSERR);

}

#endif
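
On the review question about using realloc in resize_preopens: here is a minimal sketch of what a realloc-based growth step could look like. It is not code from this PR. The names `entry`, `entries`, `grow_entries` and `push_entry` are hypothetical stand-ins for `preopen`, `preopens` and `resize_preopens`. Note that realloc takes a size in bytes, so the element count has to be multiplied by the element size, and the multiplication is checked for overflow first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical stand-in for the `preopen` struct in paths.c.
typedef struct entry {
  const char* prefix;
  int fd;
} entry;

static entry* entries;     // growable array, NULL while empty
static size_t num_entries; // number of slots in use
static size_t capacity;    // number of slots allocated

// Double the capacity using realloc. Returns false on overflow or
// allocation failure; in both cases the existing table is left untouched.
static bool grow_entries(void) {
  // Reject sizes whose byte count would overflow size_t.
  if (capacity > SIZE_MAX / sizeof(entry) / 2) {
    return false;
  }
  size_t new_capacity = capacity == 0 ? 4 : capacity * 2;

  // realloc(NULL, n) behaves like malloc(n), so the first call also works.
  entry* grown = realloc(entries, new_capacity * sizeof(entry));
  if (grown == NULL) {
    return false;
  }
  entries = grown;
  capacity = new_capacity;
  return true;
}

// Append one entry, growing the table when it is full.
static bool push_entry(const char* prefix, int fd) {
  if (num_entries == capacity && !grow_entries()) {
    return false;
  }
  entries[num_entries++] = (entry){prefix, fd};
  return true;
}
```

Unlike the calloc version, the newly added slots are not zeroed here. That only matters if some code reads slots at or beyond `num_entries`.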
25 changes: 25 additions & 0 deletions system/lib/standalone/paths.h
@@ -0,0 +1,25 @@
/*
* Copyright 2025 The Emscripten Authors. All rights reserved.
* Emscripten is available under two separate licenses, the MIT license and the
* University of Illinois/NCSA Open Source License. Both these licenses can be
* found in the LICENSE file.
*/

#pragma once

#include <stdbool.h>

//
// Resolve a (dirfd, relative/absolute path) pair.
//
// Arguments:
// - `resolved_dirfd`:
// - as input: input dirfd, may be `AT_FDCWD`
// - as output: resolved dirfd (which always is a preopened fd)
// - `path_ptr`:
// - as input: pointer to a relative or absolute path
// - as output: a path relative to `resolved_dirfd`
//
// Returns: `true` if resolution was successful, `false` otherwise.
//
bool __paths_resolve_path(int* resolved_dirfd, const char** path_ptr);
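
To illustrate the contract described in this header, the following is a simplified, self-contained sketch of the longest-prefix lookup. It is not the PR's implementation. `demo_table`, `demo_prefix_matches` and `demo_resolve` are hypothetical names, and the table contents are invented for the example. The sketch assumes stored prefixes have no trailing slashes and it omits the early return for a relative path combined with a dirfd other than `AT_FDCWD`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical preopen table. Prefixes are stored without a leading `/`,
// matching what strip_prefixes in paths.c produces.
typedef struct demo_preopen {
  const char* prefix;
  int fd;
} demo_preopen;

static const demo_preopen demo_table[] = {
  {"", 3},    // preopen registered for "/"
  {"tmp", 4}, // preopen registered for "/tmp"
};
#define DEMO_COUNT (sizeof(demo_table) / sizeof(demo_table[0]))

// Is `prefix` a whole-component prefix of `path`?
static bool demo_prefix_matches(const char* prefix, size_t len, const char* path) {
  // The empty prefix matches any relative path.
  if (path[0] != '/' && len == 0) {
    return true;
  }
  if (strncmp(path, prefix, len) != 0) {
    return false;
  }
  // Match only complete path components ("tmp" must not match "tmpfile").
  char last = path[len];
  return last == '/' || last == '\0';
}

// Simplified stand-in for __paths_resolve_path: rewrites (*dirfd, *path_ptr)
// to a preopened fd plus a path relative to it.
static bool demo_resolve(int* dirfd, const char** path_ptr) {
  const char* path = *path_ptr;
  while (*path == '/') {
    path++;
  }

  // Pick the longest matching prefix; later entries win ties.
  size_t match_len = 0;
  int fd = -1;
  for (size_t i = DEMO_COUNT; i > 0; --i) {
    const demo_preopen* pre = &demo_table[i - 1];
    size_t len = strlen(pre->prefix);
    if ((fd == -1 || len > match_len) &&
        demo_prefix_matches(pre->prefix, len, path)) {
      fd = pre->fd;
      match_len = len;
    }
  }
  if (fd == -1) {
    return false;
  }

  // The remainder after the matched prefix is the relative path.
  const char* rest = path + match_len;
  while (*rest == '/') {
    ++rest;
  }
  if (*rest == '\0') {
    rest = "."; // *at syscalls reject empty relative paths
  }

  *dirfd = fd;
  *path_ptr = rest;
  return true;
}
```

With this table, "/tmp/a.txt" resolves to fd 4 and "a.txt", while "/tmpfile" falls back to the root preopen (fd 3) because "tmp" is not a complete component of that path.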