WebAssembly externalref and tables #61

Open
JohnDog3112 opened this issue Jan 8, 2025 · 2 comments

JohnDog3112 commented Jan 8, 2025

While looking into more methods to use for #30, I came across WebAssembly tables and some related features that might be useful for this project. For starters, I mentioned in #55 how tables could allow callbacks to be passed almost directly to JS. Additionally, tables can be used to store and pass around what WebAssembly calls externrefs. Together, tables and externrefs allow a WASM module to interact with JS objects to some degree. However, both come with limitations and need workarounds.

Tables

The basic WebAssembly documentation that started this off is here. However, I found it somewhat lacking and had to search through multiple sources to find everything I needed. Here's a general overview:

  • Tables can be created both through JS and through WASM
  • Tables can be shared and mutated from either JS or WASM
  • There are currently two types of tables: funcref and anyref (anyref is the older name; the current spec and the JS API call it externref).
    • funcref tables store function pointers/references. They don't seem to work for plain JS functions (though I haven't done much testing along those lines), but they do allow WASM to insert function references/pointers at runtime. These references can then be called from JS much as if they were exported.
      • When a function is passed to JS, what JS usually receives is an ID: the function's index in a shared table. The receiving JS function therefore has to look the function up in the table by that ID before calling it.
    • anyref tables store generic JS objects. I don't believe they can store WASM objects, but they do allow a WASM function to store JS objects in a global table. This table is part of the JS runtime, so any object held in it is kept from being garbage collected.
  • Tables start at an initial size and can be grown later using the .grow() function.

While WASM can do all of the above things, the current implementation in C through clang is a bit more limited. For reference, the list below was mostly derived from this.

  • For now, it seems like the only table that can be exported or imported from C is the __indirect_function_table:
    • This table is automatically created and populated when you either return a function or pass one as an argument to a JS function. So, when the function is actually called, C passes the JS function the index at which the function was inserted into the table at compile time. I'm not quite sure what happens if it can't tell where the function pointer points at compile time.
    • I have yet to find a way to access this table from within C outside of the above interactions.
    • Adding the --export-table or --import-table flag to wasm-ld when linking will export the table to JS, or import it from JS, under the name __indirect_function_table.
    • So, if a JS function receives a function as an argument, it can take the ID and do __indirect_function_table.get(functionID)(...args)
  • You can create a new table within C (that can't be exported or imported) with the following code: static __externref_t ext[0]; However, there are some stipulations:
    • This table is automatically an anyref type. As of right now, I couldn't find any way to make a new funcref table other than the automatically generated one above.
    • Tables must be static, can't be extern, and can't be defined inside of functions.
    • Additionally, they are sizeless and must always be defined with a length of 0.
    • Pointers to them are not allowed and they can't be passed as function arguments or return values
      • this also means they can't be inserted into something like a struct
    • they can be interacted with using the following functions:
      • ref __builtin_wasm_table_get(table, idx)
      • void __builtin_wasm_table_set(table, idx, ref)
      • uint __builtin_wasm_table_size(table)
      • uint __builtin_wasm_table_grow(table, ref, uint)
      • void __builtin_wasm_table_fill(table, idx, ref, uint)
      • void __builtin_wasm_table_copy(table, table, uint, uint, uint)
    • if you wanted to get around some of the limitations above, you could create wrappers around the above functions that reference the table directly, and then pass those wrapper functions around, e.g. void wasm_table_one_set(int idx, __externref_t ref) { __builtin_wasm_table_set(table_one, idx, ref); }
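
A rough sketch of the wrapper idea from the last bullet, written so that it compiles with any C compiler. Every name here (ref_t, SLOTS, table_ops, TABLE_ONE, TABLE_TWO, move_ref) is made up for illustration, and the plain arrays are only stand-ins: on wasm32, ref_t would presumably be __externref_t, the arrays would be static __externref_t tables of length 0, and the wrapper bodies would call __builtin_wasm_table_get / __builtin_wasm_table_set. The wasm32 variant has not been checked against clang here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in so the sketch builds natively; on wasm32 this would be
   __externref_t and each array below a `static __externref_t t[0];`. */
typedef const void *ref_t;

#define SLOTS 4 /* hypothetical fixed size for the stand-in arrays */
static ref_t table_one[SLOTS];
static ref_t table_two[SLOTS];

/* A "table handle": function pointers that each refer to one fixed table. */
struct table_ops {
    ref_t (*get)(int idx);
    void (*set)(int idx, ref_t ref);
};

/* Each wrapper names its table directly, so the table is known at compile time. */
static ref_t table_one_get(int idx) { return table_one[idx]; }
static void table_one_set(int idx, ref_t ref) { table_one[idx] = ref; }
static ref_t table_two_get(int idx) { return table_two[idx]; }
static void table_two_set(int idx, ref_t ref) { table_two[idx] = ref; }

const struct table_ops TABLE_ONE = { table_one_get, table_one_set };
const struct table_ops TABLE_TWO = { table_two_get, table_two_set };

/* Generic code can now take "a table" as an ordinary argument:
   move the reference at idx from one table to another and clear the source. */
void move_ref(const struct table_ops *from, const struct table_ops *to, int idx) {
    assert(from != NULL && to != NULL);
    to->set(idx, from->get(idx));
    from->set(idx, NULL);
}
```

Since a struct of function pointers is an ordinary C value, it can be stored, passed, and returned even though the tables themselves cannot. The cost is one indirect call per table access.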

externref

I couldn't find a lot of documentation around this either, but it was mentioned near the bottom here and the C/clang implementation here. The closest description I can get is that it's a pointer for JS objects. The main usages and limitations are as follows:

  • C uses the type __externref_t to denote an external reference
  • __externref_t is sizeless (so can't be stored in things like structs)
  • Supposedly, it can be declared as a global, but I get a long llvm/clang error that I don't quite understand.
  • It can be taken as a function argument or be returned
  • It can be inserted/retrieved from a WASM table
  • The only way, that I can find, to store it is through a WASM table.

Uses

So, between WASM tables and externrefs there are a lot of limitations to keep in mind. However, I think several things would become easier if they were integrated into the code base.

  • As mentioned in Alternative Callback Method #55 it could allow for a more convenient/easier method of passing callbacks using the indirect_function_table.
  • It could simplify object management in libraries:
    • Libraries currently have to keep track of JS objects that they want WASM to have access to and then have them be provided by ID
    • With externrefs + tables, the WASM module could hold the objects and pass them around itself without the library needing to keep track of them.
    • This could be done with something similar to malloc: you hand an __externref_t to a function, and it returns a pointer/ID giving the object's position in a global table.
    • While in the backend, it would still use an id system similar to what is done now, it would be global to the WASM module rather than for every library/console.
    • This would also make it easier to pass objects between libraries. For instance, you could have a library to initialize/load an image from the internet and then pass that object (without any intermediary steps) to the d2dcanvas class to draw it. Or, for instance, you could pass that same image to an html library to put it on the webpage.
    • Additionally, it would make it easier to both share and separate JS objects between modules:
      • Currently, if a console/library wants to separate JS objects so another module can't interact with them, it needs to include something like the WASM module ID to separate them. This works, but it has to be implemented for every module that wants to use it, and it would require more code to allow things to be selectively shared.
      • If every module stores the JS objects itself, they are automatically separated
      • In addition, you could then create functions to pass objects between modules (or the module writers could implement them themselves) and it would work for all JS objects, not just ones from specific libraries.
  • They could be used to make consoles more dynamic:
    • Right now, consoles are accessed and registered on a global scope by ID within a module.
    • Alternatively, consoles could be bound to a WASM table as an externref and WASM could directly pass the console object when calling functions.
      • Not sure if it would be worth it though depending on how much code it breaks or if it introduces other problems that aren't worth the hassle.
    • This would allow the following:
      1. It would allow some consoles to be inaccessible to others (though that's only a problem if someone is manually inserting console IDs)
      2. If consoles were allowed to be registered locally (as in not in the global registry), then a WASM module could have and create consoles directly bound to it. In this way, if the WASM module is closed or if it drops an item from the table, JS can automatically garbage collect it.

Examples

Here's some example/prototype code for using WASM tables.

The following is a malloc-like implementation. In a real implementation, malloc_ref and free_ref should fill in empty/freed slots before growing the table. The main disadvantage of something like this is that you have to convert an __externref_t to an externref_pntr whenever you want to store a value, and later convert it back to an __externref_t via get_ref to hand it back to JS.

//declare WASM table
static __externref_t object_refs[0];
//struct for storing an ID/index in the above table
struct externref_pntr {
    int id;
};
//malloc equivalent, inserts object into table and returns index
struct externref_pntr malloc_ref(__externref_t ref) {
    struct externref_pntr pntr = {
        //index is the current size before adding a new object
        .id = __builtin_wasm_table_size(object_refs),
    };
    //add ref as a new object at the end of the table
    __builtin_wasm_table_grow(object_refs, ref, 1);
    return pntr;
}
//free equivalent, simply sets the table index to NULL so JS can garbage collect the object
void free_ref(struct externref_pntr pntr) {
    //__builtin_wasm_ref_null_extern() simply returns a NULL object ref
    __builtin_wasm_table_set(object_refs, pntr.id, __builtin_wasm_ref_null_extern());
}
//Function to dereference/get a reference pointer
__externref_t get_ref(struct externref_pntr* pntr) {
    //simply retrieves it from the table
    return __builtin_wasm_table_get(object_refs, pntr->id);
}
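
The note above says a real implementation should reuse freed slots before growing the table. A minimal sketch of just that index bookkeeping follows, in plain C so it can be compiled and tried outside of WASM. The names slot_acquire, slot_release, MAX_FREE and table_len are hypothetical, and table_len only stands in for __builtin_wasm_table_size(object_refs); the actual table calls appear as a comment because they need clang's wasm32 target.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FREE 64 /* hypothetical cap on how many freed slots are remembered */

static int free_slots[MAX_FREE]; /* stack of freed table indices */
static size_t free_count = 0;
static int table_len = 0; /* stand-in for __builtin_wasm_table_size(object_refs) */

/* Choose the index for a new reference. Sets *must_grow to 1 when the
   caller has to grow the table by one, or 0 when a freed slot is reused. */
int slot_acquire(int *must_grow) {
    assert(must_grow != NULL);
    if (free_count > 0) {
        *must_grow = 0;
        return free_slots[--free_count];
    }
    *must_grow = 1;
    return table_len++;
}

/* Record a freed index for reuse. Returns 0 on success, or -1 if the index
   is out of range or the free list is full (the slot is then not reused).
   Double frees are not detected. */
int slot_release(int id) {
    if (id < 0 || id >= table_len || free_count == MAX_FREE) return -1;
    free_slots[free_count++] = id;
    return 0;
}

/* Intended use inside malloc_ref (wasm32 only, shown as a comment):
     int grow, id = slot_acquire(&grow);
     if (grow) __builtin_wasm_table_grow(object_refs, ref, 1);
     else      __builtin_wasm_table_set(object_refs, id, ref);
   and inside free_ref, after nulling the entry: slot_release(pntr.id); */
```

The free list lives in linear memory as ordinary ints, which works because only the references themselves are restricted to tables; their indices are not.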

Example code for using the indirect_function_table can be found here

Importing/Exporting tables

As I said above, I couldn't find a way to get clang to export/import tables other than __indirect_function_table. However, it can be done directly through WASM. So, if desired, one could write a WASM module in something like WAT that imports/exports custom tables and use it from the main library.

For example, the following code is what defines and exports a WASM table:

;; define a WASM table of type funcref with initial size 8 and ID 1
(table (;1;) 8 funcref)
;; export table 1 (the one defined above) under the name "__indirect_function_table"
(export "__indirect_function_table" (table 1))
;; insert the list of functions into table 1 starting at index 1
(elem (;0;) (table 1) (i32.const 1) func $ten $twelve $thirteen $add $sub $divide $multiply)

The above code was generated using clang, but it could, potentially, be written to create a secondary funcref table with exported functions for adding and removing items from it. It could also be used to export/import secondary WASM tables so that both JS and C could access them.

I think the code also demonstrates one of the limitations of tables, namely that they can't be passed around: instructions seem to take a fixed number, not a variable, to reference the table, and therefore the table number must be known at compile time.

@JohnDog3112
Collaborator Author

I've done a bit more research on funcref tables and have found how they are intended to be supported. Some documentation seems to imply that they already are; however, I get llvm bugs whenever I try to duplicate its syntax. The issue here mentions adding __funcref, which marks a function pointer as a funcref. However, when I try to make a table with it (in place of __externref_t), it just gives me errors. Or, in the few cases where it doesn't, I can't use the table functions, since they fail with errors saying it's not a WebAssembly table. This is in spite of this documentation using it as an example:

typedef void (*__funcref funcref_t)();
static __funcref table[0];

size_t getSize() {
  return __builtin_wasm_table_size(table);
}

I don't think it has anything to do with the version of clang since I have tested it with both clang version 17.0.6 and 19.1.5.

@JohnDog3112
Collaborator Author

I still haven't been able to find anything conclusive on importing/exporting custom tables directly through clang. I've found this discussion where they say:

To manage the transitional period in which the compiler doesn't yet produce TABLE_NUMBER relocations and doesn't residualize table symbols, the linker will detect object files which have table imports or definitions, but no table symbols. In that case it will synthesize symbols for the defined and imported tables.

That makes it sound like there's support for importing/exporting, but everything else seems to imply it only applies to the __indirect_function_table. So, for now, I'm going to conclude that clang can't handle importing/exporting tables other than the __indirect_function_table.
