Get Jouni's GBZ reader into WebAssembly and demo it as a tube map backend #379

Closed
adamnovak opened this issue Dec 5, 2023 · 5 comments

@adamnovak
Member

This is the other half of #367, or the other third if we count implementing the JS wrapper.

We need to be able to build https://github.com/jltsiren/gbz-base for WebAssembly and get the result into the SequenceTubeMap build process.

@adamnovak adamnovak self-assigned this Dec 5, 2023
@adamnovak adamnovak converted this from a draft issue Dec 5, 2023
@adamnovak
Member Author

OK, I've looked at this a bit.

On Mac, you need to use Homebrew to install rustup-init, which then installs Rust with rustup:

brew install rustup-init
rustup-init
. ~/.cargo/env

Then you need to use rustup to install the WebAssembly cross-compiling Rust standard library. It looks like the Emscripten-based one is more or less dead, and the generic one has no C standard library for sqlite, so we probably want to try and target WASI:

rustup target add wasm32-wasi

Then we can get the code:

git clone https://github.com/jltsiren/gbz-base
cd gbz-base

For rusqlite, we need to go get a C compiler for WASI. In theory Clang can get away with just headers and a standard library blob, but GCC probably can't.

wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz
tar -xvf wasi-sdk-20.0-macos.tar.gz
rm wasi-sdk-20.0-macos.tar.gz
export CC_wasm32_wasi=`pwd`/wasi-sdk-20.0/bin/clang

Apparently there's a wasm32-wasi-vfs feature on rusqlite, but I don't seem to need it yet.

With that set, cargo build --release --target=wasm32-wasi gets as far as building sqlite, but then fails because simple-sds needs file descriptors and memory mapping, which aren't available on this target. We need to be able to build simple-sds without its memory-mapping code for this to work. So that is probably subtask 1.
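
One possible shape for that subtask, as a minimal sketch: gate the memory-mapped code path behind a cfg check so it is left out on wasm32, and keep a plain-read fallback that works on every target. The names load_words and mmap_backend are hypothetical and are not the simple-sds API; the real crate may be organized quite differently.

```rust
use std::io::{self, Read};

/// Hypothetical fallback loader: reads little-endian 64-bit words through
/// any `Read`, so it needs no memory mapping and no raw file descriptors.
pub fn load_words<R: Read>(mut reader: R) -> io::Result<Vec<u64>> {
    let mut bytes = Vec::new();
    reader.read_to_end(&mut bytes)?;
    if bytes.len() % 8 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "input length is not a multiple of 8 bytes",
        ));
    }
    let mut words = Vec::with_capacity(bytes.len() / 8);
    for chunk in bytes.chunks_exact(8) {
        let mut raw = [0u8; 8];
        raw.copy_from_slice(chunk);
        words.push(u64::from_le_bytes(raw));
    }
    Ok(words)
}

/// Stand-in for the memory-mapped backend (hypothetical). A real version
/// would wrap the mmap calls; it is compiled only where mmap exists.
#[cfg(not(target_arch = "wasm32"))]
pub mod mmap_backend {
    pub const AVAILABLE: bool = true;
}

/// On wasm32 the module still exists, so callers compile unchanged,
/// but it reports that memory mapping is unavailable.
#[cfg(target_arch = "wasm32")]
pub mod mmap_backend {
    pub const AVAILABLE: bool = false;
}
```

Callers can check mmap_backend::AVAILABLE and fall back to load_words when it is false, which keeps the rest of the code identical across targets.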

@adamnovak
Member Author

I think I have a fix for simple-sds. But my resulting binaries don't run because of a missing builtin implementation: WebAssembly/wasi-sdk#361

@adamnovak
Member Author

OK, with 486ac7bf140a1cc8dcc4de86dbbc3e7439e3b6e0 which uses my wasm-buildable simple-sds at adamnovak/simple-sds@8c4736f, I can build with:

wget https://github.com/WebAssembly/wasi-sdk/releases/download/wasi-sdk-20/wasi-sdk-20.0-macos.tar.gz
tar -xvf wasi-sdk-20.0-macos.tar.gz
rm wasi-sdk-20.0-macos.tar.gz

CC_wasm32_wasi=`pwd`/wasi-sdk-20.0/bin/clang LIBSQLITE3_FLAGS="-DLONGDOUBLE_TYPE=double -DSQLITE_THREADSAFE=0" cargo build --release --target=wasm32-wasi

Then (assuming I also build for the host), I can make a database:

wget https://github.com/vgteam/sequenceTubeMap/raw/e70b93c291fd308e1ad718ef4104a9865214b046/exampleData/x.gbz
target/release/gbz2db --overwrite x.gbz

And with a WASM runner that supports WASI (brew install wasmtime) I can access the database:

wasmtime --dir . target/wasm32-wasi/release/query.wasm --sample "_gbwt_ref" --contig x --interval 1..10 --context 100 --distinct x.gbz.db

H	VN:Z:1.1	RS:Z:_gbwt_ref
S	1	CTTATTTG
S	2	T
S	3	C
S	4	A
...

I get what looks to be the right GFA file out.

@jltsiren is right that there is trouble with usize/u64 for reading GBZ itself though. Even with this tiny GBZ that doesn't have more than 4 billion of anything, the WASM build of the importer can't read it:

wasmtime --dir . target/wasm32-wasi/release/gbz2db.wasm x.gbz -o x.wasm.gbz.db

Loading GBZ graph x.gbz
Error: "Bit length / word length mismatch"
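
To illustrate how a usize/u64 difference could produce an error like this even on tiny inputs, here is a hypothetical sketch. It assumes a serialized header that stores a bit length together with a count of 64-bit words, and a reader that derives the expected count from its own word size. This is not confirmed to be what simple-sds does; words_for_bits and check_header are made-up names.

```rust
/// Number of `word_bits`-sized words needed to hold `bit_len` bits.
pub fn words_for_bits(bit_len: u64, word_bits: u32) -> u64 {
    let w = u64::from(word_bits);
    (bit_len + w - 1) / w
}

/// Validate a (bit length, stored word count) header against the word
/// size the reader assumes. Hypothetical check, for illustration only.
pub fn check_header(bit_len: u64, stored_words: u64, word_bits: u32) -> Result<(), String> {
    if words_for_bits(bit_len, word_bits) == stored_words {
        Ok(())
    } else {
        Err(String::from("Bit length / word length mismatch"))
    }
}

/// Word size a reader would get if it used usize as its word type:
/// 64 on typical hosts, 32 on wasm32.
pub fn native_word_bits() -> u32 {
    usize::BITS
}
```

With a 100-bit vector written on a 64-bit host, the file records 2 words. A reader that assumes 32-bit words expects 4, so the check fails no matter how small the data is. Fixing the word type to u64 in the file format code avoids the dependency on the target.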

@adamnovak adamnovak moved this from Todo to In Progress in Tube Map/Lancet Integration Dec 12, 2023
@adamnovak
Member Author

I had wanted to use the wasm-bindgen crate, which can bind Rust classes over to JS so you can use them there: https://rustwasm.github.io/docs/wasm-bindgen/reference/attributes/on-rust-exports/constructor.html

Unfortunately, you can't have it in the same project as rusqlite, because rusqlite can only build for wasm32-wasi and wasm-bindgen can only build for wasm32-unknown-unknown: rusqlite/rusqlite#827 (comment)

Also, wasm-bindgen emits whole ES modules with some JS that you import, whereas to use rusqlite we'd need to present a filesystem we control to the WASI syscall implementations from whatever WASI shim we use, meaning we'd need more control over the WASM load step than it seems like you get with wasm-bindgen.

So I think I'm going to have to write the core local implementation of each server-side function in Rust, and then export it as a C ABI function from the Rust code, which will be visible as a WASM export that JS can find and call. That should work, though I might need to do some !!fun!! things for strings?
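
A minimal sketch of what the string handling could look like, assuming the usual pointer-plus-length convention: the caller asks the module for a buffer, writes UTF-8 bytes into it, calls the exported function, reads the reply from the returned pointer, and frees both buffers. All function names here (tubemap_alloc, tubemap_free, tubemap_upper) are hypothetical, and the uppercase transform merely stands in for a real query.

```rust
use std::ptr;
use std::slice;
use std::str;

/// Allocate `len` zeroed bytes that the caller can fill with a request.
#[no_mangle]
pub extern "C" fn tubemap_alloc(len: usize) -> *mut u8 {
    let buf = vec![0u8; len].into_boxed_slice();
    Box::into_raw(buf) as *mut u8
}

/// Free a buffer of exactly `len` bytes that this module handed out.
#[no_mangle]
pub unsafe extern "C" fn tubemap_free(ptr: *mut u8, len: usize) {
    if !ptr.is_null() {
        drop(Box::from_raw(ptr::slice_from_raw_parts_mut(ptr, len)));
    }
}

/// Placeholder "query": uppercases a UTF-8 request. Returns a pointer to
/// the reply and stores its length in `out_len`; returns null on bad UTF-8.
/// `ptr` must come from tubemap_alloc and `out_len` must be writable.
#[no_mangle]
pub unsafe extern "C" fn tubemap_upper(
    ptr: *const u8,
    len: usize,
    out_len: *mut usize,
) -> *mut u8 {
    let input = slice::from_raw_parts(ptr, len);
    let text = match str::from_utf8(input) {
        Ok(s) => s,
        Err(_) => {
            *out_len = 0;
            return ptr::null_mut();
        }
    };
    let reply = text.to_uppercase().into_bytes().into_boxed_slice();
    *out_len = reply.len();
    Box::into_raw(reply) as *mut u8
}
```

On the JS side the same three exports would be called with offsets into the module's linear memory. Returning the length through an out-parameter avoids relying on NUL termination, so replies can contain arbitrary bytes.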

@adamnovak adamnovak changed the title Get Jouni's GBZ reader into WebAssembly Get Jouni's GBZ reader into WebAssembly and demo it as a tube map backend Jan 24, 2024
@adamnovak adamnovak moved this from In Progress to Done in Tube Map/Lancet Integration Feb 14, 2024
@adamnovak
Member Author

I have this working now in 9a7d5ff. I'm using a WASI build and just invoking the CLI query command and getting standard output, which it conveniently fills with JSON I can understand.

I'm running the WASM in a web worker, and fetching bits of the data it needs synchronously with FileReaderSync against slices of a Blob sent over from the page.
