DO NOT MERGE: WebAssembly standalone support #103
… a dummy version in the runtime
Hi @jerbob92, Thank you very much for sharing this! I didn't expect that we'd have to patch Emscripten and add new features to PDFium. I'll try to cherry-pick the changes you recommend right away. Best regards, |
I think Emscripten is not going to accept the patch in this state because it only implements what PDFium uses, nothing else. As for extending PDFium, I don't think there is a way to work around this. To use function pointers in WebAssembly, the pointer either needs to exist at compile time or you need to add the function to the function table at runtime, which many runtimes don't support. That's why I had to implement various callbacks in the PDFium patch and then implement the callback logic on the runtime side. So I wouldn't really see it as extending PDFium; this is just a way to make all functionality work in WebAssembly.
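To make the callback idea above more concrete, here is a toy model in Python. It is only a sketch of the general pattern (the host keeps a table of callbacks and the module refers to them by integer handle instead of by function pointer), not the actual code from the PDFium patch. The names `CallbackRegistry`, `dispatch`, and `guest_read` are made up for this illustration.

```python
from typing import Callable, Dict


class CallbackRegistry:
    """Host-side table mapping integer handles to host callables (hypothetical)."""

    def __init__(self) -> None:
        self._callbacks: Dict[int, Callable[..., bytes]] = {}
        self._next_handle = 1  # 0 is reserved to mean "no callback"

    def register(self, fn: Callable[..., bytes]) -> int:
        """Store a host function and return the handle the module will use."""
        handle = self._next_handle
        self._next_handle += 1
        self._callbacks[handle] = fn
        return handle

    def unregister(self, handle: int) -> None:
        """Forget a handle; unknown handles are ignored."""
        self._callbacks.pop(handle, None)

    def dispatch(self, handle: int, *args) -> bytes:
        """Stand-in for the single import the module calls instead of a function pointer."""
        fn = self._callbacks.get(handle)
        if fn is None:
            raise KeyError(f"no callback registered for handle {handle}")
        return fn(*args)


def guest_read(registry: CallbackRegistry, handle: int, length: int, chunk: int = 4) -> bytes:
    """Simulates module-side code that pulls data in blocks through a handle."""
    out = bytearray()
    pos = 0
    while pos < length:
        n = min(chunk, length - pos)
        out += registry.dispatch(handle, pos, n)
        pos += n
    return bytes(out)


# Host side: register a block reader, then let the "module" read through it.
data = b"%PDF-1.7 example"
registry = CallbackRegistry()
handle = registry.register(lambda offset, size: data[offset:offset + size])
result = guest_read(registry, handle, len(data))
```

The point of the pattern is that the module only ever needs one statically known import (`dispatch` here), so nothing has to be added to the function table at runtime.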
I think this was related to the memory allocator that has since been disabled. |
@bblanchon small warning, |
@bblanchon I have been working on making an embind implementation for Go/Wazero here: https://github.com/jerbob92/wazero-emscripten-embind Would you consider merging this if I convert the changes from |
My main concern with this PR is that it not only patches PDFium, but it also patches Emscripten. I am concerned that this build will fail often and require frequent maintenance. I'm okay with adding a standalone build or modifying the existing wasm build to support both, but I don't want this project to be a maintenance nightmare. |
Understandable, but I was not talking about the Emscripten patch. I'm still working on getting that merged into Emscripten, and the current discussion there is whether they are just going to include wasi-libc in Emscripten (most of the stuff in my Emscripten patch comes from wasi-libc). I was specifically asking about the Embind integration, which is a patch I'm currently working on (in favor of the callbacks patch). |
Honestly, I'm not a big fan of this callback patch, probably because I don't understand it. |
I'm starting to feel like I'm the only one using the WebAssembly build :D I will try to explain:
|
There are at least two. I have two GitHub pages hosted right now using these PDFium WebAssembly builds: https://www.sungaila.de/PDFtoImage/ and https://www.sungaila.de/PDFtoZPL/ My biggest issue right now is the WASM NuGet package. I need specific builds with specific Emscripten versions, and the NuGet package would need to include one build per version. Each .NET release depends on a single Emscripten release. I didn't find an easy way to modify the existing GitHub workflows to make this happen (without rewriting most bash scripts and workflow YMLs). So instead I use my fork to build with a user-defined Emscripten version: https://github.com/sungaila/pdfium-binaries/blob/3213c3f882bd75891f7da8b9644111708a3fa370/.github/workflows/build-one.yml#L44C7-L48C26 |
Ah cool! But is there anyone using the JS WASM version? It looks like the .NET runtime is already doing a lot for you in terms of data conversion and function pointer handling. So basically it looks like they built the Embind thing inside the runtime.
Honestly, this sounds like a major design flaw in the .NET WASM support? But seeing how they integrate the library (with the .a file and not the .wasm file), it makes sense that it works like this. |
They transpile the .NET code into C++ and then into Wasm. Might explain why they want .a files but I still despise that Emscripten SDK version lock-in. |
Yeah, it's probably not something they like to do, but it works like that because they include the .a file in the compilation process. That does allow them to use the WebAssembly more freely, but it comes with the downside that the compiled .a file can only be used with that Emscripten version, since the .a file is not a build result but an intermediary file in the build process to WebAssembly. I don't see that changing without .NET changing how it integrates the WebAssembly. |
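To illustrate the version lock-in described above, here is a small Python sketch of how a package with one static library per Emscripten version could pick the right one. It assumes an exact-match rule with no fallback, which follows from the .a file being an intermediary artifact. The function name, version strings, and paths are placeholders for this illustration, not the real layout of any NuGet package.

```python
from typing import Dict


def select_archive(archives: Dict[str, str], emscripten_version: str) -> str:
    """Return the static library built with exactly this Emscripten version.

    There is deliberately no "closest version" fallback: an archive built
    with another Emscripten release is treated as unusable.
    """
    try:
        return archives[emscripten_version]
    except KeyError:
        available = ", ".join(sorted(archives)) or "none"
        raise LookupError(
            f"no build for Emscripten {emscripten_version} (available: {available})"
        ) from None


# Placeholder versions and paths, one archive per Emscripten release.
archives = {
    "3.1.34": "wasm/3.1.34/libpdfium.a",
    "3.1.56": "wasm/3.1.56/libpdfium.a",
}
chosen = select_archive(archives, "3.1.34")
```

With this rule, supporting a new .NET release means adding one more entry (and one more build) rather than reusing an existing archive.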
@jerbob92, I didn't realize that Embind would allow you to get rid of |
I have started implementing this, but it's starting to look like I can't export everything from PDFium directly using Embind because it doesn't support everything. For example, it doesn't support incomplete types (which most PDFium types are) or pointers to native types, so I have been hitting some roadblocks. I'm currently in contact with Emscripten to see how we could implement this, but if it requires wrapping every function with an Embind function, this whole endeavor seems kinda pointless. I'll keep you updated. |
@bblanchon I promised to come back to you when I finished my WebAssembly implementation, so here it is! It's not ready to be merged, since it would break support for the web build, but it can be used as a base to create two WASM builds: web and standalone.
This contains the following changes:
The thing that I'm not really happy with is how the Emscripten patch is applied, but I couldn't get it to work in another way, since emsdk/upstream/emscripten is not a git repo.