Experimental wasmtime/redpanda integration #1
Comments
Nice! 😎 Thanks for sharing!
Yeah, I think there is a tradeoff here and I'm not sure what the right one is (maybe the answer is that it's configurable...). If a developer uploads a function that has a bug and never terminates, we probably want to eventually stop it and report an error? I think error handling needs a lot of thought about what the right behavior is. (How do you report this back to developers? Do you stop processing new work? Etc.)
Yeah, the documentation states that fuel can be up to 2x-3x slower, but I didn't see that, and avoiding an alien thread to update the epoch timer seems like a win!
You and me both! It is helpful. I saw that you enabled WASI - do you have any thoughts on WASI in Redpanda? Allowing access to network and disk seems like it could be powerful, but I think we'd need to build our own WASI layer on top of seastar's disk and network primitives. I believe wasmtime's defaults use blocking syscalls.
The current approach is to run a function for every record batch (https://docs.redpanda.com/docs/labs/data-transform/), so yes, there should be a limit. But the limit should probably be expressed in seconds, not in cycles. An alternative approach could be to run a loop in WASM. This loop could fetch batches, transform them and then produce. In this case you can run a WASM function per partition and you can implement something like a
I haven't had a chance to expose any host functions to WASM yet. But my preliminary research suggested that WASI is a way to do this.
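To make the "seconds, not cycles" idea concrete, here is a minimal sketch of a wall-clock budget applied to one invocation. Everything in it is hypothetical: `run_with_deadline`, `DeadlineExceeded` and the injectable `clock` are illustration names, not Redpanda or wasmtime APIs. It is also cooperative, meaning the deadline is only checked between steps, so a real engine would still need fuel or epoch interruption to stop a step that never returns.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when an invocation runs past its wall-clock budget."""


def run_with_deadline(steps, limit_seconds, clock=time.monotonic):
    """Run zero-argument callables in order under a time budget.

    The deadline is checked before each step, so this only bounds work
    that is split into steps; it cannot interrupt a single stuck step.
    Returns the number of steps that ran.
    """
    deadline = clock() + limit_seconds
    done = 0
    for step in steps:
        if clock() >= deadline:
            raise DeadlineExceeded(
                f"exceeded {limit_seconds}s after {done} steps"
            )
        step()
        done += 1
    return done
```

Passing the clock in as a parameter keeps the limit expressed in seconds while still letting the behavior be tested deterministically with a fake clock.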
💯 This is a great point - I agree.
I've been thinking about this in terms of an event-based model vs a sidecar model. The event-based model is what is implemented now and the sidecar model is what you're describing here. I think I prefer the event model at first take - it gives Redpanda more control over the lifecycle and when things happen. So I think this makes error handling more straightforward and the API contract clearer without being overly restrictive. I do like the
I think there is a subset of WASI we'll want to expose. Logging to stdout/stderr, environment variables (for configuration) and clocks will all be useful for taking advantage of upstream tooling. I'm not quite sure about its other features, like I/O. I think we'll want to expose a Redpanda-specific interface for I/O, if any.
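The difference between the two models comes down to who owns the loop. A rough sketch in plain Python, with no WASM involved: `host_event_loop`, `guest_sidecar_main`, `fetch` and `produce` are hypothetical names used only for illustration, not an existing Redpanda interface.

```python
def upper(batch):
    """Example transform: uppercase every record in a batch."""
    return [record.upper() for record in batch]


def host_event_loop(batches, on_batch):
    """Event model: the host owns the loop and calls the guest per batch.

    The host sees every call boundary, so it can time, retry, or stop
    the guest between batches.
    """
    return [on_batch(batch) for batch in batches]


def guest_sidecar_main(fetch, produce, transform):
    """Sidecar model: the guest owns the loop and pulls work itself.

    fetch() and produce() stand in for host functions exposed to the
    guest; fetch() returns None when there is no more input.
    """
    while True:
        batch = fetch()
        if batch is None:
            break
        produce(transform(batch))
```

Both produce the same output for the same input. What changes is where lifecycle and error handling live: in the host loop for the event model, or behind the `fetch`/`produce` calls for the sidecar model.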
Hi,
I did something similar with Redpanda the other day. The approach that I took with consume fuel was to reset the fuel after every yield. This way the wasm function could potentially run indefinitely. I also tried to use epoch termination and found that there is little difference in terms of performance. It is also difficult to use with seastar/Redpanda because it requires an external thread that updates the epoch timer periodically.
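As a toy model of the "reset the fuel after every yield" idea (not the wasmtime API: `FuelMeter`, `run_guest` and the generator-based "guest" are all made up for illustration), the budget bounds the work done between two yields rather than the total work:

```python
class OutOfFuel(Exception):
    """Raised when a guest exhausts its budget without yielding."""


class FuelMeter:
    """Toy fuel counter; the budget applies per slice between yields."""

    def __init__(self, budget):
        self.budget = budget
        self.remaining = budget

    def consume(self, units=1):
        if units > self.remaining:
            raise OutOfFuel(f"more than {self.budget} units between yields")
        self.remaining -= units

    def refill(self):
        self.remaining = self.budget


def run_guest(guest, meter):
    """Drive a generator-based guest.

    The guest yields an int (fuel cost of the work it just did) or
    None (a cooperative yield back to the host). Fuel is refilled on
    every yield. Returns the number of yields observed.
    """
    yields = 0
    for event in guest:
        if event is None:
            meter.refill()
            yields += 1
        else:
            meter.consume(event)
    return yields


def well_behaved(batches):
    for _ in range(batches):
        yield 60    # cost of transforming one batch
        yield None  # hand control back to the host


def runaway():
    while True:
        yield 1  # keeps working and never yields to the host
```

With a budget of 100, `well_behaved(1000)` runs to completion even though it burns 60,000 units in total, while `runaway()` is stopped once it uses up a single slice.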
Hope this is useful. I didn't have any prior experience with wasm or wasmtime, so everything there is likely very amateur.