Experimental wasmtime/redpanda integration #1

Open
Lazin opened this issue Mar 30, 2023 · 3 comments

@Lazin

Lazin commented Mar 30, 2023

Hi,
I did something similar with redpanda the other day. The approach that I took with consume fuel was to reset the fuel back after every yield. This way the wasm function could potentially run indefinitely. I also tried to use epoch termination and found that there is little difference in terms of performance. And it's difficult to use with seastar/redpanda because it requires an external thread that will update the epoch timer periodically.

Hope that could be useful. I didn't have any prior experience with wasm or wasmtime so everything there is likely very amateur.

@rockwotj
Owner

> I did something similar with redpanda the other day.

Nice! 😎 Thanks for sharing!

> The approach that I took with consume fuel was to reset the fuel back after every yield. This way the wasm function could potentially run indefinitely.

Yeah, I think there is a tradeoff here and I'm not sure what the right one is (maybe the answer is that it's configurable...). I think if a developer uploads a function that has a bug and never terminates we probably want to eventually stop it and report an error? Error handling needs a lot of thought about what the right behavior is. (How do you report this back to developers? Do you stop processing new work? Etc.)
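
To make the tradeoff concrete, here is a minimal sketch in plain Python that only simulates fuel metering. It does not call wasmtime; `FuelMeter`, `OutOfFuel`, `run_guest` and the `max_refills` knob are hypothetical names invented for illustration. With `max_refills=None` the guest is refueled after every yield and can run forever, which is the "reset after every yield" approach; with a number, a never-terminating guest is eventually stopped with an error.

```python
from typing import Optional


class OutOfFuel(Exception):
    """Raised when the guest has used up every refill it was allowed."""


class FuelMeter:
    """Simulated fuel counter (hypothetical, not the wasmtime API)."""

    def __init__(self, slice_fuel: int, max_refills: Optional[int] = None) -> None:
        self.slice_fuel = slice_fuel    # fuel granted before the guest must yield
        self.max_refills = max_refills  # None means "refill forever"
        self.remaining = slice_fuel
        self.refills = 0

    def consume(self, units: int) -> bool:
        """Burn fuel and return True when the guest has to yield to the host."""
        self.remaining -= units
        return self.remaining <= 0

    def refill(self) -> None:
        """Reset the fuel after a yield, unless the overall cap is reached."""
        if self.max_refills is not None and self.refills >= self.max_refills:
            raise OutOfFuel(f"guest stopped after {self.refills} refills")
        self.refills += 1
        self.remaining = self.slice_fuel


def run_guest(total_work: int, meter: FuelMeter) -> int:
    """Run `total_work` units of guest work and return the number of yields."""
    yields = 0
    for _ in range(total_work):
        if meter.consume(1):
            yields += 1      # the guest yields back to the scheduler here
            meter.refill()   # the host decides whether it may continue
    return yields
```

Making the cap a parameter is one way to express the "maybe it's configurable" idea: the same loop covers both behaviors.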

> And it's difficult to use with seastar/redpanda because it requires an external thread that will update the epoch timer periodically.

Yeah, the documentation states that fuel can be up to 2x-3x slower, but I didn't see that, and avoiding an alien thread to update the epoch timer seems like a win!

> Hope that could be useful. I didn't have any prior experience with wasm or wasmtime so everything there is likely very amateur.

You and me both! It is helpful. I saw that you enabled WASI - do you have any thoughts on WASI in redpanda? Allowing access to network and disk seems like it could be powerful, but I think we'd need to build our own WASI layer on top of seastar disk + network primitives. I believe the wasmtime defaults use blocking syscalls.

@Lazin
Author

Lazin commented Apr 3, 2023

> I think if a developer uploads a function that has a bug and never terminates we probably want to eventually stop it and report an error?

The current approach is to run a function for every record batch (https://docs.redpanda.com/docs/labs/data-transform/), so yes, there should be a limit. But the limit should probably be expressed in seconds, not in cycles. An alternative approach could be to run a loop in WASM. This loop could fetch batches, transform them and then produce. In this case you can run a WASM function per partition, and you can implement something like a top admin command for redpanda to see which of those functions consumes the most CPU. But I guess this is also possible with the first approach.
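
A limit expressed in seconds can still be enforced at fuel yield points, without a separate timer thread. The sketch below is a plain-Python simulation under that assumption: the guest is modeled as a generator that yields each time it runs out of fuel, and `run_with_deadline` and `DeadlineExceeded` are hypothetical names, not part of wasmtime or Redpanda.

```python
import time
from typing import Callable, Iterator


class DeadlineExceeded(Exception):
    """Raised when the guest is still running after its time budget."""


def run_with_deadline(
    guest: Iterator[None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Resume `guest` slice by slice until it finishes or the budget expires.

    Each item produced by `guest` stands for one "out of fuel" yield.
    Returns the number of slices that ran before the guest finished.
    """
    deadline = clock() + budget_seconds
    slices = 0
    while True:
        # The check happens only between slices, so the overshoot is bounded
        # by how long a single fuel slice can run.
        if clock() >= deadline:
            raise DeadlineExceeded(f"guest exceeded {budget_seconds}s after {slices} slices")
        try:
            next(guest)  # refuel and resume the guest for one more slice
        except StopIteration:
            return slices
        slices += 1
```

The fuel amount then only controls how often the host gets a chance to look at the clock, while the user-facing limit stays in seconds.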

> I saw that you enabled WASI - do you have any thoughts on WASI in redpanda?

I didn't have a chance to expose any host functions to WASM yet. But my preliminary research suggested that WASI is a way to do this.

@rockwotj
Owner

rockwotj commented Apr 3, 2023

> But the limit should probably be expressed in seconds, not in cycles.

💯 This is a great point - I agree.

> An alternative approach could be to run a loop in WASM. This loop could fetch batches, transform them and then produce.

I've been thinking about this in terms of an event-based model vs a sidecar model. The event-based model is what is implemented now, and the sidecar model is what you're describing here.

I think I prefer the event model at first glance - it gives Redpanda more control over the lifecycle and when things happen. So I think this makes error handling more straightforward and the API contract clearer without being overly restrictive.
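
The difference between the two models is mostly about who owns the loop. A rough sketch in plain Python, where `transform`, `run_event_model`, `HostApi` and `sidecar_main` are all hypothetical stand-ins rather than real Redpanda or wasmtime interfaces:

```python
from typing import Callable, Iterable, List, Optional

Batch = List[bytes]


def transform(batch: Batch) -> Batch:
    """Stand-in for the user's transform: uppercase every record."""
    return [record.upper() for record in batch]


def run_event_model(batches: Iterable[Batch], on_batch: Callable[[Batch], Batch]) -> List[Batch]:
    """Event model: the host owns the loop and calls the guest once per batch."""
    produced = []
    for batch in batches:
        # The host sees every call boundary, so limits and error handling
        # can be applied per batch right here.
        produced.append(on_batch(batch))
    return produced


class HostApi:
    """Sidecar model: the host only exposes fetch/produce to the guest."""

    def __init__(self, batches: Iterable[Batch]) -> None:
        self._pending = list(batches)
        self.produced: List[Batch] = []

    def fetch(self) -> Optional[Batch]:
        return self._pending.pop(0) if self._pending else None

    def produce(self, batch: Batch) -> None:
        self.produced.append(batch)


def sidecar_main(host: HostApi) -> None:
    """Guest-owned loop: runs until the host has nothing more to fetch."""
    while (batch := host.fetch()) is not None:
        host.produce(transform(batch))
```

Both paths produce the same output for the same input. In the sidecar version the host only observes the guest through `fetch` and `produce`, which is where the lifecycle control becomes less direct.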

I do like the top admin command. Inspection, instrumentation and the like are important here.
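
As a rough idea of the bookkeeping behind such a command, here is a small plain-Python sketch that accumulates time per (function, partition) pair and reports the heaviest consumers. `CpuAccounting` and its methods are hypothetical names; how the seconds are measured is left open.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

Key = Tuple[str, int]  # (function name, partition id)


class CpuAccounting:
    """Accumulates the time spent in each transform, per partition."""

    def __init__(self) -> None:
        self._seconds: DefaultDict[Key, float] = defaultdict(float)

    def record(self, function_name: str, partition: int, seconds: float) -> None:
        """Add the time one invocation (or one fuel slice) took."""
        self._seconds[(function_name, partition)] += seconds

    def top(self, n: int) -> List[Tuple[Key, float]]:
        """Return the `n` heaviest consumers, largest first."""
        ranked = sorted(self._seconds.items(), key=lambda item: item[1], reverse=True)
        return ranked[:n]
```

For example, after recording 0.5s and 0.25s for `("t1", 0)` and 0.25s for `("t2", 1)`, `top(1)` returns the `("t1", 0)` entry with 0.75s.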

> I didn't have a chance to expose any host functions to WASM yet. But my preliminary research suggested that WASI is a way to do this.

I think there is a subset of WASI we'll want to expose. Logging to stdout/stderr, environment variables (for configuration) and clocks will all be useful to take advantage of upstream tooling. I'm not quite sure about its other features, like I/O. I think we'll want to expose a Redpanda-specific interface for I/O, if any.
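
One way to picture such a subset is an allow-list of WASI imports that is checked before a module is accepted. The sketch below is plain Python and only illustrative: the import names are WASI preview1 function names, but which ones belong on the list is an assumption, and `rejected_imports` and `is_write_allowed` are hypothetical helpers.

```python
from typing import Iterable, List

# Assumed subset: logging, environment variables and clocks only.
ALLOWED_IMPORTS = frozenset({
    "fd_write",           # needed for stdout/stderr logging
    "environ_get",        # environment variables (configuration)
    "environ_sizes_get",
    "clock_time_get",     # clocks
    "clock_res_get",
})

STDOUT, STDERR = 1, 2


def rejected_imports(requested: Iterable[str]) -> List[str]:
    """Return the requested imports that fall outside the allowed subset."""
    return sorted(name for name in requested if name not in ALLOWED_IMPORTS)


def is_write_allowed(fd: int) -> bool:
    """Allow `fd_write` only for stdout and stderr, not for arbitrary files."""
    return fd in (STDOUT, STDERR)
```

A module that imports something like `path_open` or `sock_recv` would be rejected at load time under this policy, leaving disk and network access to a separate Redpanda-specific interface.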
