[Feature Request, Nodejs 10.5+] Execute Workers inside worker threads #253
Comments
Closing for now... until that API becomes a little more stable...
@evantahler worker threads are stable as of Node v12 and can be polyfilled for older versions of Node using a lib like https://github.com/chjj/bthreads.
@naz cool! Can you share some of the benefits you'd like to see with a threads implementation? Of course, moving CPU-bound jobs to another thread is a good idea. I'm a little worried about the need to really re-instantiate the whole process to get a worker ( Either way, I think the place to try this out would be inside the multiWorker - with each worker (node-resque) being a new worker (node.js). I see some grammar issues in our future!
Perhaps re-open this issue to keep it visible.
Hey @evantahler! For now, my main use case has been offloading CPU-intensive work from the main thread/event loop. The cost of creating a worker instance is a real concern, and I haven't found a good approach to it just yet. The best way to reduce that cost is a thread pool; an example implementation is documented in the Node docs. To be completely clear, I am not actively using node-resque. For my use case, all queuing/scheduling has to be done in memory. I am experimenting with bree at the moment, and it uses bthreads under the hood to polyfill worker threads. bthreads has a worker pool implemented (I haven't looked under the hood yet), and from the looks of it its main purpose is parallelization rather than saving on worker creation cost. I was researching node-resque's codebase to see how and why things are done a certain way 😅 I didn't see worker threads used here and thought a ping would spark a conversation. I'd be happy to use this issue as a discussion ground for best approaches and knowledge sharing in the context of background job processing!
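To make the thread pool idea above concrete, here is a minimal sketch of a fixed-size pool built on `worker_threads`. The names `TinyPool` and `workerSource` and the Fibonacci workload are made up for illustration; this is not the pool from the Node docs, bthreads, or bree, and it leaves out things a real pool needs (replacing crashed workers, back-pressure, per-task timeouts).

```javascript
const { Worker } = require("worker_threads");

// Worker source kept inline (eval mode) so the sketch fits in one file;
// a real project would more likely point Worker at a separate file.
const workerSource = `
  const { parentPort } = require("worker_threads");
  // Deliberately naive recursive Fibonacci standing in for CPU-bound work.
  const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
  parentPort.on("message", ({ n }) => {
    parentPort.postMessage({ result: fib(n) });
  });
`;

class TinyPool {
  constructor(size) {
    this.workers = [];
    this.idle = [];
    this.queue = [];
    // Pay the worker start-up cost once, up front, instead of per job.
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task; resolves with fib(n) once a worker has computed it.
  run(n) {
    return new Promise((resolve, reject) => {
      this.queue.push({ n, resolve, reject });
      this.drain();
    });
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  drain() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const task = this.queue.shift();
      const onMessage = ({ result }) => {
        worker.off("error", onError);
        this.idle.push(worker); // worker is reusable again
        task.resolve(result);
        this.drain();
      };
      const onError = (err) => {
        // A worker that threw is gone; this sketch does not replace it.
        worker.off("message", onMessage);
        task.reject(err);
      };
      worker.once("message", onMessage);
      worker.once("error", onError);
      worker.postMessage({ n: task.n });
    }
  }

  // Stop all workers so the process can exit.
  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}
```

With a pool of two, `pool.run(30)` called three times starts two computations immediately and queues the third until a worker frees up; `pool.close()` has to be called at the end because idle workers keep the process alive.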
@naz yeah, let's chat! My world-view is roughly that these are the types of background task systems that can exist (from https://blog.evantahler.com/background-tasks-in-node-js-a-survey-with-redis-971d3575d9d2) ... and when I talk about background tasks, I generally mean those that are:
So with that worldview, node-resque really zooms in on the use-case of an API deployed across multiple servers. I think in your case, you are working on what I called
We are on the same page about the world-view, and you are spot on about the case I'm trying to solve right now. In the future there will be a need for a hybrid solution where the core processing of foreground/parallel/local messages is all done by the same "job manager", with an option of giving the manager a way to have its work queue persisted. In other words, the job manager will be able to change its task strategy. What I think this project might gain from using Worker (from worker_threads) or a forked process (from child_process) is a utility aspect (communication is something to solve, but it doesn't have to be solved immediately, imo). The utility of a separate worker thread or forked process would be "sandboxing" workers from the parent event loop, which allows them to fail or leak memory without crashing the parent process, introduces non-blocking parallelism when there are multiple CPU-intensive jobs to be done, and makes it possible to terminate jobs that have become stuck. With the above in mind, I don't think much of node-resque's API needs to change, apart from allowing the creation of a new (modified) type of Worker that "forks" into a thread or child process. Because of the idempotent nature of background tasks, a worker definition should ideally be self-contained: it should be able to connect to resources without any additional inputs except a few parameters specific to a task (which comes with the overhead of recreating all the connections). Maybe I'm way off with this thinking, but hopefully it helps :)
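The "sandboxing" benefits listed above (surviving a failing job, terminating a stuck one) can be sketched roughly as follows. `runIsolated` is a hypothetical helper, not part of node-resque; the job source is passed as a string in eval mode only to keep the sketch in one file.

```javascript
const { Worker } = require("worker_threads");

// Hypothetical helper: run `source` in its own worker thread and reject if it
// throws, exits without replying, or takes longer than `timeoutMs`.
function runIsolated(source, workerData, timeoutMs) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData });
    let timer = null;
    let settled = false;

    const finish = (settle, value) => {
      if (settled) return; // only the first outcome counts
      settled = true;
      clearTimeout(timer);
      worker.terminate(); // stops the thread even if it is in a busy loop
      settle(value);
    };

    // The parent event loop keeps running, so this timer still fires while
    // the worker is stuck.
    timer = setTimeout(
      () => finish(reject, new Error(`job timed out after ${timeoutMs} ms`)),
      timeoutMs
    );

    worker.once("message", (result) => finish(resolve, result));
    // An uncaught exception in the worker surfaces here, not as a crash of
    // the parent process.
    worker.once("error", (err) => finish(reject, err));
    worker.once("exit", (code) =>
      finish(reject, new Error(`worker exited with code ${code}`))
    );
  });
}
```

For example, `runIsolated("while (true) {}", null, 200)` rejects with a timeout error after about 200 ms while the parent stays responsive. Containing memory leaks would additionally need something like the `resourceLimits` option of the `Worker` constructor, which this sketch does not set.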
I think that makes a lot of sense, and is a good idea! I guess my concerns can all be met by making the use of Implementation Questions:
In either case, we would pass the name of the job and I think we really need to be clear about the limitations and isolation for using
jobs = [{
  sendEmail: async (userId) => {
    // look the user up first; await in case findOne returns a promise
    const user = await User.findOne(userId)
    await emailThing(user).send()
  }
}]
This way of writing the job assumes you have already connected your I'll try to get an example going soon!
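As a rough illustration of the isolation concern above: a job running in a worker thread cannot rely on connections opened in the parent, so it would receive only the job name and arguments and set up its own resources. Everything here is a hypothetical stand-in (`connect`, `perform`, the in-memory user table); it is not how node-resque defines or dispatches jobs.

```javascript
const { Worker } = require("worker_threads");

// Source of a self-contained job runner, inlined so the sketch is one file.
const jobWorkerSource = `
  const { parentPort, workerData } = require("worker_threads");

  // Hypothetical stand-in for opening a real database connection. It runs
  // inside the worker because the parent's connections are not shared.
  async function connect() {
    const users = new Map([[1, { id: 1, email: "user1@example.com" }]]);
    return { findUser: async (id) => users.get(id) };
  }

  const jobs = {
    sendEmail: async (db, userId) => {
      const user = await db.findUser(userId);
      // A real job would send the email; the sketch only reports it.
      return "would send email to " + user.email;
    },
  };

  (async () => {
    const { jobName, args } = workerData;
    const db = await connect();
    parentPort.postMessage(await jobs[jobName](db, ...args));
  })();
`;

// Only the job name and plain, cloneable arguments cross the thread boundary.
function perform(jobName, args) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(jobWorkerSource, {
      eval: true,
      workerData: { jobName, args },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}
```

Calling `perform("sendEmail", [1])` starts a worker, which connects, runs the job, and replies. The cost is that every job pays for its own connection setup, which is the overhead mentioned earlier in the thread.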
Lol - to test this out I decided to calculate Fibonacci numbers in background tasks while on laptop battery... that was not a smart idea.
Just to keep some references around - breejs/bree#45 is an issue in an alternative job manager lib. It will hopefully contain specific performance implications of running worker threads or forking processes (or might borrow data from here 😅). For my current use case I have decided to stick with bree for now, as it's much more lightweight and easier to adjust to my current in-memory queuing needs. Will be lurking around here for sure!
Node.js 10.5 has a new experimental feature: worker threads. It would be cool if node-resque ran its jobs inside worker threads.
One huge benefit I can think of is that you could kill stuck jobs.