Handling blocking tasks and long-running tasks #123
Hello, I'm writing a web server that uses Weave: handling blocking tasks is exactly the interesting topic there! There's a main event loop (in dispatcher.nim) that waits for sockets from a selector and then spawns tasks that will mostly:
So tasks will spend a lot of time waiting, and there is limited opportunity to inject those `loadBalance` requests. I have some preliminary benchmark results here, comparing 4-threaded, single-threaded and httpbeast-based serving, plus some statistics from Weave. I have no idea how to interpret those Weave statistics. If you could install GuildenStern and run the benchmarks yourself, I hope you might get some insight — not only about GuildenStern, but about how Weave behaves in this kind of environment. I'm certain that I'll have some questions to ask as my work advances. I'll try to write those up as generic issues so that they may be interesting to a larger audience as well. Keep up the good work!
4 CPU cores should be plenty. Looking at your use case, and especially the ThreadContext, it seems like you need to pin some permanent data to a thread. I'll think about that; the "long-running tasks" part should handle it. One thing to note regarding latency is that Weave currently maximizes throughput:
I'm not familiar with webservers, but since there is a nice benchmark, I can play with it.
Somewhat linked to #22, or possibly even a solution to it.
Vocabulary reminder:
Confusingly, this is a different context from CPU-bound workloads and memory-bound workloads, which are about resources, while in our context we care about progress. Improvements in nomenclature are welcome.
Blocking tasks
Blocking tasks, for example `stdin.readline()`, shouldn't be scheduled on the Weave threadpool. While blocked, Weave would lose complete control of the thread and it would not be available for scheduling; this would mean:
Instead a `spawnBlocked` or `spawnDedicated` function should be provided that will spin up a thread that does not participate in Weave scheduling and can be blocked safely.
Open questions:
- Do we `createThread` "just-in-time", or do we prepare one or some threads in advance? `readline` will usually be called in a `while true` loop, so recycling the thread is desired.
- When do we join such a thread? `joinIfNonBlocking()`?
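To make the proposal concrete, here is a sketch of how user code could call such an API. `spawnBlocked` does not exist in Weave today; its name and semantics are exactly the proposal above, and `process` is a placeholder:

```nim
# Hypothetical API sketch, not current Weave.
# `spawnBlocked` spins up (or recycles) a thread OUTSIDE the Weave
# threadpool, so an indefinitely blocking `stdin.readline()` cannot
# starve the work-stealing scheduler.
proc watchStdin() =
  while true:
    let line = stdin.readLine() # may block indefinitely: safe here,
    process(line)               # because this thread is not a Weave worker

let handle = spawnBlocked watchStdin()
# ... meanwhile the Weave threadpool keeps scheduling normal tasks ...
```

Whether `handle` is joined eagerly, lazily via `joinIfNonBlocking()`, or only at shutdown is one of the open questions above.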
Long-running tasks
A bit similar to the Actor model, we might want to spin up event-loop tasks that should run for a very long time. Re-using the image compression example, in pseudo-code:
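The original pseudo-code did not survive this copy of the thread; the following is only an illustrative reconstruction of the idea, where `spawnDedicated`, `compressionService`, `compress` and `reply` are all assumed names, not real APIs:

```nim
# Illustrative pseudo-code only: an actor-like, long-running task that
# owns its own event loop, spawned on a dedicated (non-worker) thread.
proc compressionService(inbox: Channel[Image]) =
  while true:
    let img = inbox.recv()          # wait for work (may block)
    let compressed = compress(img)  # heavy parts could still `spawn`
    reply(compressed)               # hand the result back to the server

let svc = spawnDedicated compressionService(imageInbox)
```

The key point is that the service loop never returns, so it must not occupy a Weave worker.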
This is all fine and dandy, but:
- `spawnDedicated`
- `Flowvar.isReady()`, to allow async code to check if `sync()` will block.
- `spawn` currently enqueues the task on the current worker's queue. But a "dedicated" thread is not a worker on the Weave threadpool, and should instead enqueue on (and wake up) a random worker from the threadpool.
Fair scheduling
Due to how Weave works, it maximizes throughput: if a client requests processing of a big image, that processing will take more time, which might not be fair to other clients.
I don't see how that can be solved without resumable functions; once a task is started, it runs to completion.
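One partial workaround — my suggestion, not something Weave provides — is to split a big task into bounded chunks that re-spawn their own continuation, so other pending tasks can interleave between chunks. `Image`, `processRows` and `ChunkSize` are illustrative placeholders:

```nim
# Sketch only: manual task-splitting to approximate fair scheduling
# without resumable functions.
proc processChunked(img: Image, start: int) =
  let stop = min(start + ChunkSize, img.height)
  processRows(img, start, stop)     # do a bounded slice of the work
  if stop < img.height:
    # Re-enqueue the remainder as a fresh task instead of looping,
    # giving the scheduler a chance to run other clients' tasks.
    spawn processChunked(img, stop)
```

This trades some throughput (extra scheduling overhead per chunk) for latency fairness, which matches the throughput-versus-latency tension described above.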
Priority tasks
See Latency-optimized / job priorities / soft real-time parallel scheduling #88
Maximizing throughput in the async world
In the image server example, the main thread never participates in the image processing; all it does is load balancing (i.e. waking up idle threads if tasks are pending).
However, currently on a 4-core CPU, Weave would spawn a threadpool with 3 threads and assume that the main thread will participate in the work.
In an async context (and in the example), this is not true: `sync`, `syncScope` and `syncRoot` should not be called on the root task except at shutdown, to avoid blocking async event handling, so we only use 75% of the available CPU for image processing.
Instead we should make the number of threads an `init` input parameter, so that there can be 4 worker threads (mostly CPU) + 1 root thread (mostly IO) to maximize resource usage.
Autodetect blocking tasks
We could add autodetection of blocking tasks, migrating them outside the threadpool.
But I don't want to try to solve the Halting Problem.
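One heuristic that stops short of the Halting Problem — an assumption on my part, not a Weave design — is a watchdog that flags a worker as "probably blocked" when it makes no scheduling progress within a deadline, and compensates by waking or spawning a spare thread. Every name below is illustrative:

```nim
# Sketch of a heartbeat-based heuristic, all names illustrative.
# Each worker bumps a counter per scheduling step; a watchdog treats
# a stalled counter as a probably-blocked worker.
var heartbeats: array[MaxWorkers, Atomic[int]]

proc workerLoop(id: int) =
  while true:
    runOneTask()                       # normal scheduling step
    discard heartbeats[id].fetchAdd(1) # progress signal

proc watchdog() =
  var last: array[MaxWorkers, int]
  while true:
    sleep(BlockedThresholdMs)
    for id in 0 ..< numWorkers:
      let now = heartbeats[id].load()
      if now == last[id]:
        compensateBlockedWorker(id)    # e.g. wake or spawn a spare thread
      last[id] = now
```

Note the fundamental limitation: this cannot distinguish "blocked on I/O" from "one very long CPU-bound task", which is exactly why full autodetection is undecidable in general.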