[RFC] Adding API for parallel block to task_arena to warm-up/retain/release worker threads #1522

pavelkumbrasev · 2024-10-01T14:07:13Z

Adding API for parallel block to task_arena to warm-up/retain/release worker threads

Signed-off-by: pavelkumbrasev <[email protected]>

vossmjp · 2024-10-01T19:39:52Z

rfcs/proposed/parallel_block_for_task_arena/README.md

+
+```cpp
+class task_arena {
+    void indicate_start_of_parallel_block(bool do_warmup = false);


How about

void retain_threads();
void release_threads();

*_parallel_block is misleading since even your example shows serial parts of the region.

Hmm, I think retain_threads and release_threads imply an unnecessary guarantee, namely that the threads will actually be retained.
Should it be something more relaxed?
Perhaps make_sticky and make_unsticky suit better, because we can define what "sticky" means.

Do you think "sticky" could make people think of thread-to-core affinity? Since constraints are used for affinity, I think the likelihood of confusion is low, so I'm OK with make_sticky and make_unsticky.

Shouldn't all of this be about work rather than threads? Threads are execution resources that should not be exposed to the user, should they? That is the original idea of the TBB library. Therefore, I suggest something like expect_[more/less_]parallel_work or assume_[more/less_]parallelism, which is suitably loose terminology for what the library should tend to "think" about the user's code when this API is used.

TBB already exposes some level of "thread logic" through observers. I'm not sure whether this API should expose this logic too.
If we want to extend these functions with additional guarantees such as "warm-up" or "leave earlier", perhaps we cannot ignore threads completely.
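To make the naming question concrete, here is a minimal sketch of how a region-based hint could be wrapped in an RAII guard. `mock_arena`, `make_sticky`, `make_unsticky`, `sticky_scope`, and `run_region` are hypothetical stand-ins used only for illustration; none of them is an existing oneTBB API. The mock merely records the hint and does not manage threads, and the nesting counter is an assumption of this sketch rather than something the RFC specifies.

```cpp
#include <cassert>

// Hypothetical stand-in for tbb::task_arena; it only records the hint.
class mock_arena {
public:
    // Hint: more parallel work is expected soon, so workers may be retained.
    void make_sticky() { ++sticky_depth_; }
    // Hint: the region is over, so workers may be released.
    void make_unsticky() { if (sticky_depth_ > 0) --sticky_depth_; }
    bool is_sticky() const { return sticky_depth_ > 0; }
private:
    int sticky_depth_ = 0;  // nesting depth (an assumption of this sketch)
};

// RAII guard so that the end-of-region hint is not forgotten on early exit.
class sticky_scope {
public:
    explicit sticky_scope(mock_arena& a) : arena_(a) { arena_.make_sticky(); }
    ~sticky_scope() { arena_.make_unsticky(); }
    sticky_scope(const sticky_scope&) = delete;
    sticky_scope& operator=(const sticky_scope&) = delete;
private:
    mock_arena& arena_;
};

// Example region that may contain both parallel and serial phases.
inline bool run_region(mock_arena& arena) {
    sticky_scope guard(arena);
    // ... parallel phase, serial phase, parallel phase ...
    return arena.is_sticky();  // true while the guard is alive
}
```

Whatever names are chosen, a guard like this keeps the start and end hints paired, which matters because the region can include serial parts where it is easy to return early.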

vossmjp · 2024-10-01T19:54:49Z

rfcs/proposed/parallel_block_for_task_arena/README.md

+namespace this_task_arena {
+    void indicate_start_of_parallel_block(bool do_warmup = false);
+    void indicate_end_of_parallel_block(bool disable_default_block_time = false);
+    void disable_default_block_time();


The end user doesn't know what the default block time is, and it will be platform dependent. The first set of functions indicates a region of interest, whereas these default_block_time functions change a property of the task_arena that is not tied to a region. That makes me think it is better as a constraint. Are there known cases where this needs to be disabled and then re-enabled dynamically?

If the first two functions became something like retain_threads and release_threads, what would these be named? What about set_sleep_policy( sleep_quickly | sleep_slowly ) or something like that?

Perhaps it is a good idea to move it to constraints. If you need different guarantees, just use different arenas; the same applies to priorities.
If we include the property as part of the constraints, which in turn represent HW resources, perhaps the name should reflect how these resources will be used, like greedy or something.

@akukanov what do you think?

A couple questions:

How would users think about "greedy" relative to per-arena priorities? It might imply some kind of priority between a greedy normal arena and non-greedy normal arena, even though that wouldn't be the case.

In suggesting sleep_quickly and sleep_slowly, which I admit are not great names, I was trying to find something that says more about the wastefulness of holding onto resources once you have them and there is nothing better to do with them, as opposed to greediness in acquiring resources, perhaps from other competing arenas. I think this is the key point of disable|enable_default_block_time: while it is a form of greediness, it is more about the amount of wastefulness tolerated to reduce startup latency on the next parallel algorithm.

When I think about this, being nice/responsive to the demand from other arenas comes to mind. But I am not sure how best to combine these two sets of APIs, as they are kind of mutually exclusive. Consider: expect more parallel work to appear but be responsive to resources demand from other arenas. Actually, this counterintuitive combination applies to the proposed design as well. Perhaps we need to express that mutual exclusiveness somehow in the API.

expect more parallel work to appear but be responsive to resources demand from other arenas.

What part of the proposal is stating this? (Expects both properties simultaneously)

It does not state this explicitly, but it sort of implies the question: what will it mean if I invoke indicate_start_of_parallel_block and call disable_default_block_time right after that?
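The two alternatives discussed in this thread can be contrasted in a small sketch. Every name here (`sleep_policy`, `mock_constraints`, `mock_arena`, `start_region`, `end_region`, `may_retain_workers`) is hypothetical and not an existing oneTBB API. The rule used to resolve conflicting hints is only one possible choice, picked for illustration; the RFC does not define it.

```cpp
#include <cassert>

// All names below are hypothetical; none of them is an existing oneTBB API.
enum class sleep_policy { sleep_quickly, sleep_slowly };

// Option A: the policy is a fixed property chosen at construction,
// similar in spirit to how constraints are passed to an arena.
struct mock_constraints {
    sleep_policy policy = sleep_policy::sleep_slowly;
};

class mock_arena {
public:
    explicit mock_arena(mock_constraints c = {}) : policy_(c.policy) {}

    // Option B: region hints that can be toggled dynamically.
    void start_region() { in_region_ = true; }
    void end_region() { in_region_ = false; }

    // One possible resolution of conflicting hints (an assumption):
    // the arena-wide policy wins over the region hint.
    bool may_retain_workers() const {
        return in_region_ && policy_ == sleep_policy::sleep_slowly;
    }
private:
    sleep_policy policy_;
    bool in_region_ = false;
};
```

With the policy fixed at construction, the "start a region, then disable the block time" combination cannot be written at all for a single arena, which is one way to express the mutual exclusiveness raised above. A user who needs both behaviors would create two arenas.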

aleksei-fedotov

Overall, the text reads as too certain about what will or will not happen when the new API is used. I think the explanation should be written in vaguer terms, using more words like "may" and "might", essentially conveying the idea that all of this is up to the implementation and serves as a hint rather than a concrete behavior.

What do others think?


aleksei-fedotov · 2024-10-07T13:05:44Z

rfcs/proposed/parallel_block_for_task_arena/README.md

+
+```cpp
+class task_arena {
+    void indicate_start_of_parallel_block(bool do_warmup = false);


Shouldn't all of this be about work rather than threads? Threads are execution resources that should not be exposed to the user, should they? That is the original idea of the TBB library. Therefore, I suggest something like expect_[more/less_]parallel_work or assume_[more/less_]parallelism, which is suitably loose terminology for what the library should tend to "think" about the user's code when this API is used.

aleksei-fedotov · 2024-10-07T13:46:44Z

rfcs/proposed/parallel_block_for_task_arena/README.md

+namespace this_task_arena {
+    void indicate_start_of_parallel_block(bool do_warmup = false);
+    void indicate_end_of_parallel_block(bool disable_default_block_time = false);
+    void disable_default_block_time();


When I think about this, being nice/responsive to the demand from other arenas comes to mind. But I am not sure how best to combine these two sets of APIs, as they are kind of mutually exclusive. Consider: expect more parallel work to appear but be responsive to resources demand from other arenas. Actually, this counterintuitive combination applies to the proposed design as well. Perhaps we need to express that mutual exclusiveness somehow in the API.

Co-authored-by: Aleksei Fedotov <[email protected]>

pavelkumbrasev · 2024-10-07T14:12:44Z

Overall, the text reads as too certain about what will or will not happen when the new API is used. I think the explanation should be written in vaguer terms, using more words like "may" and "might", essentially conveying the idea that all of this is up to the implementation and serves as a hint rather than a concrete behavior.

What do others think?

I tried to indicate that this set of APIs is a hint to the scheduler. But if you believe that we can relax these guarantees even more, I think we should do so.

Initial commit

215a17d

Signed-off-by: pavelkumbrasev <[email protected]>

pavelkumbrasev requested review from aleksei-fedotov, vossmjp, akukanov, dnmokhov and isaevil October 1, 2024 14:07

vossmjp reviewed Oct 1, 2024


vossmjp changed the title ~~Adding API for parallel block to task_arena to warm-up/retain/release worker threads~~ [RFC} Adding API for parallel block to task_arena to warm-up/retain/release worker threads Oct 3, 2024

vossmjp changed the title ~~[RFC} Adding API for parallel block to task_arena to warm-up/retain/release worker threads~~ [RFC] Adding API for parallel block to task_arena to warm-up/retain/release worker threads Oct 3, 2024

aleksei-fedotov reviewed Oct 7, 2024


Apply suggestions from code review

74bf599

Co-authored-by: Aleksei Fedotov <[email protected]>
