Nodegroup messages are too low a priority #3674

lvkale · 2022-12-01T16:02:26Z

I noted this in a tram-replacement experiment, with histogram code from bale suite. Nodegroup messages carrying boatload of data were arriving on the node but not being processed as all PEs were busy cycling through there own locally generated work. (Although the work generation method was broken down using CThYield() every x iterations). @nilsdeppe also seconded this request on slack.
What is happening: scheduler's function for finding next message always processes non-local (i.e. messages from others but meant for my PE) and local (from my pe) messages, so nodeGroup messages, although they can be processed by any PE, get delayed.
The request is to allow nodegroup messages to execute earlier.. somehow.

lvkale · 2022-12-01T16:05:53Z

A nuance of intent: Do we want to think of nodegroup messages as a within-node load balancing mechanism also? If used for that purpose, it should be low priority. "Finish all your local work assigned to you and then pay attention to nodegroup messages" makes sense for this scenario. However we do have task queue for that purpose. And nodegroup messages have another purpose: to quickly process data arriving from outside the node. (Basically saying: any pe, please execute this, it is urgent so I don't want to wait for any particular PE to become available). The common use case appears to be the latter, so its worth fixing.

lvkale · 2022-12-01T16:08:21Z

I have tested a fix in my copy, which works, but can sometimes slow things down. (I think because checking the nodeq, which is accessed via lock/fence, has a cost.. But also because changing order in which things happen can have unexpected consequences). But mostly it does what we expect. I will work on making a good branch with that fix and push a PR. (The idea is to check nodegroup messages with highest priority once in every K iterations of the scheduler.)

stwhite91 · 2022-12-01T16:09:52Z

If you want a nodegroup entry method to be executed urgently upon arrival, isn't that what [immediate] is for?

lvkale · 2022-12-01T18:25:12Z

Immediate will execute on commthread, which we don't want in this case. (comm thread will become a bottleneck). Also, we do want to parallelize/distribute the (not insignificant) work of handling incoming messages to all PEs... not for each message, but across them.

lvkale linked a pull request Dec 3, 2022 that will close this issue

Modifies converse schedeuler to prioritize NodeGroup messages #3676

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Nodegroup messages are too low a priority #3674

Nodegroup messages are too low a priority #3674

lvkale commented Dec 1, 2022

lvkale commented Dec 1, 2022

lvkale commented Dec 1, 2022

stwhite91 commented Dec 1, 2022

lvkale commented Dec 1, 2022 •

edited

Loading

Nodegroup messages are too low a priority #3674

Nodegroup messages are too low a priority #3674

Comments

lvkale commented Dec 1, 2022

lvkale commented Dec 1, 2022

lvkale commented Dec 1, 2022

stwhite91 commented Dec 1, 2022

lvkale commented Dec 1, 2022 • edited Loading

lvkale commented Dec 1, 2022 •

edited

Loading