Motivation: Why do you think this is important?
Today we have max-parallelism to control how many nodes will be started (asynchronously) per workflow mutation iteration. But for synchronous nodes this has no effect: when a synchronous node execution finishes, its state is immediately set to success, a terminal state, so parallelism is never incremented, and the result is a worker hogged by a single workflow.
This could be solved by making node execution asynchronous, but in many cases a node execution cannot be asynchronous, or it does not make sense for it to be, for example a node that simply does string processing in memory. In those cases, if the total number of workers is small and there are a few extremely large, heavy fan-out workflows, it is easy to end up with unfair scheduling: those workflows keep all the workers busy while other small workflows wait for a long time.
So it seems reasonable to have control over the maximum number of nodes executed per workflow mutation iteration.
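To illustrate the fairness effect of such a cap, here is a minimal Python sketch (Flyte propeller itself is written in Go; the function and workflow names here are hypothetical and not part of any Flyte API). It round-robins over workflows, executing at most a fixed number of synchronous nodes per workflow per mutation iteration, so a small workflow is not stuck behind a large fan-out:

```python
from collections import deque

def run_iterations(workflows, max_nodes_per_iteration):
    """Toy scheduler: each mutation iteration executes at most
    max_nodes_per_iteration synchronous nodes of the current workflow,
    then yields the worker back to the queue.

    workflows: dict mapping workflow name -> number of pending nodes.
    Returns the sequence of workflow names in the order their nodes
    actually got worker time.
    """
    schedule = []
    queue = deque(workflows.items())
    while queue:
        name, pending = queue.popleft()
        executed = min(pending, max_nodes_per_iteration)
        schedule.extend([name] * executed)  # these node slots go to `name`
        if pending - executed > 0:
            queue.append((name, pending - executed))  # re-queue the remainder
    return schedule

# With a cap of 2, "small" runs after only 2 of "big"'s nodes:
print(run_iterations({"big": 6, "small": 1}, 2))
# With an effectively unlimited cap, "small" waits behind all 6:
print(run_iterations({"big": 6, "small": 1}, 100))
```

With the cap, the big workflow is preempted every iteration and the small one finishes early; without it, the behavior matches the hogging described above.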
Goal: What should the final outcome look like, ideally?
This could be achieved either by changing the semantics of max-parallelism, for simplicity, or by introducing a new configuration option, for separation of concerns.
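For illustration, the second option might look like the following configuration fragment. The key names below are invented for this sketch and are not actual Flyte options:

```yaml
# Hypothetical keys, shown only to illustrate the separation of concerns:
propeller:
  max-parallelism: 25               # existing: async node starts per iteration
  max-executions-per-iteration: 10  # proposed: cap on nodes executed per
                                    # iteration, including synchronous ones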
Describe alternatives you've considered
We have considered making some of the node executions asynchronous, but there are node executions that cannot be asynchronous or for which asynchronous execution does not make sense.
Propose: Link/Inline OR Additional context
No response
Are you sure this issue hasn't been raised already?
Yes
Have you read the Code of Conduct?
Yes
Hello 👋, This issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will close the issue if we detect no activity in the next 7 days. Thank you for your contribution and understanding! 🙏
Hello 👋, This issue has been inactive for over 9 months and hasn't received any updates since it was marked as stale. We'll be closing this issue for now, but if you believe this issue is still relevant, please feel free to reopen it. Thank you for your contribution and understanding! 🙏
Hello 👋, this issue has been inactive for over 9 months. To help maintain a clean and focused backlog, we'll be marking this issue as stale and will engage on it to decide if it is still applicable.
Thank you for your contribution and understanding! 🙏