PaSh currently does not do any rebalancing of outputs between stages of the pipeline with the same width. This can lead to pathological scenarios. For example, suppose the input of a program `cat IN | cmd1 | cmd2` is a single line, `cmd1` and `cmd2` are both stateless, and `cmd1` produces many lines that `cmd2` could then process in parallel: PaSh will get no parallelism in this case.
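A concrete instance of this pathology, with stand-in commands (`tr` playing the line-multiplying `cmd1`, `grep` playing `cmd2`; the input contents are illustrative):

```shell
# IN holds a single line; tr expands it into many lines, each of which
# grep could in principle handle in parallel. Because the input was only
# one line, all of tr's output ends up on a single parallel line of the
# parallelized graph, so the grep stage runs sequentially.
printf 'a,b,c,d,e\n' > IN
cat IN | tr ',' '\n' | grep -v '^$'
```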
To help identify such pathological scenarios, it would be great to add a flag that inserts logging nodes at points in the dataflow that print how many lines and bytes pass through them.
The steps to get this done would be to:

1. Implement a command that simply forwards its input to its output (without buffering, unlike `dgsh-tee`), while also measuring the number of bytes and lines and printing them at the end.
2. Add this command after each stage of the dataflow graph, to get the load on each parallel line of the graph.
3. (Optional) Create a simple post-processing tool that presents the output in a nice way (relative loads, or even a plot; see for example the `--graphviz` option in current PaSh).
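A minimal sketch of step 1 as a shell function (the name `log_tee` and the report format are assumptions; the `awk`-based body is a line-oriented approximation, and a production version would be byte-accurate and unbuffered, likely written in C like `dgsh-tee`):

```shell
# log_tee [TAG]: forward stdin to stdout unchanged, then report the
# number of lines and bytes that passed through on stderr at EOF.
log_tee() {
    awk -v tag="${1:-log-tee}" '
        { bytes += length($0) + 1   # +1 for the newline awk stripped
          lines += 1
          print }                   # forward the line downstream
        END { printf "%s: %d lines, %d bytes\n", tag, lines, bytes > "/dev/stderr" }
    '
}
```

It could then be spliced between stages, e.g. `cat IN | cmd1 | log_tee stage1 | cmd2 | log_tee stage2`, and the stderr reports compared across the parallel lines of the graph.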