Data store avoid graph re-walk #5660
Conversation
Force-pushed f10aa7b to 65cb9fc.
Coverage should improve with an n=2 window test.
Force-pushed 99a86e8 to f110f63.
Force-pushed 714f758 to 58dbf6e.
I've been having a play with this branch with the n=1 window in … using the following diff to track which children/parents are being walked over:

diff --git a/cylc/flow/data_store_mgr.py b/cylc/flow/data_store_mgr.py
index b6a2960fb..e78cd767c 100644
--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -732,6 +732,7 @@ class DataStoreMgr:
None
"""
+ print(f'# increment_graph_window({source_tokens.relative_id})')
# common refrences
active_id = source_tokens.id
diff --git a/cylc/flow/taskdef.py b/cylc/flow/taskdef.py
index a4ae681e2..d2517383b 100644
--- a/cylc/flow/taskdef.py
+++ b/cylc/flow/taskdef.py
@@ -36,6 +36,7 @@ if TYPE_CHECKING:
def generate_graph_children(tdef, point):
"""Determine graph children of this task at point."""
+ print(f'# generate_graph_children({point}/{tdef.name})')
graph_children = {}
for seq, dout in tdef.graph_children.items():
for output, downs in dout.items():
@@ -78,6 +79,7 @@ def generate_graph_parents(tdef, point, taskdefs):
Infer parents be reversing upstream triggers that lead to point/task.
"""
+ print(f'# generate_graph_parents({point}/{tdef.name})')
graph_parents = {}
for seq, triggers in tdef.graph_parents.items():
if not seq.is_valid(point):

Using the graph from your example above with the default n=1 window:

flow.cylc:
[scheduler]
cycle point format = CCYY
[scheduling]
runahead limit = P1
initial cycle point = 2020
[[graph]]
P1Y = """
a => b => c => d => e => f => g
b => h => i => f
g[-P1Y] => a
"""
[runtime]
[[root]]
script = sleep 2
[[b, c, d, e, f, i]]
[[a]]
[[g,h]]

It looks like the algorithm is looking one edge further in front (e.g. here's a snippet from …
E.g. in the above output, when the n=0 window was … When the children of …
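For context, the n-window is the set of tasks within n graph edges of an active task. A minimal sketch of the concept (hypothetical and simplified, not the cylc implementation) as a breadth-first walk over an undirected adjacency mapping:

```python
from collections import deque

def n_window(graph, active, n):
    """Return all nodes within n edges of any node in the active set.

    graph: dict mapping node -> set of neighbours (children and parents
    combined, since the window extends in both directions).
    """
    window = set(active)
    frontier = deque((node, 0) for node in active)
    while frontier:
        node, dist = frontier.popleft()
        if dist == n:
            continue  # window boundary reached along this path
        for neighbour in graph.get(node, ()):
            if neighbour not in window:
                window.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return window

# The start of the a => b => c => d chain from the example workflow:
graph = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'}, 'd': {'c'}}
print(sorted(n_window(graph, {'b'}, 1)))  # → ['a', 'b', 'c']
```

With n=2 from active task `b` the walk also reaches `d`, which matches the "one edge further" behaviour being inspected above.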
Moving on to the n=2 window (and dumping out the n-window after each call) with this diff:

diff --git a/cylc/flow/data_store_mgr.py b/cylc/flow/data_store_mgr.py
index b6a2960fb..3a21bbbea 100644
--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -473,7 +473,7 @@ class DataStoreMgr:
self.updated_state_families = set()
# Update workflow state totals once more post delta application.
self.state_update_follow_on = False
- self.n_edge_distance = 1
+ self.n_edge_distance = 2
self.next_n_edge_distance = None
self.latest_state_tasks = {
state: deque(maxlen=LATEST_STATE_TASKS_QUEUE_SIZE)
@@ -732,6 +732,7 @@ class DataStoreMgr:
None
"""
+ print(f'# increment_graph_window({source_tokens.relative_id})')
# common refrences
active_id = source_tokens.id
@@ -1018,6 +1019,10 @@ class DataStoreMgr:
self.prune_trigger_nodes[active_id])
del self.prune_trigger_nodes[active_id]
+ task_proxies = self.data[self.workflow_id][TASK_PROXIES]
+ for tp_id in sorted(list(task_proxies) + list(self.added[TASK_PROXIES])):
+ print(f'$ {tp_id}')
+
def generate_edge(
self,
parent_tokens: Tokens,
diff --git a/cylc/flow/taskdef.py b/cylc/flow/taskdef.py
index a4ae681e2..d2517383b 100644
--- a/cylc/flow/taskdef.py
+++ b/cylc/flow/taskdef.py
@@ -36,6 +36,7 @@ if TYPE_CHECKING:
def generate_graph_children(tdef, point):
"""Determine graph children of this task at point."""
+ print(f'# generate_graph_children({point}/{tdef.name})')
graph_children = {}
for seq, dout in tdef.graph_children.items():
for output, downs in dout.items():
@@ -78,6 +79,7 @@ def generate_graph_parents(tdef, point, taskdefs):
Infer parents be reversing upstream triggers that lead to point/task.
"""
+ print(f'# generate_graph_parents({point}/{tdef.name})')
graph_parents = {}
for seq, triggers in tdef.graph_parents.items():
if not seq.is_valid(point):

Here's the log:
Looks good: gives the right result, with some possible savings to be made by avoiding duplicate …
There are duplicate … I spotted a [very] small optimisation: moving that return a little higher up in the code:

--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -1108,11 +1108,15 @@ class DataStoreMgr:
Orphan tasks with no children return (True, None) respectively.
"""
+ tp_id = tokens.id
+ task_proxies = self.data[self.workflow_id][TASK_PROXIES]
+ if tp_id in task_proxies or tp_id in self.added[TASK_PROXIES]:
+ # ghost task already created
+ return
+
name = tokens['task']
point_string = tokens['cycle']
t_id = self.definition_id(name)
- tp_id = tokens.id
- task_proxies = self.data[self.workflow_id][TASK_PROXIES]
is_orphan = False
if name not in self.schd.config.taskdefs:
@@ -1120,8 +1124,6 @@ class DataStoreMgr:
if itask is None:
itask = self.schd.pool.get_task(point_string, name)
- if tp_id in task_proxies or tp_id in self.added[TASK_PROXIES]:
- return
if itask is None:
itask = TaskProxy(
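The diff above is an instance of the generic check-the-cache-before-doing-any-work pattern: the membership test is moved ahead of the lookup/construction work so already-seen nodes cost almost nothing. A hypothetical sketch of the same idea (names are illustrative, not the cylc API):

```python
def generate_task_proxy(tp_id, store, make_proxy):
    """Create a task proxy unless one already exists in the store.

    Checking membership first skips the comparatively expensive
    construction work entirely for nodes seen on a previous walk.
    """
    if tp_id in store:  # proxy already created on an earlier walk
        return store[tp_id]
    proxy = make_proxy(tp_id)  # expensive construction happens once
    store[tp_id] = proxy
    return proxy

calls = []
store = {}
make = lambda tp_id: calls.append(tp_id) or {'id': tp_id}
generate_task_proxy('2020/a', store, make)
generate_task_proxy('2020/a', store, make)  # second call is a cheap no-op
print(len(calls))  # → 1
```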
Tried out the new graphDepth field; the numbers came out as expected, so this should close #5609, which will allow us to unlock the higher n-window views in the GUI! Is the graph window that the data store is currently configured to track available in the schema? That would be handy for the UI.
diff --git a/cylc/flow/data_store_mgr.py b/cylc/flow/data_store_mgr.py
index 03c77eb94..a085f2f3f 100644
--- a/cylc/flow/data_store_mgr.py
+++ b/cylc/flow/data_store_mgr.py
@@ -754,6 +754,7 @@ class DataStoreMgr:
# walk filling, that may not have been the entire walk).
# If walk already completed, must have gone from non-active to active
# again.. So redo walk (as walk nodes may be pruned).
+ print(f'# Active ID: {active_id}')
if (
active_id not in all_walks
or active_id in self.n_window_completed_walks
@@ -899,6 +900,7 @@ class DataStoreMgr:
)
continue
+ print(f'# Generate Children/Parents of {node_tokens.relative_id}')
# Children/downstream nodes
# TODO: xtrigger is workflow_state edges too
# Reference set for workflow relations
diff --git a/cylc/flow/taskdef.py b/cylc/flow/taskdef.py
index a4ae681e2..daae255f4 100644
--- a/cylc/flow/taskdef.py
+++ b/cylc/flow/taskdef.py
@@ -15,6 +15,7 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""Task definition."""
+import inspect
from collections import deque
from typing import TYPE_CHECKING
@@ -36,6 +37,7 @@ if TYPE_CHECKING:
def generate_graph_children(tdef, point):
"""Determine graph children of this task at point."""
+ print(f'# Children of {point}/{tdef.name} - Called by {inspect.currentframe().f_back.f_code.co_name}')
graph_children = {}
for seq, dout in tdef.graph_children.items():
for output, downs in dout.items():
@@ -78,6 +80,7 @@ def generate_graph_parents(tdef, point, taskdefs):
Infer parents be reversing upstream triggers that lead to point/task.
"""
+ print(f'# parents of {point}/{tdef.name} - Called by {inspect.currentframe().f_back.f_code.co_name}')
graph_parents = {}
for seq, triggers in tdef.graph_parents.items():
if not seq.is_valid(point):
As you've spotted, I've been relying on …
I think we can add that as a workflow field (if it's not already). I've also thought of an optimization for the depth finder, so will put that up in the morning.
Force-pushed 58dbf6e to bee6937.
I changed it to reuse the …
Force-pushed 471ebc4 to 529c3ca.
There were still some duplicate …
Force-pushed 529c3ca to 76aecbc.
Done some more testing...

Latest graph patch:

diff --git a/src/components/cylc/SVGTask.vue b/src/components/cylc/SVGTask.vue
index c242a32c..d5925bb5 100644
--- a/src/components/cylc/SVGTask.vue
+++ b/src/components/cylc/SVGTask.vue
@@ -37,6 +37,10 @@ along with this program. If not, see <http://www.gnu.org/licenses/>.
queued: task.isQueued && !task.isHeld,
runahead: task.isRunahead && !(task.isHeld || task.isQueued),
}"
+ :style="`
+ --foreground: ${foreground};
+ --background: ${background};
+ `"
>
<!-- status
@@ -279,6 +283,31 @@ export default {
// otherwise the progress indicator may end up in the wrong place.
type: Number,
default: 0
+ },
+ colourMap: {
+ default: {
+ undefined: 'gray',
+ 0: 'hsl(0, 100%, 70%)',
+ 1: 'hsl(30, 100%, 70%)',
+ 2: 'hsl(60, 100%, 70%)',
+ 3: 'hsl(90, 100%, 70%)',
+ 4: 'hsl(120, 100%, 70%)',
+ 5: 'hsl(150, 100%, 70%)',
+ 6: 'hsl(180, 100%, 70%)',
+ 7: 'hsl(210, 100%, 70%)',
+ 8: 'hsl(240, 100%, 70%)',
+ 9: 'hsl(270, 100%, 70%)',
+ 10: 'hsl(300, 100%, 70%)',
+ 11: 'hsl(330, 100%, 70%)'
+ }
+ }
+ },
+ computed: {
+ foreground () {
+ return this.colourMap[this.task.graphDepth]
+ },
+ background () {
+ return 'white'
}
},
methods: {
@@ -342,19 +371,21 @@ export default {
</script>
<style lang="scss">
- $foreground: rgb(90,90,90);
- $background: rgb(255,255,255);
+ :root {
+ --foreground: rgb(90,90,90);
+ --background: rgb(255,255,255);
+ }
.c8-task {
.status {
.outline {
// NOTE: ensure the outline is filled so that it can be clicked
- fill: $background;
- stroke: $foreground;
+ fill: var(--background);
+ stroke: var(--foreground);
}
.progress {
fill: transparent;
- stroke: $foreground;
+ stroke: var(--foreground);
transform-origin: 50% 50%;
opacity: 0.4;
/* 0% progress */
@@ -397,75 +428,75 @@ export default {
}
&.preparing .status .dot {
- fill: $foreground;
+ fill: var(--foreground);
}
&.submitted .status .dot {
- fill: $foreground;
+ fill: var(--foreground);
}
&.running .status .hub {
- fill: $foreground;
+ fill: var(--foreground);
}
&.running .status .progress {
- fill: $foreground;
+ fill: var(--foreground);
}
&.succeeded .status .outline {
- fill: $foreground;
+ fill: var(--foreground);
}
&.failed .status {
.outline {
- fill: $foreground;
+ fill: var(--foreground);
}
.cross rect {
- fill: $background;
+ fill: var(--background);
}
}
&.submit-failed .status {
.outline {
- fill: $background;
+ fill: var(--background);
}
.cross rect {
- fill: $foreground;
+ fill: var(--foreground);
}
}
&.expired .status {
.outline {
- fill: $foreground;
+ fill: var(--foreground);
}
.dot {
- fill: $background;
+ fill: var(--background);
}
.expired rect {
- fill: $background;
+ fill: var(--background);
}
}
&.held .modifier {
.outline {
- stroke: $foreground;
+ stroke: var(--foreground);
}
.held rect {
- fill: $foreground;
+ fill: var(--foreground);
}
}
&.queued .modifier {
.queued rect {
- fill: $foreground;
+ fill: var(--foreground);
}
}
&.runahead .modifier {
.outline {
- stroke: $foreground;
+ stroke: var(--foreground);
}
.runahead circle {
- fill: $foreground;
+ fill: var(--foreground);
}
}
}
diff --git a/src/views/Graph.vue b/src/views/Graph.vue
index 95781fa9..7d5a62fb 100644
--- a/src/views/Graph.vue
+++ b/src/views/Graph.vue
@@ -154,6 +154,7 @@ fragment TaskProxyData on TaskProxy {
isRunahead
isQueued
name
+ graphDepth
task {
meanElapsedTime
}
@@ -654,6 +655,9 @@ export default {
for (const obj of json.objects) {
const [x, y] = obj.pos.split(',')
const bbox = nodeDimensions[obj.name]
+ if (!bbox) {
+ console.warn(`No Node: ${obj.name}`)
+ }
// translations:
// 1. The graphviz node coordinates
// 2. Centers the node on this coordinate
diff --git a/src/views/Table.vue b/src/views/Table.vue
index 316157ac..589730df 100644
--- a/src/views/Table.vue
+++ b/src/views/Table.vue
@@ -110,6 +110,7 @@ fragment TaskProxyData on TaskProxy {
isHeld
isQueued
isRunahead
+ graphDepth
task {
meanElapsedTime
}
diff --git a/src/views/Tree.vue b/src/views/Tree.vue
index f191e7c5..86b26bd5 100644
--- a/src/views/Tree.vue
+++ b/src/views/Tree.vue
@@ -135,6 +135,7 @@ fragment TaskProxyData on TaskProxy {
isHeld
isQueued
isRunahead
+ graphDepth
task {
meanElapsedTime
}

Using longest rather than shortest n-window distance

Spotted whilst testing this example in the n=10 window:

[scheduler]
allow implicit tasks = True
[scheduling]
initial cycle point = 1
cycling mode = integer
[[graph]]
P1 = """
a => b => c
b[-P1] => b
""" The task "c" should be n=2 for the first 6 cycles, but I actually got:
Note, it doesn't reproduce the same way each time?!

Reloading resets graph window

Not strictly related to the changes on this PR. Would be good to squeeze in now if easy, otherwise can bump to an issue. Reloading the workflow appears to reset the n-window. E.g. if I set the window to n=2, then reload, it goes back to n=1.

Graph errors

I've seen a couple of bits of traceback in the graph view with …, i.e. if there are sources or targets in … I've seen this when the workflow is paused, which is surprising as all deltas should have been applied (rules out race conditions in delta batching). I can't get a reproducible example at the moment so can't confirm the cause. Could be unrelated.
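For reference on the longest-vs-shortest point: a multi-source breadth-first search visits every node at its minimum edge distance first, so a node can never be recorded at a longer depth with this formulation. A hypothetical sketch (not cylc code) showing "c" landing at depth 2:

```python
from collections import deque

def graph_depths(graph, active):
    """Minimum edge distance of every reachable node from the active set.

    BFS explores nodes in order of increasing distance, so the first
    time a node is seen is guaranteed to be its shortest distance.
    """
    depths = {node: 0 for node in active}
    frontier = deque(active)
    while frontier:
        node = frontier.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in depths:  # first visit == shortest path
                depths[neighbour] = depths[node] + 1
                frontier.append(neighbour)
    return depths

# a => b => c with only "a" active: "c" must come out at depth 2
graph = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
print(graph_depths(graph, {'a'})['c'])  # → 2
```

A depth-first or per-walk-order traversal, by contrast, can stamp a node with whichever path reaches it first, which is one way the longer distance could have been recorded.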
* Sometimes tasks may appear in the n-window, then disappear only to reappear at a later time.
* This is a difficult case for `increment_graph_window` to handle.
* See cylc#5660
Force-pushed f68a63b to 7ca4035.
* Sometimes tasks may appear in the n-window, then disappear only to reappear at a later time.
* This is a difficult case for `increment_graph_window` to handle.
* See cylc#5660
Co-authored-by: Oliver Sanders <[email protected]>
* Test the added, updated and pruned deltas created by increment_graph_window as the window changes.
* Test window resizing with n=1,2,3
Force-pushed 7ca4035 to bc1a017.
Worked well in my testing
Profiling results for the following workflow in simulation mode:

[task parameters]
x = 1..100
y = 1..100
[scheduling]
cycling mode = integer
initial cycle point = 1
final cycle point = 1
[[graph]]
P1 = d[-P1] => a => b<x> => c<y> => d
Good positive change, increases functionality whilst improving performance 💯!
Thanks for all your hard work @dwsutherland, this looked like a tricky one!
Added changelog for #5660
addresses #5485, #5609
At present we've resolved an efficiency problem at n=1 window size by restricting the graph walk to only parents or only children (i.e. 'c' or 'p', or 'ccc' or 'ppp' for n=3). However, this excludes cousins, which is evident at n>1 (i.e. 'cp' or 'pc' at n=2).
This restriction was a cheap way to avoid graph re-walks. See #5475.
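To illustrate why the pure-children/pure-parents restriction misses cousins: at n=2 a cousin is only reachable by a mixed path such as 'cp' (one edge down, one edge up). A small sketch enumerating walk direction strings (the 'c'/'p' notation follows the description above; the code itself is hypothetical, not cylc's):

```python
from itertools import product

def walk_strings(n, restricted):
    """All walk direction strings of length 1..n over {'c', 'p'}.

    restricted=True keeps only the pure 'c...c'/'p...p' paths used by
    the old n=1-optimised walk; False also includes the mixed paths
    that reach cousins.
    """
    strings = set()
    for length in range(1, n + 1):
        for combo in product('cp', repeat=length):
            s = ''.join(combo)
            if not restricted or len(set(s)) == 1:  # pure paths only
                strings.add(s)
    return strings

print(sorted(walk_strings(2, True)))   # → ['c', 'cc', 'p', 'pp']
print(sorted(walk_strings(2, False)))  # 'cp' and 'pc' now included
```

The restricted set at n=2 never contains 'cp' or 'pc', which is exactly the cousin-exclusion described above.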
Here the problem is actually fixed at any window size, by changing the graph walk algorithm from a recursive approach to a walk location approach. From the method's description:
Window resizing will now happen as soon as it's set (not as the graph progresses).

N-window edge depth of nodes is now found/updated with the new field `graph_depth`.

A good example workflow: … Gives: …
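The walk-location idea, caching each node's completed walk rather than recursing afresh from every active task, can be sketched roughly as follows (a hypothetical simplification, not the actual `increment_graph_window` implementation):

```python
from collections import deque

def window_with_cache(graph, active, n, walk_cache):
    """n-window via BFS, reusing per-node walks from earlier calls.

    walk_cache maps node -> the set of nodes within n edges of it.
    A node already walked (e.g. by a previously active task) is not
    re-walked; its cached result is unioned in directly.
    """
    window = set()
    for node in active:
        if node not in walk_cache:
            seen = {node}
            frontier = deque([(node, 0)])
            while frontier:
                here, dist = frontier.popleft()
                if dist == n:
                    continue
                for nxt in graph.get(here, ()):
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append((nxt, dist + 1))
            walk_cache[node] = seen  # remember the completed walk
        window |= walk_cache[node]
    return window

graph = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}
cache = {}
window_with_cache(graph, {'a'}, 1, cache)
window_with_cache(graph, {'a', 'b'}, 1, cache)  # 'a' walk is reused
print(sorted(cache))  # → ['a', 'b']
```

The real implementation also has to prune cached walks as tasks leave the window, which is where much of the complexity in this PR lives.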
Performance
Taken from #5435
$ cylc play <id> --profile --no-detach
Run this for a few mins; when you get bored, ctrl+c it, then:
This PR is even faster than current (#5475) at `n=1`. At `n=2` there's a bit more load, but clearly not squared or cubed.
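The `--profile` run writes cProfile data that can be ranked with the stdlib `pstats` module. The sketch below profiles in-process for demonstration; for a real run you would load the file your run produced instead (the exact path depends on your setup), e.g. `pstats.Stats('<run-dir>/profile.prof')`:

```python
import cProfile
import io
import pstats

# Generate a small profile in-process so the example is self-contained.
profiler = cProfile.Profile()
profiler.enable()
sum(i * i for i in range(10000))  # stand-in for the scheduler workload
profiler.disable()

# Rank the hottest functions by cumulative time.
out = io.StringIO()
stats = pstats.Stats(profiler, stream=out)
stats.sort_stats('cumulative').print_stats(5)  # top 5 entries
report = out.getvalue()
print(report[:200])
```

Sorting by `'cumulative'` surfaces the high-level callers (useful for spotting graph re-walks); `'tottime'` surfaces the low-level hot loops.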
Check List

* I have read CONTRIBUTING.md and added my name as a Code Contributor.
* Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
* CHANGES.md entry included if this is a change that can affect users