Optional outputs extension #6046

Merged
merged 15 commits into from
Apr 29, 2024

Conversation

@oliver-sanders oliver-sanders commented Apr 3, 2024

Summary

  • Implement optional output extension proposal.
  • Closes optional outputs: implement new proposal #5640
  • Closes clock-expire: prevent active tasks from being clock-expired #6025
  • Add a user-configurable completion expression.
    • Automatically define a default when not specified.
  • Change completion rules for the expire output.
    • This allows expired tasks to be retained in the pool as "incomplete".
  • Expose the completion expression to the data store
    • Add the completion expression to the protobuf and GraphQL schemas.
    • Make the completion expression visible to the data store.
    • Display the completion status in "cylc show".

Note

As the result of recent terminology wrangling, the term "incomplete task" has now been dropped. I have not gone through the cylc-flow codebase to strip these references out (that's a separate task), but I have tried to avoid introducing any new references on this PR.

Approach

The approach taken on this PR is to replace the current logic for evaluating task output completion with a single expression that returns True/False.
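
As a rough illustration of the idea (not the code from this PR), a completion expression can be evaluated against the set of outputs a task has completed. The function name `is_complete` is hypothetical, and the sketch assumes expressions contain only output names, `and`, `or` and brackets:

```python
import ast


def is_complete(expression: str, completed: set) -> bool:
    """Return True if `expression` is satisfied by the `completed` outputs.

    Hypothetical helper: only names, `and`, `or` and brackets are supported.
    """
    tree = ast.parse(expression, mode="eval")

    def visit(node: ast.AST) -> bool:
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.BoolOp):
            results = [visit(value) for value in node.values]
            if isinstance(node.op, ast.And):
                return all(results)
            return any(results)
        if isinstance(node, ast.Name):
            # An output name is "true" once the task has completed it.
            return node.id in completed
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    return visit(tree)


print(is_complete("succeeded or failed", {"failed"}))
print(is_complete("succeeded and (x or y or z)", {"succeeded"}))
```

Walking the syntax tree rather than calling `eval()` directly keeps the permitted syntax explicit, which is one possible design choice among several.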

Implicit (i.e. derived) Expressions

This expression is automatically determined from the use of optional outputs in the graph.

Take the following workflow for example.

[scheduler]
    allow implicit tasks = True

[scheduling]
    [[graph]]
        R1 = """
            a:succeeded? => foo
            a:failed? => bar
            foo | bar => baz

            b:expired? => e
            b:succeeded? => s
            b:failed? => f
            e | s | f => pub
        """

The derived completion expressions are exposed in the config, so we can inspect the derived completion expressions with cylc config:

$ cylc config .
[scheduler]
    allow implicit tasks = True
[scheduling]
    cycling mode = integer
    initial cycle point = 1
    final cycle point = 1
    [[graph]]
        R1 = """
            a:succeeded? => foo
            a:failed? => bar
            foo | bar => baz
            
            b:expired? => e
            b:succeeded? => s
            b:failed? => f
            e | s | f => pub
        """
[runtime]
    [[root]]
    [[e]]
        completion = succeeded
    [[b]]
        completion = succeeded or failed or expired
    [[s]]
        completion = succeeded
    [[foo]]
        completion = succeeded
    [[a]]
        completion = succeeded or failed
    [[bar]]
        completion = succeeded
    [[baz]]
        completion = succeeded
    [[pub]]
        completion = succeeded
    [[f]]
        completion = succeeded

By default tasks must succeed. If failure, expiry or submission failure are permitted, these are added onto the end of the expression with an or operator.
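
The rule above can be sketched as follows. This is a simplified illustration under stated assumptions, not the derivation logic from this PR: the function name `default_completion` is hypothetical, the name `submit_failed` for the submission failure output is assumed, and custom outputs are ignored entirely.

```python
def default_completion(optional: set) -> str:
    """Build an implicit completion expression (hypothetical helper).

    `optional` holds the standard outputs marked optional ("?") in the graph.
    """
    # By default the task must succeed.
    terms = ["succeeded"]
    # Permitted alternatives are appended with "or" (output names assumed).
    for output in ("failed", "expired", "submit_failed"):
        if output in optional:
            terms.append(output)
    return " or ".join(terms)


print(default_completion({"succeeded", "failed"}))
print(default_completion({"succeeded", "failed", "expired"}))
```

For tasks `a` and `b` in the example workflow, this yields the same strings as the `cylc config` output shown above.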

The derived expressions should all match current behaviour, with one exception: the handling of the expired state. Expired tasks may now end up with incomplete outputs, and there may be awkward edge cases for tasks with no required outputs, which may not have been handled properly before.

Explicit (i.e. user-provided) expressions

The completion expression can be overridden for a task. If a completion expression is provided, then it must be logically consistent with the use of optional outputs in the graph.

E.g. here the outputs a:x, a:y and a:z are marked as required in the graph:

[scheduler]
    allow implicit tasks = True

[scheduling]
    [[graph]]
        R1 = """
            a:x => x
            a:y => y
            a:z => z
        """

[runtime]
    [[a]]
        completion = succeeded and (x or y or z)
        [[[outputs]]]
            x = xxx
            y = yyy
            z = zzz

However, they are optional in the completion expression: because the expression can be satisfied without a:x, we can deduce that a:x is optional (and likewise for a:y and a:z). This results in an error:

$ cylc val .
WorkflowConfigError: a:y is required in the graph, but optional in the completion expression
succeeded and (x or y or z)

To fix the error, change either the completion expression or the graph to match, e.g.:

diff --git a/flow.cylc b/flow.cylc
index 12a7f70..5b91853 100644
--- a/flow.cylc
+++ b/flow.cylc
@@ -4,9 +4,9 @@
 [scheduling]
     [[graph]]
         R1 = """
-            a:x => x
-            a:y => y
-            a:z => z
+            a:x? => x
+            a:y? => y
+            a:z? => z
         """
 
 [runtime]
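
The consistency rule described in this section (an output is optional if the completion expression can be satisfied without it) can be sketched with a brute-force check. This is an illustration only, not the validation code from this PR; `is_optional` is a hypothetical name, and the sketch assumes expressions use only output names, `and`, `or` and brackets:

```python
import ast
from itertools import product


def is_optional(expression: str, output: str) -> bool:
    """Return True if `expression` can be satisfied without `output`."""
    tree = ast.parse(expression, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    others = sorted(names - {output})
    code = compile(tree, "<completion>", "eval")
    # Try every combination of the other outputs with `output` held False.
    for values in product([False, True], repeat=len(others)):
        env = dict(zip(others, values), **{output: False})
        # Assumption: the expression has already been checked for unsafe
        # syntax; eval() on unvalidated user input is not appropriate.
        if eval(code, {"__builtins__": {}}, env):
            return True
    return False


expression = "succeeded and (x or y or z)"
for name in ("succeeded", "x", "y", "z"):
    print(name, "optional" if is_optional(expression, name) else "required")
```

Here `x`, `y` and `z` each come out as optional in the expression, which is why marking them as required in the graph is reported as inconsistent.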

Compatibility Mode

Completion expressions are also used in compatibility mode; however, the logic remains as previously implemented:

[scheduling]
    [[dependencies]]
        [[[R1]]]
            graph = foo

$ cylc config .
2024-04-03T14:00:52+01:00 WARNING - Backward compatibility mode ON
[scheduling]
    cycling mode = integer
    initial cycle point = 1
    final cycle point = 1
    [[graph]]
        R1 = foo
[runtime]
    [[root]]
    [[foo]]
        completion = (submitted and succeeded)

Visibility

Derived completion expressions are added to the config to make them visible via cylc config.

Completion expressions have also been added to the Protobuf and GraphQL schemas so they can be queried from the CLI, GUI and Tui.

I've built a simple text-based reporting thingy to help show users why a task's outputs are or aren't complete. Here's how it looks in the logs:

Screenshot from 2024-04-03 14-05-33

And here's how it looks in cylc-show:

Screenshot from 2024-04-03 14-07-46

(note, switched from +/- to ✓/x as some folks have been confused by - for some reason)

So the completion expression status is visible in the workflow logs (for final tasks with incomplete outputs), on the CLI (via cylc show) and in the Tui (via the show context option). This will become exposed in the GUI via the "Metadata View".

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at docs: submit-failed must be optional cylc-doc#712
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@oliver-sanders oliver-sanders added this to the cylc-8.3.0 milestone Apr 3, 2024
@oliver-sanders oliver-sanders self-assigned this Apr 3, 2024
@hjoliver
hjoliver commented Apr 3, 2024

The derived completion expressions are exposed in the config, so we can inspect the derived completion expressions with cylc config

That is very cool 🎖️

(note, switched from +/- to ✓/x as some folks have been confused by - for some reason)

Good call.

@hjoliver hjoliver left a comment

Completed a first pass through the non-test code - it looks really good.

@hjoliver
hjoliver commented Apr 4, 2024

$ cylc show ...
output completion: incomplete
  x |  succeeded

The output completion formatting here could easily be misinterpreted as saying the incomplete condition is "output x OR succeeded", especially as x is a super-common trigger name.

Maybe change to "vector cross product" and "dotted fence" - see side PR oliver-sanders#68

image
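
To make the suggestion concrete, here is a hypothetical sketch of a per-output report that uses the two Unicode characters named above. It is not the formatter from this PR or from the side PR; the function name `completion_report` and the exact layout are assumptions.

```python
import re

CROSS = "\u2a2f"  # VECTOR OR CROSS PRODUCT
TICK = "\u2713"   # CHECK MARK
FENCE = "\u2999"  # DOTTED FENCE


def completion_report(expression: str, completed: set) -> str:
    """List each output named in `expression` with a completed/not mark."""
    names = [
        name
        # dict.fromkeys de-duplicates while keeping first-seen order.
        for name in dict.fromkeys(re.findall(r"[A-Za-z_]\w*", expression))
        if name not in ("and", "or")
    ]
    return "\n".join(
        f"  {TICK if name in completed else CROSS} {FENCE} {name}"
        for name in names
    )


print(completion_report("succeeded and (x or y)", {"x"}))
```

Because neither marker is a plain `x` or `|`, a line for an output called `x` can no longer be read as part of an "x OR ..." condition.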

@hjoliver
hjoliver commented Apr 5, 2024

Had a good play with this today, nice.

@oliver-sanders

Tests now mostly there.

@oliver-sanders oliver-sanders force-pushed the optional-outputs-extension branch 2 times, most recently from abf2708 to 6420012 Compare April 12, 2024 13:44
@oliver-sanders
oliver-sanders commented Apr 12, 2024

Tests are all passing locally, but these two are kicking up a fuss on CI for as yet undetermined reasons:

  • tests/f/spawn-on-demand/19-submitted-compat.t
  • tests/f/optional-outputs/01-stall-on-incomplete.t

[edit] Still a mystery, but I got it fixed. It's something to do with newlines in greps: ensuring that multiline patterns end in a newline seems to make it work.

@oliver-sanders oliver-sanders marked this pull request as ready for review April 12, 2024 14:34
@MetRonnie MetRonnie added the config change Involves a change to global or workflow config label Apr 15, 2024
@oliver-sanders

@MetRonnie Don't we only use this label when we change a config rather than add one?

I'm not sure what the point of the "config change" label is? Why do we have it?

@MetRonnie

I'm not sure what the point of the "config change" label is? Why do we have it?

I think originally to keep track of all the config changes from 7 to 8, I see no reason not to continue using it

@MetRonnie
MetRonnie commented Apr 16, 2024

Example discussed in person:

[scheduling]
    [[graph]]
        R1 = """
            a? => b
            a:x => x
        """
[runtime]
    [[a]]
        completion = x and failed
        [[[outputs]]]
            x = x

validates without an error. But if you add a:fail? to the graph it gives

WorkflowConfigError: a:failed is optional in the graph (? symbol), but required in the completion expression:
x and failed

@hjoliver
hjoliver commented Apr 18, 2024

(I'll do a final(?) review and test of this Friday). Fail. Thwarted by a long all-staff meeting and urgent request to make sure cross-workflow triggering works for the Cylc 8 migration here #5809

@oliver-sanders
oliver-sanders commented Apr 18, 2024

Ronnie's example isn't being caught at present because a:fail? is implicit in the graph (from a?). Will add logic to handle this.

I'll try to get Ronnie's example to fail validation and address the review comments. We're running another training session at the moment, so it might take a couple of days...

@MetRonnie MetRonnie left a comment

Recent changes look good.

How easy would it be to tolerate present-tense versions of the task state outputs (success, fail etc.)?

@oliver-sanders

How easy would it be to tolerate present-tense versions of the task state outputs (success, fail etc.)?

In completion expressions?

Possible, but we would need to run regex substitutions on each completion expression as we load it in the config, this would be sufficient:

s/\bsuccess\b/succeeded/g
s/\bfail\b/failed/g

Does make more sense to use the past tense in the completion expression though. Could potentially use validation to provide a more helpful error message for these cases?
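
For reference, the two substitutions above could look like this in Python. This is only a sketch of the idea being discussed; the names `ALIASES` and `canonicalise` are hypothetical and nothing here is taken from cylc-flow:

```python
import re

# Present-tense alias -> canonical past-tense output name.
ALIASES = {"success": "succeeded", "fail": "failed"}

# \b ensures whole-word matches, so "failed" itself is left alone.
ALIAS_PATTERN = re.compile(r"\b(" + "|".join(ALIASES) + r")\b")


def canonicalise(expression: str) -> str:
    """Replace present-tense aliases with their canonical names."""
    return ALIAS_PATTERN.sub(lambda match: ALIASES[match.group(1)], expression)


print(canonicalise("success or (fail and my_error)"))
```

The same pattern could instead be used at validation time to detect an alias and suggest the canonical name in the error message.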

@MetRonnie

Seeing as we support the present tense elsewhere and it is always shorter to type, I see value in supporting it here

@oliver-sanders
oliver-sanders commented Apr 23, 2024

I don't think that we should support the alternative names here: this muddies the water and makes expressions harder to validate and/or more expensive to evaluate. The expression will be reported (in the logs, cylc show, etc.) using the canonical names, which would make supporting these alternate forms confusing. I'll make the validation failures a little clearer (suggesting the past-tense canonical names).

@MetRonnie MetRonnie left a comment

I think it's well explained in the reference docs now. Can't think of any more functional examples to test out

Comment on lines +1131 to +1133
``succeeded or (failed and my_error)``
The task can fail, but only if it also yields the custom
output ``my_error``.
Member

The output could be either my_error or my-error, couldn't it?

Member Author

Yes.

@hjoliver hjoliver left a comment

  • been through the code again, looks great
  • tested the hell out of it, works great

Nice work 👍

@hjoliver

Actually, after a bit more testing:

This fails validation with WorkflowConfigError: Undefined custom output: foo:x 👍 :

[scheduling]
    [[graph]]
        R1 = "foo:x => bar"
[runtime]
    [[foo]]
    [[bar]]

But this validates and runs without stalling, even though foo:x is declared as required, but isn't defined or completed 👎 :

[scheduling]
    [[graph]]
        R1 = "foo:x"
[runtime]
    [[foo]]
    [[bar]]

If I define the output (but don't generate it at run time), then it validates and stalls 👍

@hjoliver

Sorry Oliver, I came back again post-approval and found a couple of very minor issues. This is as good as done though, brilliant!

To get the coverage up a few percent, see also a side PR with some additional integration tests, for a handful of missed error conditions:

oliver-sanders#69

@hjoliver

To get this show on the road, I deconflicted the branch and pushed several small commits for my trivial and uncontroversial review suggestions. Two tweaks to the CI environment were needed too. See Element chat. Will merge if tests pass.

@hjoliver hjoliver merged commit 086e859 into cylc:master Apr 29, 2024
37 checks passed
@oliver-sanders oliver-sanders deleted the optional-outputs-extension branch April 29, 2024 10:00