Make cylc remove flow-aware and extend to historical tasks #6370

Open
wants to merge 16 commits into
base: master
from


MetRonnie
Member

@MetRonnie MetRonnie commented Sep 11, 2024

Partially addresses #5643

Summary

This mostly implements the "Cylc Remove Extension" proposal.

Flow numbers

cylc remove now has a --flow option for removing a task from specific flows.

If not used, it will remove the task from all flows that it belongs to.

If an active or waiting task is removed from only a subset of the flows that it belongs to, it will remain in the task pool; if it is removed from all flows that it belongs to, it will be removed from the task pool (as is the current behaviour).

If a task is removed from all flows that it belongs to, it will become a no-flow task (flow=None).
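The flow rules above can be pictured with plain set arithmetic. This is only a minimal sketch, not the code in this PR: the function name `remove_from_flows` and its return shape are hypothetical, and an empty set stands in for a no-flow task (flow=None).

```python
from typing import Optional, Set, Tuple


def remove_from_flows(
    task_flows: Set[int], flows: Optional[Set[int]] = None
) -> Tuple[Set[int], bool]:
    """Return (remaining flow numbers, whether the task stays in the pool).

    Hypothetical helper for illustration only.
    flows=None models omitting --flow: remove from every flow.
    """
    to_remove = task_flows if flows is None else task_flows & flows
    remaining = task_flows - to_remove
    # A task with no flows left becomes a no-flow task and leaves the pool.
    stays_in_pool = bool(remaining)
    return remaining, stays_in_pool


subset = remove_from_flows({1, 2}, {1})  # removed from flow 1 only
everything = remove_from_flows({1, 2})   # no --flow given
```

Here `subset` keeps flow 2 and the task stays in the pool, while `everything` leaves an empty set and the task drops out of the pool.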

For ease of reviewing, you can use my UI branch that displays flow numbers: https://github.com/MetRonnie/cylc-ui/tree/flow-nums [1].

Historical tasks

cylc remove can now remove tasks that are no longer active, making it look like they never ran. It does this by removing the task from the specified flows in the database (in the task_states and task_outputs tables) [2], and un-setting any prerequisites of active tasks that the removed task had naturally satisfied [3]. If a task is removed from all flows that it belongs to, a no-flow task is left in the DB for provenance.

The above also applies to active/waiting tasks that cylc remove is used on.
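To illustrate the database side, including the UPDATE OR REPLACE case from footnote 2, here is a rough sketch against an in-memory SQLite database. The table layout is a simplified assumption (flow numbers stored as a JSON list in a `flow_nums` column that is part of the primary key), and `remove_flows` is a hypothetical helper, not the function added in this PR.

```python
import json
import sqlite3


def remove_flows(conn, name, cycle, flows_to_remove):
    """Remove flow numbers from every DB row of one task (hypothetical helper)."""
    rows = conn.execute(
        "SELECT flow_nums FROM task_states WHERE name = ? AND cycle = ?",
        (name, cycle),
    ).fetchall()
    for (old,) in rows:
        remaining = sorted(set(json.loads(old)) - set(flows_to_remove))
        new = json.dumps(remaining)
        if new == old:
            continue  # this row is not in any of the removed flows
        # If a row with (name, cycle, new) already exists, OR REPLACE deletes
        # that pre-existing row and keeps the row being updated.
        conn.execute(
            "UPDATE OR REPLACE task_states SET flow_nums = ? "
            "WHERE name = ? AND cycle = ? AND flow_nums = ?",
            (new, name, cycle, old),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: the real table has more columns.
conn.execute(
    "CREATE TABLE task_states ("
    "name TEXT, cycle TEXT, flow_nums TEXT, status TEXT, "
    "PRIMARY KEY (name, cycle, flow_nums))"
)
conn.executemany(
    "INSERT INTO task_states VALUES (?, ?, ?, ?)",
    [
        ("foo", "1", json.dumps([1]), "failed"),        # earlier run in flow 1
        ("foo", "1", json.dumps([1, 2]), "succeeded"),  # later run in flows 1 and 2
    ],
)
remove_flows(conn, "foo", "1", {2})
result = conn.execute("SELECT flow_nums, status FROM task_states").fetchall()
```

Removing flow 2 turns the second row's flow numbers into `[1]`, which collides with the first row, so SQLite drops the older row and `result` holds only the later entry.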

What's left to do

  • When removing an active task from all its flows, kill the task.

  • Should probably add a functional test with the --flow option.

  • Need to check this one:

    If removing a task causes all of the prerequisites on a downstream task to be unset (i.e. if the downstream task was spawned as a result of outputs from this task alone) then the downstream task shall be removed from the pool.
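The rule quoted in the last item, combined with the prerequisite un-setting described under "Historical tasks", could look roughly like the sketch below. The data model is invented for illustration: each prerequisite of a downstream task is keyed by (upstream task, output) and records whether it was satisfied "naturally", "manually" (e.g. by cylc set --pre), or not at all (None). The name `unset_prereqs` is hypothetical too.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model of one downstream task's prerequisites.
Prereqs = Dict[Tuple[str, str], Optional[str]]


def unset_prereqs(prereqs: Prereqs, removed_task: str) -> bool:
    """Unset prerequisites naturally satisfied by removed_task.

    Returns True if the downstream task still has at least one satisfied
    prerequisite, i.e. it should stay in the pool under the quoted rule.
    """
    for key in list(prereqs):
        task, _output = key
        # Manually satisfied prerequisites are deliberately left alone.
        if task == removed_task and prereqs[key] == "naturally":
            prereqs[key] = None
    return any(how is not None for how in prereqs.values())


only_a: Prereqs = {("a", "succeeded"): "naturally"}
a_and_b: Prereqs = {
    ("a", "succeeded"): "naturally",
    ("b", "succeeded"): "naturally",
}
forced: Prereqs = {("a", "succeeded"): "manually"}

keep_only_a = unset_prereqs(only_a, "a")
keep_a_and_b = unset_prereqs(a_and_b, "a")
keep_forced = unset_prereqs(forced, "a")
```

In this sketch, a downstream task spawned by `a` alone would be dropped, while one that also depends on `b`, or whose prerequisite was set manually, would stay.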

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • No dependency changes
  • Tests are included
  • Changelog entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

Footnotes

  1. Waiting tasks that are not yet in the pool have greyed out flow numbers at the moment.

  2. If removing flows would result in two rows in the DB no longer being unique, the SQLite UPDATE OR REPLACE statement is used, so the first entry will be removed and the most recent entry will remain.

  3. Prerequisites manually satisfied by cylc set --pre are not affected by cylc remove.

@MetRonnie MetRonnie added this to the 8.4.0 milestone Sep 11, 2024
@MetRonnie MetRonnie self-assigned this Sep 11, 2024
@MetRonnie MetRonnie marked this pull request as draft September 11, 2024 15:42
@MetRonnie MetRonnie changed the title from "cylc remove: make flow aware and extend to historical tasks" to "Make cylc remove flow-aware and extend to historical tasks" Sep 12, 2024
oliver-sanders
oliver-sanders previously approved these changes Sep 12, 2024
@oliver-sanders oliver-sanders dismissed their stale review September 12, 2024 14:45

Wrong PR, sorry.

@MetRonnie MetRonnie marked this pull request as ready for review September 13, 2024 16:43
Member

@oliver-sanders oliver-sanders left a comment


Code looks good!

tests/conftest.py (outdated, resolved)
tests/integration/conftest.py (outdated, resolved)
cylc/flow/workflow_db_mgr.py (resolved)
cylc/flow/task_pool.py (outdated, resolved)
cylc/flow/task_pool.py (outdated, resolved)
@MetRonnie MetRonnie marked this pull request as draft September 25, 2024 16:23
@MetRonnie MetRonnie marked this pull request as ready for review September 27, 2024 16:28
@oliver-sanders

This comment was marked as resolved.

@MetRonnie MetRonnie marked this pull request as draft October 2, 2024 14:56
@MetRonnie
Member Author

(Test failure is just flaky tui test, not bothering re-running a 3rd time 🤮)

@MetRonnie MetRonnie marked this pull request as ready for review October 10, 2024 16:13
@oliver-sanders
Member

oliver-sanders commented Oct 11, 2024

Have been trying this out for sub-graph re-run use cases. The remove functionality is all working correctly 👍, I am able to re-run sub-graphs cleanly without using new flows 🚀.

I have encountered some hitches (not related to this PR, all for consideration in follow-on work):

  1. I've been hitting this issue a lot (pool: check prereqs on task spawn #6143) with any dependencies which span more than one "rank" in the graph. In these situations, the GUI and DB say one thing but the task pool says another, which is highly confusing. Restarting (or possibly reloading?) the workflow should fix it, but we need a solution to this before remove really becomes a viable way of re-running sub-graphs. Note this is a very similar limitation to the need to satisfy off-flow prereqs when re-running via new flows.
  2. Removing a task doesn't necessarily change anything in the GUI (i.e. there is no visual feedback). This was anticipated when the proposal was written and is why visual filtering cylc-ui#470 is on the Cylc roadmap. I wonder whether a global change to the appearance of n=0 tasks (e.g. making them translucent as originally proposed) would provide an easier workaround and also help with other things?
  3. Removed tasks retain their state in the GUI. For example, if a task succeeded and was removed, it continues to be shown with a filled circle. It makes sense to preserve the job state; however, I wonder if we should reset the task state back to waiting when we remove the task to make it clearer?

@oliver-sanders
Member

oliver-sanders commented Oct 17, 2024

What's the status on these TODO items from the OP:

What's left to do

When removing an active task from all its flows, kill the task.

Should probably add a functional test with the --flow option.

Need to check this one:

@MetRonnie
Member Author

MetRonnie commented Oct 17, 2024

Will either come in a follow-up PR or this one depending on how soon @hjoliver reviews this


2 participants