Kill tasks during job prep. #5749

Closed
wants to merge 3 commits into from

@hjoliver hjoliver commented Sep 30, 2023

Close #5746

This PR allows cylc kill to abort jobs that are in the job preparation pipeline, resulting in the submit-failed state.

To avoid dealing with the complexities of partial job preparation, I've taken the following approach:

  • the kill command sets a flag in preparing task proxies
  • the job submission process aborts at the last minute, after preparation has completed
  • (so retriggering the task will result in a new submit number).
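
The approach above can be sketched roughly as follows. This is a minimal illustration, not the cylc-flow implementation: the `TaskProxy` class, the state constants, and the `kill`, `retrigger`, and `finish_prep_and_submit` functions are hypothetical stand-ins, and the point at which the submit number is incremented is an assumption. Only the flag name `killed_in_job_prep` comes from the diff in this PR.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration only; not the cylc-flow classes.
PREPARING = "preparing"
SUBMITTED = "submitted"
SUBMIT_FAILED = "submit-failed"


@dataclass
class TaskProxy:
    name: str
    state: str = PREPARING
    submit_num: int = 0
    killed_in_job_prep: bool = False


def kill(task: TaskProxy) -> None:
    """Flag a preparing task instead of interrupting its preparation."""
    if task.state == PREPARING:
        task.killed_in_job_prep = True


def retrigger(task: TaskProxy) -> None:
    """Send the task back through job preparation."""
    task.state = PREPARING


def finish_prep_and_submit(task: TaskProxy) -> None:
    """Run once preparation has completed; abort here if a kill was requested."""
    task.submit_num += 1  # assumption: preparation consumes a submit number
    if task.killed_in_job_prep:
        task.killed_in_job_prep = False  # clear the flag so a retrigger can submit
        task.state = SUBMIT_FAILED
        return
    task.state = SUBMITTED


task = TaskProxy("foo")
kill(task)                    # arrives while the task is still preparing
finish_prep_and_submit(task)
print(task.state, task.submit_num)  # submit-failed 1

retrigger(task)
finish_prep_and_submit(task)
print(task.state, task.submit_num)  # submitted 2
```

Because the abort happens only after preparation has finished, no partially prepared job needs to be unwound, at the cost of using up one submit number.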

Tested by deliberately extending job preparation like this (and killing tasks while they were still in the preparing state):

platform = $(sleep 10; echo localhost)

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@hjoliver hjoliver self-assigned this Oct 3, 2023
@oliver-sanders oliver-sanders added this to the cylc-8.4.0 milestone Feb 19, 2024
MetRonnie commented Nov 12, 2024

After discussing removing a preparing task with Oliver (for #6472), I think this will probably be even more important, as the task might end up submitting and running outside of the task pool if it is not killed. Not sure, I haven't tested it yet.

@MetRonnie

I am happy to take this over if needed, but it is more likely to end up in 8.4.1 than in 8.4.0.

hjoliver commented Dec 1, 2024

Thanks @MetRonnie - yes this one slipped off my radar screen.

IMO this is less important to get into 8.4.0 than trigger-when-paused. It's kind of a bug fix, but only for fairly niche circumstances.

@MetRonnie MetRonnie added the "bug?" (Not sure if this is a bug or not) label Dec 2, 2024
MetRonnie commented Dec 2, 2024

Assigning myself for now. If you get round to re-tackling before 2025, then unassign me

@MetRonnie MetRonnie assigned MetRonnie and unassigned hjoliver Dec 2, 2024
@MetRonnie MetRonnie modified the milestones: 8.4.0, 8.4.1 Dec 2, 2024
@@ -483,6 +487,16 @@ def submit_task_jobs(self, workflow, itasks, curve_auth,
)
continue

if itask.killed_in_job_prep:

itask is possibly unbound
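
A "possibly unbound" warning of this kind typically arises when a loop variable is read after the loop that binds it, which fails if the loop body never ran. The sketch below illustrates that general Python behaviour only; it assumes the check sits outside the loop that binds `itask`, which this excerpt of the diff does not confirm, and the function names and dict-based tasks are hypothetical.

```python
def flag_after_loop(tasks):
    """Read the loop variable after the loop: unsafe when tasks is empty."""
    for task in tasks:
        pass  # e.g. per-task error handling followed by `continue`
    # 'task' is only bound if the loop ran at least once.
    return task["killed"]


def flags_inside_loop(tasks):
    """Check each task inside the loop, so the variable is always bound."""
    killed = []
    for task in tasks:
        if task["killed"]:
            killed.append(task["name"])
    return killed


try:
    flag_after_loop([])
except UnboundLocalError as exc:
    print("unsafe variant failed:", type(exc).__name__)

print(flags_inside_loop([{"name": "a", "killed": True},
                         {"name": "b", "killed": False}]))  # ['a']
```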

Comment on lines +491 to +497
itask.waiting_on_job_prep = False
itask.killed_in_job_prep = False
itask.state_reset(TASK_STATUS_SUBMIT_FAILED)
self.data_store_mgr.delta_task_state(itask)
itask.local_job_file_path = None  # reset for retry
self._prep_submit_task_job_error(
    workflow, itask, '(killed in job prep)', '')

Another problem is that this does not get run for tasks that are removed with cylc remove while in the preparing state, as they are removed from the pool and never reach the submission stage.
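
The gap described here can be illustrated with a toy model. This is only a sketch under stated assumptions: the dict-based `pool`, `remove`, and `submission_stage` are hypothetical and do not reflect the real task pool or submission code; they only show that a cleanup step run during a pass over the pool never sees a task that has already been dropped from it.

```python
def remove(pool, name):
    """Drop a task from the pool, as a remove command might (hypothetical)."""
    pool.pop(name, None)


def submission_stage(pool):
    """Handle kill flags for tasks still in the pool; return the names handled."""
    handled = []
    for name, flags in pool.items():
        if flags["killed_in_job_prep"]:
            flags["killed_in_job_prep"] = False
            handled.append(name)
    return handled


# Hypothetical pool: task name -> flags.
pool = {
    "a": {"killed_in_job_prep": True},
    "b": {"killed_in_job_prep": True},
}
remove(pool, "b")
print(submission_stage(pool))  # ['a']: the handling for 'b' never runs
```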

@MetRonnie MetRonnie mentioned this pull request Dec 30, 2024
8 tasks
@MetRonnie MetRonnie assigned hjoliver and unassigned MetRonnie Dec 30, 2024
hjoliver commented Jan 5, 2025

Superseded by #6535

@hjoliver hjoliver closed this Jan 5, 2025
@oliver-sanders oliver-sanders removed this from the 8.4.1 milestone Jan 8, 2025
Labels
bug? (Not sure if this is a bug or not), superseded
Development

Successfully merging this pull request may close these issues.

Kill should abort preparing tasks
3 participants