feat: cancel workflow#127 #139

Closed
wants to merge 4 commits into from
8 changes: 6 additions & 2 deletions docs/further.md
@@ -31,8 +31,6 @@ Usually, it is advisable to persist such settings via a
[configuration profile](https://snakemake.readthedocs.io/en/latest/executing/cli.html#profiles), which
can be provided system-wide, per user, and in addition per workflow.

The executor waits per default 40 seconds for its first check of the job status. Using `--slurm-init-seconds-before-status-checks=<time in seconds>` this behaviour can be altered.

## Ordinary SMP jobs

Most jobs will be carried out by programs that are either single-core
@@ -181,6 +179,12 @@ rule myrule:

Again, rather use a [profile](https://snakemake.readthedocs.io/en/latest/executing/cli.html#profiles) to specify such resources.

## Configuring Run Time Behaviour

By default, the executor waits 40 seconds before its first check of the job status. This behaviour can be altered with `--slurm-init-seconds-before-status-checks=<time in seconds>`.

Snakemake will abort local runs upon failure. With the `--keep-going` flag, Snakemake will proceed to submit independent jobs if a job fails. This plugin offers an additional flag, `--slurm-cancel-workflow-upon-failure`, which cancels the entire workflow when a job fails; this might be helpful during development. Note that, as usual, you can use `--rerun-incomplete`/`--ri` to proceed after a failed workflow once the parameters have been fixed.

## Software Recommendations

### Conda, Mamba
14 changes: 14 additions & 0 deletions snakemake_executor_plugin_slurm/__init__.py
@@ -42,6 +42,18 @@ class ExecutorSettings(ExecutorSettingsBase):
"required": False,
},
)
    cancel_workflow_upon_failure: bool = field(
        default=False,
        metadata={
            "help": """
                Negates the `--keep-going` flag in Snakemake.
                If set to True, the entire workflow will be canceled
                upon recognition of a failed job.
                """,
            "env_var": False,
            "required": False,
        },
    )


# Required:
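
The hunk above adds a boolean settings field, and the documentation hunk names the matching flag `--slurm-cancel-workflow-upon-failure`. The following is a minimal, self-contained sketch of that relationship. It assumes the flag is formed from the plugin prefix plus the field name with underscores turned into dashes; `ExecutorSettingsSketch` and `flag_name` are hypothetical stand-ins for illustration and are not part of the plugin or of the Snakemake plugin interface.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ExecutorSettingsSketch:
    # Hypothetical stand-in for the plugin's ExecutorSettings class;
    # it does not inherit from the real ExecutorSettingsBase.
    cancel_workflow_upon_failure: bool = field(
        default=False,
        metadata={
            "help": "Cancel the entire workflow once a failed job is seen.",
            "env_var": False,
            "required": False,
        },
    )


def flag_name(field_name: str, prefix: str = "slurm") -> str:
    """Hypothetical helper: build a CLI flag from a settings field name.

    Assumes the pattern '--<prefix>-<field-name-with-dashes>'.
    """
    return f"--{prefix}-{field_name.replace('_', '-')}"


if __name__ == "__main__":
    for f in fields(ExecutorSettingsSketch):
        # Print each derived flag next to its default value.
        print(flag_name(f.name), "default:", f.default)
```

Because the default is `False`, a workflow that does not pass the flag would keep its previous behaviour under this sketch.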
@@ -376,6 +388,8 @@ async def check_active_jobs(
                    )
                    self.report_job_error(j, msg=msg, aux_logs=[j.aux["slurm_logfile"]])
                    active_jobs_seen_by_sacct.remove(j.external_jobid)
                    if self.settings.cancel_workflow_upon_failure:
                        self.cancel_jobs(active_jobs)
                else:  # still running?
                    yield j
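
To make the effect of the two added lines easier to follow, here is a simplified sketch of a status-check pass with and without the new setting. It is not the plugin's implementation: `Job`, `FAIL_STATES`, `check_jobs` and the status dictionary are hypothetical, the real method is an async generator that queries SLURM, and here cancellation is only recorded as a list of job ids instead of being sent to the scheduler.

```python
from dataclasses import dataclass

# Illustrative subset of failure states; the plugin keeps its own list.
FAIL_STATES = {"FAILED", "TIMEOUT", "NODE_FAIL"}


@dataclass(frozen=True)
class Job:
    external_jobid: str


def check_jobs(active_jobs, status_of, cancel_on_failure):
    """Return (still_running, failed, cancelled) lists of job ids.

    status_of maps a job id to a status string (hypothetical input
    replacing the scheduler query).
    """
    still_running, failed, cancelled = [], [], []
    for job in active_jobs:
        status = status_of[job.external_jobid]
        if status == "COMPLETED":
            continue  # finished jobs need no further tracking
        if status in FAIL_STATES:
            failed.append(job.external_jobid)
            if cancel_on_failure:
                # Cancel every other job that has not completed or failed.
                cancelled = [
                    other.external_jobid
                    for other in active_jobs
                    if other is not job
                    and status_of[other.external_jobid] not in FAIL_STATES
                    and status_of[other.external_jobid] != "COMPLETED"
                ]
                return [], failed, cancelled
        else:
            still_running.append(job.external_jobid)
    return still_running, failed, cancelled


if __name__ == "__main__":
    jobs = [Job("1"), Job("2"), Job("3")]
    states = {"1": "RUNNING", "2": "FAILED", "3": "RUNNING"}
    print(check_jobs(jobs, states, cancel_on_failure=False))
    print(check_jobs(jobs, states, cancel_on_failure=True))
```

One design difference worth noting: the sketch stops at the first failure and excludes the failed job from the cancel list, whereas the diff passes the whole `active_jobs` list to `self.cancel_jobs` and keeps iterating.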
