fix: #97 preventing node confinement #98

Merged: 4 commits, Jun 4, 2024
9 changes: 5 additions & 4 deletions docs/further.md
@@ -2,7 +2,7 @@

## The general Idea

-To use this plugin, log in to your cluster's head node (sometimes called the "login" node), activate your environment as usual and start Snakemake. Snakemake will then submit your jobs as cluster jobs.
+To use this plugin, log in to your cluster's head node (sometimes called the "login" node), activate your environment as usual, and start Snakemake. Snakemake will then submit your jobs as cluster jobs.

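As a minimal sketch (the environment name and job limit are illustrative, not taken from this PR), a typical session looks like:

```console
$ conda activate snakemake        # or however you normally activate your environment
$ snakemake --executor slurm --jobs 100
```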
## Specifying Account and Partition

@@ -86,6 +86,8 @@ other systems, e.g. by replacing `srun` with `mpiexec`:
$ snakemake --set-resources calc_pi:mpi="mpiexec" ...
```

+To submit "ordinary" MPI jobs, specifying `tasks` (the number of MPI ranks) is sufficient. Alternatively, on some clusters, it might be convenient to just configure `nodes`. For hybrid applications (those that combine MPI ranks with multithreading), consider using a combination of `tasks` and `cpus_per_task`. A detailed topology layout can be achieved with the `slurm_extra` parameter (see below), e.g. using additional flags such as `--distribution`.

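As a sketch of such a specification (the rule name, program, and resource values are illustrative, not part of this PR), a hybrid rule might combine `tasks` and `cpus_per_task` like this:

```python
rule calc_pi:
    output:
        "pi.calc",
    resources:
        mpi="srun",          # or e.g. "mpiexec", see above
        tasks=8,             # number of MPI ranks
        cpus_per_task=4,     # threads per rank (the hybrid part)
    shell:
        "{resources.mpi} -n {resources.tasks} calc-pi-mpi > {output}"
```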
## Running Jobs locally

Not all Snakemake workflows are adapted for heterogeneous environments, particularly clusters. Users might want to avoid the submission of _all_ rules as cluster jobs. Rules that run outside the cluster should usually be _short_ jobs, e.g. internet downloads or plotting rules.
@@ -158,8 +160,7 @@ set-resources:
## Additional Custom Job Configuration

SLURM installations can support custom plugins, which may add support
-for additional flags to `sbatch`. In addition, there are various
-`sbatch` options not directly supported via the resource definitions
+for additional flags to `sbatch`. In addition, there are various batch options not directly supported via the resource definitions
shown above. You may use the `slurm_extra` resource to specify
additional flags to `sbatch`:

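The concrete example that follows here in the full documentation is collapsed in this diff view; a minimal sketch (rule name and flags are illustrative, and the exact quoting may depend on your shell and plugin version) could look like:

```python
rule some_task:
    output:
        "results/out.txt",
    resources:
        slurm_extra="--qos=long --mail-type=FAIL",
    shell:
        "touch {output}"
```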
@@ -210,7 +211,7 @@ shared-fs-usage:
local-storage-prefix: "<your node local storage prefix>"
```

-It will set the executor to be this SLURM executor, ensure sufficient file system latency and allow automatic stage-in of files using the [file system storage plugin](https://github.com/snakemake/snakemake-storage-plugin-fs).
+It will set the executor to be this SLURM executor, ensure sufficient file system latency, and allow automatic stage-in of files using the [file system storage plugin](https://github.com/snakemake/snakemake-storage-plugin-fs).

Note that you need to set the `SNAKEMAKE_PROFILE` environment variable in your `~/.bashrc` file, e.g.:

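The snippet that follows in the full documentation is collapsed here; a plausible sketch (the profile name `slurm` is an assumption, use the name or path of your own profile):

```console
# in ~/.bashrc; assumes a profile directory such as ~/.config/snakemake/slurm
export SNAKEMAKE_PROFILE=slurm
```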
17 changes: 13 additions & 4 deletions snakemake_executor_plugin_slurm/__init__.py
@@ -123,14 +123,23 @@ def run_job(self, job: JobExecutorInterface):
"- submitting without. This might or might not work on your cluster."
)

-        # MPI job
-        if job.resources.get("mpi", False):
-            if job.resources.get("nodes", False):
-                call += f" --nodes={job.resources.get('nodes', 1)}"
+        if job.resources.get("nodes", False):
+            call += f" --nodes={job.resources.get('nodes', 1)}"
+
         # fixes #40 - set ntasks regardless of mpi, because
         # SLURM v22.05 will require it for all jobs
         call += f" --ntasks={job.resources.get('tasks', 1)}"
+        # MPI job
+        if job.resources.get("mpi", False):
+            if not job.resources.get("tasks_per_node") and not job.resources.get(
+                "nodes"
+            ):
+                self.logger.warning(
+                    "MPI job detected, but no 'tasks_per_node' or 'nodes' "
+                    "specified. Assuming 'tasks_per_node=1'. "
+                    "Probably not what you want."
+                )

         call += f" --cpus-per-task={get_cpus_per_task(job)}"

         if job.resources.get("slurm_extra"):
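For illustration of the changed logic above (the values are hypothetical, not taken from this PR): a non-MPI rule requesting `nodes=2` and `tasks=4` would now contribute submission flags along these lines, assuming the default of one CPU per task:

```console
sbatch ... --nodes=2 --ntasks=4 --cpus-per-task=1 ...
```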