Use Apptainer with HTCondor by default and point to SPRAS v0.2.0
Two issues solved with this commit -- first, Apptainer seems to be the way to go
when working with HTCondor, both in and out of the OSPool. Instead of having
instructions that say "if you encounter problem XXX, use Apptainer", this just
tells the user to build the Apptainer image in the first place.

Secondly, Neha encountered an issue while testing HTCondor compatibility where
the container's Snakefile version was incompatible with the version of the
Snakefile being transferred from the AP to the EP. This resulted in a confusing
error message and wouldn't be straightforward to recognize for most people, so
I decided I'd urge the users to build their own containers in the first place.
If they do this, then they can be sure there are no compatibility issues.
jhiemstrawisc committed Sep 3, 2024
1 parent 4955b53 commit 2655d1b
Showing 2 changed files with 42 additions and 36 deletions.
63 changes: 34 additions & 29 deletions docker-wrappers/SPRAS/README.md
@@ -1,19 +1,19 @@
# SPRAS Docker image

## Building
## Building Images

A Docker image for SPRAS that is available on [DockerHub](https://hub.docker.com/repository/docker/reedcompbio/spras)
This image comes bundled with all of the necessary software packages to run SPRAS, and can be used for execution in distributed environments (like HTCondor).

To create the Docker image, make sure you are in this repository's root directory, and from your terminal run:
To create the Docker image locally, make sure you are in this repository's root directory, and from your terminal run:

```bash
docker build -t <project name>/<image name>:<tag name> -f docker-wrappers/SPRAS/Dockerfile .
```

For example, to build this image with the intent of pushing it to DockerHub as reedcompbio/spras:v0.1.0, you'd run:
For example, to build this image with the intent of pushing it to DockerHub as reedcompbio/spras:v0.2.0, you'd run:
```bash
docker build -t reedcompbio/spras:v0.1.0 -f docker-wrappers/SPRAS/Dockerfile .
docker build -t reedcompbio/spras:v0.2.0 -f docker-wrappers/SPRAS/Dockerfile .
```

This will copy the entire SPRAS repository into the container and install SPRAS with `pip`. As such, any changes you've made to the current SPRAS repository will be reflected in the version of SPRAS installed in the container. Since SPRAS
@@ -38,27 +38,47 @@ Or to temporarily override your system's default during the build, prepend your
DOCKER_DEFAULT_PLATFORM=linux/amd64
```

For example, to build reedcompbio/spras:v0.1.0 on Apple Silicon as a linux/amd64 container, you'd run:
For example, to build reedcompbio/spras:v0.2.0 on Apple Silicon as a linux/amd64 container, you'd run:
```
DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t reedcompbio/spras:v0.1.0 -f docker-wrappers/SPRAS/Dockerfile .
DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t reedcompbio/spras:v0.2.0 -f docker-wrappers/SPRAS/Dockerfile .
```

## Testing
### Converting Docker Images to Apptainer/Singularity Images

The folder `docker-wrappers/SPRAS` also contains several files that can be used to test this container on HTCondor. To test the `spras` container
In some cases you may need to create an Apptainer image for SPRAS, especially if you intend to run your workflow on distributed systems like HTCondor. Apptainer (formerly known as Singularity) uses image files with the `.sif` extension. Assuming you have Apptainer installed, you can create your own `.sif` image from an already-built Docker image published on Docker Hub with the following command:
```bash
apptainer build <new image name>.sif docker://<name of container on DockerHub>
```

For example, creating an Apptainer image for the `v0.2.0` SPRAS image might look like:
```bash
apptainer build spras-v0.2.0.sif docker://reedcompbio/spras:v0.2.0
```

After running this command, a new file called `spras-v0.2.0.sif` will exist in the directory where the command was run.
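
The naming convention used in the example above can be sketched as a pair of small shell helpers. The function names `sif_name_for` and `build_sif` are hypothetical and not part of SPRAS, and the sketch assumes the Docker Hub reference has the form `<project>/<image>:<tag>`.

```bash
# Hypothetical helpers (not part of SPRAS).

# Derive a .sif file name from a Docker Hub reference, e.g.
# reedcompbio/spras:v0.2.0 becomes spras-v0.2.0.sif
sif_name_for() {
    local ref="$1"
    local name="${ref##*/}"   # drop the "<project>/" prefix
    # Replace ":" with "-" so the tag becomes part of the file name
    printf '%s.sif\n' "$(printf '%s' "$name" | tr ':' '-')"
}

# Build the Apptainer image from the Docker Hub reference.
build_sif() {
    local ref="$1"
    apptainer build "$(sif_name_for "$ref")" "docker://${ref}"
}
```

With these defined, `build_sif reedcompbio/spras:v0.2.0` would run the same `apptainer build` command as the example above. Keeping the tag in the file name makes it easier to tell which SPRAS version a given image contains.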

## Working with HTCondor

The folder `docker-wrappers/SPRAS` also contains several files that can be used to run workflows with this container on HTCondor. To use the `spras` image
in this environment, first log in to an HTCondor Access Point (AP). Then, from the AP, clone this repo:

```bash
git clone https://github.com/Reed-CompBio/spras.git
```

**Note:** To work with SPRAS in HTCondor, it is recommended that you build an Apptainer image instead of using Docker. See [Converting Docker Images to Apptainer/Singularity Images](#converting-docker-images-to-apptainersingularity-images) for instructions. Importantly, the Apptainer image must be built for the linux/amd64 architecture. Most HTCondor APs will have `apptainer` installed, but they may not have `docker`. If this is the case, you can build the image with Docker on your local machine, push the image to Docker Hub, and then convert it to Apptainer's `sif` format on the AP.

There are currently two options for running SPRAS with HTCondor. The first is to submit all SPRAS jobs to a single remote Execution Point (EP). The second
is to use the Snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP.

### Submitting All Jobs to a Single EP

Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. Then run `condor_submit spras.sub`, which will submit SPRAS
to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS
Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory (`mkdir logs`). Next, modify `spras.sub` so that it uses the SPRAS Apptainer image you created:
```
container_image = < your spras image >.sif
```
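
If you would rather script the `container_image` change than edit the file by hand, something like the following could work. This is only a sketch: `set_container_image` is a hypothetical helper, not part of SPRAS, and it assumes the image name contains no `|`, `&`, or backslash characters.

```bash
# Hypothetical helper (not part of SPRAS): print a copy of a submit file
# whose active container_image line points at the given image.
# Commented-out lines ("# container_image = ...") are left untouched
# because the pattern is anchored to the start of the line.
set_container_image() {
    local submit_file="$1"
    local image="$2"
    sed "s|^container_image[[:space:]]*=.*|container_image = ${image}|" "$submit_file"
}
```

For example, `set_container_image spras.sub spras-v0.2.0.sif > spras.sub.new && mv spras.sub.new spras.sub` would rewrite the submit file. Writing to a new file first avoids relying on `sed -i`, whose syntax differs between platforms.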

Then run `condor_submit spras.sub`, which will submit SPRAS to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS
configuration file. Note that you can alter the configuration file to test various workflows, but you should leave `unpack_singularity = true`, or it
is likely the job will be unsuccessful. By default, the `example_config.yaml` runs everything except for `cytoscape`, which appears to fail periodically
in HTCondor.
@@ -68,18 +88,13 @@ CHTC pool, omit the `+WantGlideIn` and `requirements` lines

### Submitting Parallel Jobs

Parallelizing SPRAS workflows with HTCondor requires two additional pieces of setup. First, it requires an activated SPRAS conda environment with a `pip install`-ed version of the SPRAS module (see the main `README.md` for detailed instructions on pip installation of SPRAS).
Parallelizing SPRAS workflows with HTCondor requires the same setup as the previous section, but with two additions. First, it requires an activated SPRAS conda environment with a `pip install`-ed version of the SPRAS module (see the main `README.md` for detailed instructions on pip installation of SPRAS).

Second, it requires an experimental executor for HTCondor that has been forked from the upstream [HTCondor Snakemake executor](https://github.com/htcondor/snakemake-executor-plugin-htcondor).

To get install this executor in the spras conda environment, clone the forked repository using the following:
```bash
git clone https://github.com/htcondor/snakemake-executor-plugin-htcondor.git
```

Then, from your activated `spras` conda environment (important), run:
After activating your `spras` conda environment and `pip`-installing SPRAS, you can install the HTCondor Snakemake executor with the following:
```bash
pip install snakemake-executor-plugin-htcondor/
pip install git+https://github.com/htcondor/snakemake-executor-plugin-htcondor.git
```

Currently, this executor requires that all input to the workflow is scoped to the current working directory. Therefore, you'll need to copy the
@@ -89,8 +104,7 @@ cp ../../Snakefile . && \
cp -r ../../input .
```

It's also necessary for this workflow to create an Apptainer image from the published Docker image. See [Creating an Apptainer image for SPRAS](#creating-an-apptainer-image-for-spras)
for instructions.
**Note:** It is best practice to make sure that the Snakefile you copy for your workflow is the same version as the Snakefile baked into your workflow's container image. When this workflow runs, the Snakefile you just copied will be used during remote execution instead of the Snakefile from the container. As a result, difficult-to-diagnose versioning issues may occur if the version of SPRAS in the remote container doesn't support the Snakefile on your current branch. The safest bet is always to create your own image so you always know what's inside of it.
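
One way to check for a Snakefile mismatch before submitting is to compare your copied Snakefile against a copy extracted from the image. The helper below is a hypothetical sketch, not part of SPRAS. It tests for byte-for-byte equality, which is stricter than "same version" but simple to reason about.

```bash
# Hypothetical helper (not part of SPRAS): compare the Snakefile copied
# for the workflow against a copy extracted from the container image.
check_snakefile() {
    local local_copy="$1"
    local container_copy="$2"
    if cmp -s "$local_copy" "$container_copy"; then
        echo "Snakefile matches the container's copy"
    else
        echo "Snakefile differs from the container's copy" >&2
        return 1
    fi
}
```

To get the container's copy, you could run something like `apptainer exec <your spras image>.sif cat <path to the Snakefile inside the image> > Snakefile.container`, then call `check_snakefile Snakefile Snakefile.container`. The path inside the image depends on how the image was built, so it is left as a placeholder here.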

To start the workflow with HTCondor in the CHTC pool, run:
```bash
@@ -123,15 +137,6 @@ contain useful debugging clues about what may have gone wrong.
the version of SPRAS you want to test, and push the image to your image repository. To use that container in the workflow, change the `container_image` line of
`spras.sub` to point to the new image.

## Creating an Apptainer image for SPRAS

In some cases, especially if you're encountering an error like `/srv//spras.sh: line 10: snakemake: command not found`, it may be necessary to convert
the SPRAS image to a `.sif` container image before running someplace like the OSPool. To do this, run:
```bash
apptainer build spras.sif docker://reedcompbio/spras:v0.1.0
```
to produce the file `spras.sif`. Then, substitute this value as the `container_image` in the submit file.

## Versions:

The versions of this image match the version of the spras package within it.
15 changes: 8 additions & 7 deletions docker-wrappers/SPRAS/spras.sub
@@ -13,15 +13,16 @@ SNAKEFILE = ../../Snakefile

############################################################
# Specify that the workflow should run in the SPRAS #
# container. In the OSPool, this image is usually #
# converted automatically to an Apptainer/Singularity #
# image, which is why the example config has #
# `unpack_singularity = true`. #
# container. You can either use a docker:// URL, or point #
# directly to an Apptainer image (recommended). Note that #
# if running in the OSPool, most docker images are first #
# automatically converted to Apptainer images, but it's #
# generally recommended that you build your own image #
# first #
############################################################
universe = container
container_image = docker://reedcompbio/spras:v0.2.0
# container_image = spras.sif

container_image = <your spras image>.sif
# container_image = docker://reedcompbio/spras:v0.2.0

############################################################
# Specify names for log/stdout/stderr files generated by #