diff --git a/docker-wrappers/SPRAS/README.md b/docker-wrappers/SPRAS/README.md
index 6187a553..1aed9754 100644
--- a/docker-wrappers/SPRAS/README.md
+++ b/docker-wrappers/SPRAS/README.md
@@ -7,12 +7,12 @@ This image comes bundled with all of the necessary software packages to run SPRA
 To create the Docker image, make sure you are in this repository's root directory, and from your terminal run:
-```
+```bash
 docker build -t /: -f docker-wrappers/SPRAS/Dockerfile .
 ```
 For example, to build this image with the intent of pushing it to DockerHub as reedcompbio/spras:v0.1.0, you'd run:
-```
+```bash
 docker build -t reedcompbio/spras:v0.1.0 -f docker-wrappers/SPRAS/Dockerfile .
 ```
@@ -21,7 +21,7 @@ is being installed with `pip`, it's also possible to specify that you want devel
 spras package that receives changes without re-installation, change the `pip` installation line to:
-```
+```bash
 pip install -e .[dev]
 ```
@@ -48,16 +48,59 @@ DOCKER_DEFAULT_PLATFORM=linux/amd64 docker build -t reedcompbio/spras:v0.1.0 -f
 The folder `docker-wrappers/SPRAS` also contains several files that can be used to test this container on HTCondor. To test the `spras` container in this environment, first login to an HTCondor Access Point (AP). Then, from the AP clone this repo:
-```
+```bash
 git clone https://github.com/Reed-CompBio/spras.git
 ```
-When you're ready to run SPRAS as an HTCondor workflow, navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. Then run
-`condor_submit spras.sub`, which will submit SPRAS to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using
-the value of `EXAMPLE_CONFIG` as the SPRAS configuration file. Note that you can alter the configuration file to test various workflows, but you should leave
-`unpack_singularity = true`, or it is likely the job will be unsuccessful. By default, the `example_config.yaml` runs everything except for `cytoscape`, which
-appears to fail periodically in HTCondor.
+There are currently two options for running SPRAS with HTCondor. The first is to submit all SPRAS jobs to a single remote Execution Point (EP). The second
+is to use the Snakemake HTCondor executor to parallelize the workflow by submitting each job to its own EP.
+
+### Submitting All Jobs to a Single EP
+
+Navigate to the `spras/docker-wrappers/SPRAS` directory and create the `logs/` directory. Then run `condor_submit spras.sub`, which will submit SPRAS
+to HTCondor as a single job with as many cores as indicated by the `NUM_PROCS` line in `spras.sub`, using the value of `EXAMPLE_CONFIG` as the SPRAS
+configuration file. Note that you can alter the configuration file to test various workflows, but you should leave `unpack_singularity = true`, or it
+is likely the job will be unsuccessful. By default, the `example_config.yaml` runs everything except for `cytoscape`, which appears to fail periodically
+in HTCondor.
+**Note**: The `spras.sub` submit file is an example of how this workflow could be submitted from a CHTC Access Point (AP) to the OSPool. To run in the local
+CHTC pool, omit the `+WantGlideIn` and `requirements` lines.
+
+### Submitting Parallel Jobs
+
+Parallelizing SPRAS workflows with HTCondor currently requires an experimental executor for HTCondor that has been forked from the upstream [HTCondor Snakemake executor](https://github.com/jhiemstrawisc/snakemake-executor-plugin-htcondor/tree/spras-feature-dev).
+To get this executor, clone the forked repository using the following:
+```bash
+git clone -b spras-feature-dev https://github.com/jhiemstrawisc/snakemake-executor-plugin-htcondor.git
+```
+
+Then, from your activated `spras` conda environment (important), run:
+```bash
+pip install snakemake-executor-plugin-htcondor/
+```
+
+Currently, this executor requires that all input to the workflow is scoped to the current working directory. Therefore, you'll need to copy the
+Snakefile and your input directory (as specified by `example_config.yaml`) to this directory:
+```bash
+cp ../../Snakefile . && \
+cp -r ../../input .
+```
+
+It's also necessary for this workflow to create an Apptainer image from the published Docker image. See [Creating an Apptainer image for SPRAS](#creating-an-apptainer-image-for-spras)
+for instructions.
+
+To start the workflow with HTCondor, run:
+```bash
+snakemake --profile spras_profile
+```
+
+Resource requirements can be adjusted as needed in `spras_profile/config.yaml`, and HTCondor logs for this workflow can be found in `.snakemake/htcondor`.
+You can set a different log directory by adding `htcondor-jobdir: /path/to/dir` to the profile's configuration.
+
+**Note**: This workflow requires that the terminal session responsible for running snakemake stays active. Closing the terminal will suspend jobs,
+but the workflow can use Snakemake's checkpointing to pick up any jobs where they left off.
+
+### Job Monitoring
 To monitor the state of the job, you can run `condor_q` for a snapshot of how the job is doing, or you can run `condor_watch_q` if you want realtime updates. Upon completion, the `output` directory from the workflow should be returned as `spras/docker-wrappers/SPRAS/output`, along with several files containing the workflow's logging information (anything that matches `logs/spras_*` and ending in `.out`, `.err`, or `.log`). If the job was unsuccessful, these files should
@@ -67,9 +110,11 @@ contain useful debugging clues about what may have gone wrong.
 the version of SPRAS you want to test, and push the image to your image repository. To use that container in the workflow, change the `container_image` line of `spras.sub` to point to the new image.
-**Note**: In some cases, especially if you're encountering an error like `/srv//spras.sh: line 10: snakemake: command not found`, it may be necessary to convert
+## Creating an Apptainer image for SPRAS
+
+In some cases, especially if you're encountering an error like `/srv//spras.sh: line 10: snakemake: command not found`, it may be necessary to convert
 the SPRAS image to a `.sif` container image before running someplace like the OSPool. To do this, run:
-```
+```bash
 apptainer build spras.sif docker://reedcompbio/spras:v0.1.0
 ```
 to produce the file `spras.sif`. Then, substitute this value as the `container_image` in the submit file.
diff --git a/docker-wrappers/SPRAS/spras_profile/config.yaml b/docker-wrappers/SPRAS/spras_profile/config.yaml
new file mode 100644
index 00000000..3d72043f
--- /dev/null
+++ b/docker-wrappers/SPRAS/spras_profile/config.yaml
@@ -0,0 +1,11 @@
+jobs: 30
+executor: htcondor
+configfile: example_config.yaml
+shared-fs-usage: none
+default-resources:
+  job_wrapper: "spras.sh"
+  # If running in CHTC, this only works with apptainer images
+  container_image: "spras.sif"
+  universe: "container"
+  request_disk: "16GB"
+  request_memory: "8GB"
diff --git a/environment.yml b/environment.yml
index bcbb69c0..47ae801d 100644
--- a/environment.yml
+++ b/environment.yml
@@ -3,7 +3,7 @@ channels:
   - conda-forge
 dependencies:
   - adjusttext=0.7.3.1
-  - bioconda::snakemake-minimal=8.11.6
+  - bioconda::snakemake-minimal=8.16.0
   - docker-py=5.0
   - matplotlib=3.6
   - networkx=2.8
@@ -27,3 +27,6 @@ dependencies:
   - pip:
     - graphspace_python==1.3.1
     - sphinx-rtd-theme==2.0.0
+    # This installs the current directory as an editable module
+    # Needed if running spras from another location
+    - -e .
diff --git a/pyproject.toml b/pyproject.toml
index d19a5988..def29b28 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -20,7 +20,7 @@ requires-python = ">=3.11"
 dependencies = [
     "adjusttext==0.7.3",
     # A bug was introduced in older versions of snakemake that prevent it from running. Update to fix
-    "snakemake==8.11.6",
+    "snakemake==8.16.0",
     "docker==5.0.3", # Switched from docker-py to docker because docker-py is not maintained in pypi. This appears to have no effect
     "matplotlib==3.6",
     "networkx==2.8",
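
Reviewer sketch (not part of this patch): the parallel workflow added to the README expects every input to be scoped to the working directory, so a quick preflight check before `snakemake --profile spras_profile` may save a failed submission. The helper below is hypothetical. Its name, `check_spras_inputs`, and the exact file list are assumptions drawn from the README steps and from `spras_profile/config.yaml`, and it assumes that `example_config.yaml`, `spras.sh`, and `spras.sif` are resolved relative to the directory snakemake is launched from.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of SPRAS): list the files that the
# parallel HTCondor workflow appears to need in the submit directory.
check_spras_inputs() {
    local dir="${1:-.}"   # directory to inspect; defaults to the current one
    local missing=0
    local path
    # Assumed file list: Snakefile and input/ come from the README copy step;
    # the rest are referenced by spras_profile/config.yaml.
    for path in Snakefile input example_config.yaml spras.sh spras.sif spras_profile/config.yaml; do
        if [ ! -e "$dir/$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    # Returns 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

One possible use, from `spras/docker-wrappers/SPRAS`, is `check_spras_inputs . && snakemake --profile spras_profile`, so the workflow only starts when nothing is reported missing.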