PM_WGS_pipeline
cd /xdisk/bhurwitz/mig2020/rsgrps/bhurwitz/kai/planet-microbe-functional-annotation/
Modify the cluster.yml and config.yml files, and add out and err directories.
In run_snakemake.sh, change pm_env to snakemake and change the cd path to my version of the repo.
Copied bowtie.simg from Matt's version to my singularity dir.
The bowtie index folder is in /xdisk/bhurwitz/mig2020/rsgrps/bhurwitz/planet-microbe-functional-annotation/data. Copy it into my version of the git repo so that the whole thing is portable.
bash/run_start_lookup_server.sh sets the wall time for the interpro server. It was only at 12, which is why things were failing. Matt will make some fixes, then I'll reclone the repo and start it again, making sure that jobs are finishing correctly. I can also use elgato windfall to potentially get lots of nodes.
Submit the main snakemake job, which will submit other jobs. Need to make sure this isn't submitting too many.
sh submit_snakemake.sh
// New version submission. Can change JOBNAME for batches.
sbatch run_snakemake.sh
//submission for old single-threaded version
//old sbatch submit_snakemake.sh
//submission for old multi-threaded version
squeue -u kblumberg
scancel job_ID_number
scancel -u kblumberg
//cancel all my jobs
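To check whether the main snakemake job is submitting too many jobs, a minimal sketch like the one below could tally queued jobs by name. The helper name count_jobs_by_name is hypothetical (not part of the repo); it relies only on squeue's standard -h (no header) and -o "%j" (job name) options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally job names read from stdin, most frequent first.
count_jobs_by_name() {
    sort | uniq -c | sort -rn
}

# On the HPC, feed it the job names from squeue (skipped where squeue is absent).
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" -h -o "%j" | count_jobs_by_name
fi
```

Each output line is a count followed by a job name, so a sudden jump in one name is easy to spot before deciding whether to scancel.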
va
// shows allocation remaining on HPC.
uquota
shows quota for group
du -sh kai/
size of directory
scp [email protected]:/xdisk/bhurwitz/mig2020/rsgrps/bhurwitz/kai/planet-microbe-functional-annotation/bash/check_qc.sh .
#SBATCH --account=bhurwitz
#SBATCH --partition=standard
#SBATCH --partition=windfall
In results/completed/:
rm -r */step_05_chunk_reads/
rm -r */step_06_get_orfs/
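Because these rm commands act on every sample at once, it may be worth listing what would be deleted first. The following is a sketch only: clean_chunk_steps and its --dry-run argument are hypothetical names, and it assumes the results/completed/<sample>/step_XX layout used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove the step 5 and step 6 output directories for
# every sample under a results directory.
# Usage: clean_chunk_steps <results_dir> [--dry-run]
clean_chunk_steps() {
    local base mode dir
    base="$1"
    mode="${2:-}"
    for dir in "$base"/*/step_05_chunk_reads "$base"/*/step_06_get_orfs; do
        [ -d "$dir" ] || continue            # skip globs that matched nothing
        if [ "$mode" = "--dry-run" ]; then
            echo "would remove $dir"         # list only, delete nothing
        else
            rm -r -- "$dir"
        fi
    done
}
```

For example, run clean_chunk_steps results/completed --dry-run to review the list, then repeat without --dry-run to delete.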
iinit
Standard command to get iRODS started. It doesn't work on my head node but does in an interactive session.
Command to copy files over from cyverse to my data directory.
iget -PT /iplant/home/shared/planetmicrobe/sra/SRR4831663.fastq.gz /xdisk/bhurwitz/mig2020/rsgrps/bhurwitz/kai/planet-microbe-functional-annotation/data
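For copying more than one sample, the same iget call could be wrapped in a loop. This is a sketch under assumptions: fetch_samples, the RUN switch, and the samples.txt list (one accession per line) are hypothetical, and it assumes every sample sits under the same CyVerse directory with a .fastq.gz suffix like the one above.

```shell
#!/usr/bin/env bash
# Assumption: all samples live in this CyVerse directory as <accession>.fastq.gz.
SRA_DIR=/iplant/home/shared/planetmicrobe/sra

# Hypothetical helper: read accessions from a list file and fetch each one.
# Prints the iget commands unless RUN=1, so the list can be checked first.
fetch_samples() {
    local list dest sample
    list="$1"
    dest="$2"
    while IFS= read -r sample || [ -n "$sample" ]; do
        [ -n "$sample" ] || continue          # skip blank lines
        if [ "${RUN:-0}" = "1" ]; then
            iget -PT "$SRA_DIR/$sample.fastq.gz" "$dest"
        else
            echo "iget -PT $SRA_DIR/$sample.fastq.gz $dest"
        fi
    done < "$list"
}
```

Calling fetch_samples samples.txt data prints the commands; setting RUN=1 beforehand (in an interactive session, after iinit) would actually run them.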
https://public.confluence.arizona.edu/display/UAHPC/HPC+Documentation
https://public.confluence.arizona.edu/display/UAHPC/Puma+Quick+Start
SRR5002308
SRR5002313
SRR9178237_1
SRR5002388
SRR5720237_1
SRR9178091_1
SRR5720251_1
SRR5720282_1
SRR5720279_1
SRR5002397
SRR5720342_1
SRR9178489_1
SRR5720276_1
SRR9178197_1
SRR5720233_1
SRR5720337_1
SRR5002331
SRR5720231_1
SRR5002401
SRR9178082_1
SRR9178356_1
SRR9178101_1
SRR5002349
SRR9178501_1
SRR9178319_1
SRR9178098_1
SRR9178375_1
SRR5002378
SRR5002337
SRR9178233_1
SRR9178483_1
SRR9178089_1
SRR5002311
SRR5002344
SRR9178407_1
SRR9178147_1
SRR9178359_1
SRR9178281_1
SRR5002319
SRR9178156_1
SRR9178118_1
- Check the slurm output file and see which rule crashed
- Check the error file and see if there's useful information about the crash
- Check the log file (found at e.g. results/SRR4831664/step_01_trimming/log) for specific information about the running of that step's executable.
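The last check in the list above can be scripted. This is a minimal sketch: show_step_log and the RESULTS_DIR override are hypothetical names, and it assumes the results/<sample>/<step>/log layout from the example path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last lines of one step's log for one sample.
# Usage: show_step_log <sample> <step> [line_count]
show_step_log() {
    local sample step log
    sample="$1"
    step="$2"
    log="${RESULTS_DIR:-results}/$sample/$step/log"
    if [ -f "$log" ]; then
        tail -n "${3:-20}" "$log"            # default to the last 20 lines
    else
        echo "no log at $log" >&2
        return 1
    fi
}
```

For example, show_step_log SRR4831664 step_01_trimming 50 prints the last 50 lines of that sample's trimming log.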
interactive
source ~/.bashrc
conda env create -f kraken2.yml
conda env create -f bracken.yml
conda env create -f pm_env.yml // this failed; make a new pm_env.yml with snakemake
# Steps to create pm_env again (do this in an interactive session)
conda create -n pm_env
conda activate pm_env
conda install -n base -c conda-forge mamba
// install mamba, which is used to install snakemake
mamba create -c conda-forge -c bioconda -n snakemake snakemake
// install snakemake; this made a new conda environment called snakemake
conda install -c conda-forge biopython
added biopython
conda install -c anaconda java-1.7.0-openjdk-cos6-x86_64
Also added Java. This didn't persist after logging back in; maybe it didn't get added to PATH? Try conda install -c conda-forge openjdk instead.
InterProScan docs: https://interproscan-docs.readthedocs.io/en/latest/UserDocs.html
Bash script:
/groups/bhurwitz/tools/interproscan-5.46-81.0/interproscan.sh -appl Pfam -i results/SRR4831664/step_05_chunk_reads/SRR4831664_trimmed_qcd_frags_2047.faa -b results/SRR4831664/step_06_get_orfs/SRR4831664_trimmed_qcd_frags_2047_interpro -goterms -iprlookup -dra -cpu 4
After installing Java 11 this works, but the log for step 6 still says:
Java version 11 is required to run InterProScan. Detected version 1.8.0_292 Please install the correct version.
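A message like this suggests the step 6 job may be picking up a different java than the interactive shell does. The sketch below reports which java is first on PATH in the current shell and its major version. The helper name java_major is hypothetical; the parsing assumes the usual version strings, where legacy ones such as 1.8.0_292 mean Java 8.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major Java version from a version string.
# Legacy strings such as 1.8.0_292 mean Java 8; newer ones such as 11.0.2 mean Java 11.
java_major() {
    local v
    v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;      # drop the legacy "1." prefix
    esac
    echo "${v%%[._-]*}"          # keep everything before the first ., _ or -
}

# Report which java the current shell would run, if any.
if command -v java >/dev/null 2>&1; then
    # java -version prints to stderr; the version is the quoted field.
    ver="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
    echo "java on PATH: $(command -v java) (major version $(java_major "$ver"))"
fi
```

Running the same check inside the job script, after the conda activate line, would show whether the job sees the same java as the interactive session.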
Changed the Snakemake interproscan rule to activate the snakemake conda env
In the directory job_runs/snakefile_versions we have the files regular_Snakefile (to do step 7) and Snakefile_upto_step4 (to do until step 4).
From /xdisk/bhurwitz/mig2020/rsgrps/bhurwitz/kai/planet-microbe-functional-annotation:
cp job_runs/snakefile_versions/Snakefile_upto_step4 Snakefile
cp job_runs/snakefile_versions/regular_Snakefile Snakefile
When running Snakefile_upto_step4, make sure to raise the time to 48 hours in run_snakemake.sh, and set it back to 24 for regular_Snakefile. Same with the cluster.yml file.
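Since the Snakefile swap and the wall time change have to happen together, a small wrapper could do the copy and print the matching reminder. This is only a sketch: use_snakefile is a hypothetical name, it must be run from the repo root, and it leaves editing the time in run_snakemake.sh and cluster.yml to be done by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy one of the saved Snakefile variants into place and
# print the wall time that these notes pair with it.
# Usage (from the repo root): use_snakefile <Snakefile_upto_step4|regular_Snakefile>
use_snakefile() {
    local variant hours
    variant="$1"
    case "$variant" in
        Snakefile_upto_step4) hours=48 ;;
        regular_Snakefile)    hours=24 ;;
        *) echo "unknown variant: $variant" >&2; return 1 ;;
    esac
    cp "job_runs/snakefile_versions/$variant" Snakefile || return 1
    echo "Using $variant: set the time to $hours hours in run_snakemake.sh and cluster.yml"
}
```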
Ran 23 samples (1 done in previous testing) for 48 hours; all but 3 finished. (Some were <1G.) Not using the multi-threaded version.
Ran 24 samples for nearly 48 hours. The snakejob ended but not all samples finished. The log file said the job had timed out and there were jobs remaining but no snakemake job, so I canceled the snakejobs and resubmitted. Not using the multi-threaded version.
First job ran the smallest 24 samples. First time with shell parameter mistake.
When testing the multi-threaded version, a single job submission sent out (at least at the time I captured it) 117 of the multi-threaded ips_0 nodes doing the interproscan step. Interproscan wasn't working again because I pulled Matt's version of the Snakefile without the following shell parameters, which I added back in before re-running the job for rule run_pipeline. Might need to also add them to rule interproscan.
bash -c '
. $HOME/.bashrc
conda activate snakemake