Skip to content

Commit

Permalink
updated documentation and scripts
Browse files Browse the repository at this point in the history
  • Loading branch information
vpbrendel committed Mar 22, 2022
1 parent 52b9958 commit d11f590
Show file tree
Hide file tree
Showing 44 changed files with 14,497 additions and 353 deletions.
309 changes: 309 additions & 0 deletions AllRice.def
Original file line number Diff line number Diff line change
@@ -0,0 +1,309 @@
bootstrap: docker
From: fedora:35

%help
This container provides working software versions for the AllRice project.

%post
dnf -y update
dnf -y install bc bzip2 findutils git lftp mlocate tcsh unzip zip wget which
dnf -y install gcc-c++ make automake cmake ruby
dnf -y install glibc-static
dnf -y install boost boost-devel
dnf -y install cairo-devel pango-devel zlib-devel
dnf -y install libnsl
dnf -y install python python-biopython python3-wheel python-wheel-wheel python3-virtualenv python3-pip
dnf -y install python3-h5py python3-devel
dnf -y install pandoc
dnf -y install java-11-openjdk java-11-openjdk-devel ant
dnf -y install perl-App-cpanminus
dnf -y install curl libcurl libcurl-devel ncurses ncurses-devel
dnf -y install openssl openssl-devel


### Read quality control

echo 'Installing FastQC from http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ '
cd /opt
wget http://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.9.zip
cpanm install FindBin
unzip fastqc_v0.11.9.zip
chmod +x FastQC/fastqc
\rm fastqc_v0.11.9.zip

echo 'Installing Trimmomatic from http://www.usadellab.org/cms/index.php?page=trimmomatic '
cd /opt
wget http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/Trimmomatic-0.39.zip
unzip Trimmomatic-0.39.zip
\rm Trimmomatic-0.39.zip
# To run Trimmomatic, use the following alias on the host:
#
#alias trimmomatic="singularity exec -e -B${PWD} AllRice.sif java -Xmx16G -jar /opt/Trimmomatic-0.39/trimmomatic-0.39.jar"


### Read manipulation

echo 'Installing SRAtoolkit from http://www.ncbi.nlm.nih.gov/books/NBK158900/ '
cd /opt
wget http://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.11.3/sratoolkit.2.11.3-ubuntu64.tar.gz
tar -xzf sratoolkit.2.11.3-ubuntu64.tar.gz
\rm sratoolkit.2.11.3-ubuntu64.tar.gz

echo 'Installing UMI-tools from https://github.com/CGATOxford/UMI-tools '
pip3 install --upgrade umi-tools

echo 'Installing HTSlib from http://www.htslib.org/ '
cd /opt
git clone https://github.com/samtools/htslib.git htslib
cd htslib
git submodule update --init --recursive
make && make install && make clean

echo 'Installing Samtools from http://www.htslib.org/ '
cd /opt
git clone https://github.com/samtools/samtools.git samtools
cd samtools
make && make install && make clean

echo 'Installing SeqKit from https://github.com/shenwei356/seqkit '
cd /opt
wget https://github.com/shenwei356/seqkit/releases/download/v2.1.0/seqkit_linux_amd64.tar.gz
tar -xzf seqkit_linux_amd64.tar.gz
chmod a+x seqkit
mv seqkit /usr/local/bin
\rm seqkit_linux_amd64.tar.gz


### Genome assembly tools:

echo 'Installing Flye from https://github.com/fenderglass/Flye '
cd /opt
git clone https://github.com/fenderglass/Flye.git
cd Flye/
python setup.py install

echo 'Installing Pilon, described at https://github.com/broadinstitute/pilon/wiki '
cd /opt
wget https://github.com/broadinstitute/pilon/releases/download/v1.24/pilon-1.24.jar
# To run pilon, use the following alias on the host (adjust memory as needed):
#
#alias pilon="singularity exec -e -B${PWD} AllRice.sif java -Xmx16G -jar /opt/pilon-1.24.jar"

echo 'Installing MaSuRCA from https://github.com/alekseyzimin/masurca '
cd /opt
wget https://github.com/alekseyzimin/masurca/releases/download/v4.0.7/MaSuRCA-4.0.7.tar.gz
tar -xzf MaSuRCA-4.0.7.tar.gz
cd MaSuRCA-4.0.7
sed -i -e "\$d" install.sh
\rm -rf Flye/
cd global-1/
sed -i -e "s#jellyfish samtools prepare ufasta quorum SuperReads SOAPdenovo2 PacBio MUMmer CA8 eugene#jellyfish ufasta quorum SuperReads MUMmer#" Makefile.am
autoreconf
cd ..
./install.sh

echo 'Installing Wengan from https://github.com/adigenova/wengan '
cd /opt
wget https://github.com/adigenova/wengan/releases/download/v0.2/wengan-v0.2-bin-Linux.tar.gz
tar -xzf wengan-v0.2-bin-Linux.tar.gz
chmod a+x /opt/wengan-v0.2-bin-Linux/wengan.pl


### Read mapping

echo 'Installing BWA '
cd /opt
git clone https://github.com/lh3/bwa.git
cd bwa
make
cp bwa /usr/local/bin/

echo 'Installing minimap2 from https://github.com/lh3/minimap2 '
cd /opt
wget https://github.com/lh3/minimap2/releases/download/v2.24/minimap2-2.24_x64-linux.tar.bz2
tar -jxvf minimap2-2.24_x64-linux.tar.bz2
cp minimap2-2.24_x64-linux/minimap2 /usr/local/bin

echo 'Installing Bowtie2 version 2.4.5 '
cd /opt
wget https://github.com/BenLangmead/bowtie2/releases/download/v2.4.5/bowtie2-2.4.5-linux-x86_64.zip
unzip bowtie2-2.4.5-linux-x86_64.zip
\rm bowtie2-2.4.5-linux-x86_64.zip

echo 'Installing hisat2 '
cd /opt
wget https://cloud.biohpc.swmed.edu/index.php/s/oTtGWbWjaxsQ2Ho/download
mv download hisat2-2.2.1.zip
unzip hisat2-2.2.1.zip
\rm hisat2-2.2.1.zip

echo 'Installing STAR from https://github.com/alexdobin/STAR '
cd /opt
git clone https://github.com/alexdobin/STAR


### Scaffolding tools

echo 'Installing RagTag '
cd /opt
git clone https://github.com/malonge/RagTag
cd RagTag
python3 setup.py install


### Repeat masking tools

echo 'Installing RMBlast from http://www.repeatmasker.org/ '
cd /opt
wget http://www.repeatmasker.org/rmblast-2.10.0+-x64-linux.tar.gz
tar -xzf rmblast-2.10.0+-x64-linux.tar.gz
\rm rmblast-2.10.0+-x64-linux.tar.gz

echo 'Installing TRF '
cd /opt
git clone https://github.com/Benson-Genomics-Lab/TRF.git
cd TRF
mkdir build && cd build
../configure
make
make install
make clean
cd ../..

echo 'Installing RepeatMasker from http://www.repeatmasker.org/ '
cd /opt
wget http://repeatmasker.org/RepeatMasker/RepeatMasker-4.1.1.tar.gz
tar -xzf RepeatMasker-4.1.1.tar.gz
echo "/usr/local/bin/trf" > rmcnf
echo "2" >> rmcnf
echo "/opt/rmblast-2.10.0/bin" >> rmcnf
echo "Y" >> rmcnf
echo "5" >> rmcnf
cd RepeatMasker
perl ./configure < ../rmcnf
cd Libraries
wget https://github.com/BrendelGroup/AllRice/raw/main/data/RITE-12-10-2020.tgz
tar -xzf RITE-12-10-2020.tgz
\rm RITE-12-10-2020.tgz
# ... obtained by download from https://www.genome.arizona.edu/cgi-bin/rite/index.cgi (selecting all Oryza entries)
cd ../..
\rm RepeatMasker-4.1.1.tar.gz rmcnf


### R, EBSeq, DESeq2, RSEM

echo 'Installing R '
cd /opt
dnf -y install R
echo 'repo <- "http://ftp.ussg.iu.edu/CRAN"' > R2install
echo 'install.packages("pafr", repos = repo)' >> R2install
echo 'install.packages("BiocManager", repos = repo)' >> R2install
echo 'BiocManager::install(c("EBSeq","DESeq2","R2HTML"), ask=FALSE)' >> R2install
Rscript R2install

echo 'Installing RSEM '
cd /opt
git clone https://github.com/deweylab/RSEM
cd RSEM/
# ... we do not want the unaligned reads showing up in the BAM file when using bowtie2, thus:
sed -i -e "s#--dpad 0#--no-unal --dpad 0#g" rsem-calculate-expression
make
make ebseq


### Alignment software

echo 'Installing BLAST+ version 2.12.0 from NCBI '
cd /opt
wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/ncbi-blast-2.12.0+-x64-linux.tar.gz
tar -xzf ncbi-blast-2.12.0+-x64-linux.tar.gz
cd ncbi-blast-2.12.0+/bin
cp * /usr/local/bin/
cd ../..
rm ncbi-blast-2.12.0+-x64-linux.tar.gz
cd ..

echo 'Installing MuSeqBox version 5.7 from BrendelGroup '
cd /opt
wget http://www.brendelgroup.org/bioinformatics2go/Download/MuSeqBox-5-7-2021.tar.gz
tar -xzf MuSeqBox-5-7-2021.tar.gz
cd MUSEQBOX5.7/src/
make linux
make install
make clean
cd ../..

echo 'Installing Vmatch aligner version 2.3.1 from http://vmatch.de '
#### Prerequisites
#### Install
cd /opt
wget http://vmatch.de/distributions/vmatch-2.3.1-Linux_x86_64-64bit.tar.gz
tar -xzf vmatch-2.3.1-Linux_x86_64-64bit.tar.gz
rm vmatch-2.3.1-Linux_x86_64-64bit.tar.gz
ln -s vmatch-2.3.1-Linux_x86_64-64bit VMATCH

echo 'Installing GenomeThreader version 1.7.3 spliced aligner '
cd /opt
wget http://genomethreader.org/distributions/gth-1.7.3-Linux_x86_64-64bit.tar.gz
tar -xzf gth-1.7.3-Linux_x86_64-64bit.tar.gz
rm gth-1.7.3-Linux_x86_64-64bit.tar.gz
ln -s gth-1.7.3-Linux_x86_64-64bit GENOMETHREADER


### Utilities

echo 'Installing the GenomeTools package: '
cd /opt
git clone https://github.com/genometools/genometools.git
cd genometools
make
make install
make clean
sh -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/genometools-x86_64.conf'
ldconfig
cd ..

echo 'Installing ngsutilsj from https://github.com/compgen-io/ngsutilsj.git '
cd /opt
git clone https://github.com/compgen-io/ngsutilsj
cd ngsutilsj
ant jar
ln -s /opt/ngsutilsj/dist/ngsutilsj /usr/local/bin/ngsutilsj
cd ../..

echo 'Installing assembly-stats from https://github.com/sanger-pathogens/assembly-stats '
cd /opt
git clone https://github.com/sanger-pathogens/assembly-stats.git
cd assembly-stats/
mkdir build
cd build
cmake ..
make
make install

echo 'Installing the AllRice package: '
cd /opt
git clone https://github.com/BrendelGroup/AllRice
cd ..


%environment
export LC_ALL=C
export PATH=$PATH:/opt/FastQC
export PATH=$PATH:/opt/sratoolkit.2.11.3-ubuntu64/bin
export PATH=$PATH:/opt/MaSuRCA-4.0.7/bin
export PATH=$PATH:/opt/wengan-v0.2-bin-Linux:/opt/wengan-v0.2-bin-Linux/bin
export PATH=$PATH:/opt/bowtie2-2.4.5-linux-x86_64
export PATH=$PATH:/opt/hisat2-2.2.1
export PATH=$PATH:/opt/STAR/bin/Linux_x86_64
export PATH=$PATH:/opt/RSEM
export PATH=$PATH:/opt/GENOMETHREADER/bin
export PATH=$PATH:/opt/RepeatMasker
export PATH=$PATH:/opt/AllRice/bin
export PATH=$PATH:/opt/VMATCH
export BSSMDIR="/opt/GENOMETHREADER/bin/bssm"
export GTHDATADIR="/opt/GENOMETHREADER/bin/gthdata"

%labels
Maintainer vpbrendel
Version v1.2
File renamed without changes.
50 changes: 36 additions & 14 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,31 +1,53 @@
# AllRice
Code for rice genome analyses (but not necessarily only rice)

## Quick Start [![https://www.singularity-hub.org/static/img/hosted-singularity--hub-%23e32929.svg](https://www.singularity-hub.org/static/img/hosted-singularity--hub-%23e32929.svg)](https://singularity-hub.org/collections/5032)
## Quick Start

The simplest way to get going is to use the AllRice
[Singularity](https://apptainer.org/) container available from our
[Singularity Hub](http://BrendelGroup.org/SingularityHub/) site; e.g.:

The easiest way to access the code is via the
[Singularity](https://www.sylabs.io/docs/) containers available from
[Singularity Hub](https://singularity-hub.org/).
Thus, once you know what you are doing, execution could be as simple as

```bash
singularity pull --name AllRice.sif shub://BrendelGroup/AllRice:latest
singularity pull --name DDN.sif shub://BrendelGroup/AllRice:ddn
cd
git clone https://github.com/BrendelGroup/AllRice
cd AllRice
wget https://BrendelGroup.org/SingularityHub/AllRice.sif
alias rws="singularity exec -e -B ~/AllRice/prj/LHRWBY2021 ~/AllRice/AllRice.sif"
rws ngsutilsj help
rws umi_tools --help
rws flye --help
wget https://BrendelGroup.org/SingularityHub/DDN.sif
singularity exec -e -B ~/AllRice/prj/LHRWBY2021 ~/AllRice/DDN.sif DiscovarDeNovo -h
```

which provides, for example:
In the above example, you clone this repository into your Linux home directory,
go into the thus created AllRice directory, download the AllRice Singularity
container, define the bash alias _rws_ ("run with singularity"), and check that
everything works by showing help information on some of the installed programs.
You then also download the DDN.sif image and show that you have the DiscoverDeNovo
assembler accessible.
For a listing of included software, see our [CATALOG](https://github.com/BrendelGroup/bgSingularityHub/blob/main/CATALOG.md).

Of course this assumes that you have [Apptainer/Singularity](https://apptainer.org/) installed on your system.
Check whether there is package built for your system.
Otherwise, follow the instructions to [install Singularity from source code](https://apptainer.org/user-docs/master/quick_start.html#quick-installation-steps).

```bash
singularity exec -e -B ${PWD} AllRice.sif ngsutilsj help
singularity exec -e -B ${PWD} AllRice.sif umi_tools --help
singularity exec -e -B ${PWD} DDN.sif DiscovarDeNovo -h
```

## Projects

* [LHRWBY2021](./prj/LHRWBY2021) Computational analyses discussed in

__Dangping Luo, Jose C. Huguet-Tapia, R. Taylor Raborn, Frank F. White, Volker P. Brendel & Bing Yang (2021)__
__Luo, D., Huguet-Tapia, J.C., Raborn, R.T., White, F.F., Brendel, V.P. & Yang, B. (2021)__
_The Xa7 resistance gene guards the rice susceptibility gene SWEET14 against exploitation by the bacterial blight pathogen._
[Plant Communications 2, 100164](https://www.sciencedirect.com/science/article/pii/S2590346221000390).

Follow the instructions in [0README](https://github.com/BrendelGroup/AllRice/blob/main/prj/LHRWBY2021/0README)
to reproduce the work (or use the workflow as a template for analysis of your own data sets).


## Contact

Please direct all comments and suggestions to
[Volker Brendel](<mailto:[email protected]>)
at [Indiana University](http://brendelgroup.org/).
Loading

0 comments on commit d11f590

Please sign in to comment.