GJSrMap: The smallRNA mapping pipeline

Overview:

An open source and fully customized pipeline that:

Maps all small non-coding RNAs to customized and comprehensive reference sequences
Is modularized and iterative
Run on an HPCC cluster (default) or on a computer/server
Performs quality control on the data
Provides a detailed summary of mapping:
- FastQC plots for every iteration
- MultiQC plots for every iteration
- Summary plots and statistics of smallRNA distribution and abundance
Provides raw and normalized (RPKM) counts
Detailed logs of every iteration and steps

Description

gjsrmap: SmallRNA mapping and analysis pipeline schematic:
- Pre-processing of the sequences:
  - This is the iteration 0 in the above schematic diagram
  - Preprocessing of the sequences to avoid multi mapping of the reads
  - Build custom reference sequence indexes
  - Removal of low quality reads
  - Removal of 3' adapter sequences and size reduction of the reads
- Iterative mapping of processed and filtered reads:
  - Iteration 1: Map reads between 16 to 33 bp to custom reference sequences of mature microRNAs and piRNAs
  - Iteration 2: Map reads greater than 32 bp to custom reference sequences of other small non-coding RNAs. These are:
```
| Other small non-coding RNAs | Description                                          |
|-----------------------------|------------------------------------------------------|
| rRNA                        | Ribosomal RNA                                        |
| scRNA                       | Small cytoplasmic RNA                                |
| snRNA                       | Small nuclear RNA                                    |
| snoRNA                      | Small nucleolar RNA                                  |
| premiRNA                    | microRNA precursors                                  |
| osncRNA                     | Other small noncoding RNA                            |
| - tRNA                      | - Transfer RNA                                       |
| - Mt-tRNA                   | - Transfer RNA located   in the mitochondrial genome |
| - misc_RNA                  | - Miscellaneous other RNA                            |
```
  - Iteration 3: Map the unmapped reads from iteration 1 and 2 to the species reference genome
- Count the reads and distribute them to individual smallRNA classes
- Generate QC, mapping and summary report
  - Sequence quality information
  - Bar plot of library sizes
  - Small non-coding RNA reads distrubution
  - Profile of expressed small non-coding RNAs (miRNAs in the above figure). Plots are also generated for other classes as well

Dependencies:

bedtools
bowtie
bowtie2
cutadapt
fastqc
matplotlib
multiqc
numpy
samtools
scipy

Usage:

Run wrapper with following options:

SPC=${1}                                            # Species: hsa or mmu or some other species
IFD=${2}                                            # Input Fastq Dir: input/fastq/test
ORD=${3}                                            # Output Results Dir: output/test
BWD=${4}                                            # path/to/bowtie/indexes
QUE=${5:-"fat"}                                     # mpi, fat, mpi-short, fat-short, mpi-long, fat-long
SPK=${6:-""}                                        # exiseq_spikein_dna_unique.fa or spike_rna1_unique.fa
threePadapter=${7:-"TGGAATTCTCGGGTGCCAAGG"}         # trueseq adapter
JID=${8:-"$(echo $HOME)/gjsrmap"}                   # Job dir
NCL=${9:-"input/annotation/rna_classes"}            # ncrna folder containing ncrna class fasta

Example command:
bash 06_run_ncRNA_mapping_usage.sh <above mentioned arguments>

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
input		input
scripts		scripts
test		test
.gitignore		.gitignore
00_gjsrmap_usage.sh		00_gjsrmap_usage.sh
02.1_map_pimirna_bowtie_usage.sh		02.1_map_pimirna_bowtie_usage.sh
02.2_map_smncrna_bowtie_usage.sh		02.2_map_smncrna_bowtie_usage.sh
03_map_reads_to_genome_bowtie_usage.sh		03_map_reads_to_genome_bowtie_usage.sh
04_counts_summary_usage.sh		04_counts_summary_usage.sh
05_cleanup_usage.sh		05_cleanup_usage.sh
06_run_ncrna_mapping_usage.sh		06_run_ncrna_mapping_usage.sh
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GJSrMap: The smallRNA mapping pipeline

Overview:

Description

Dependencies:

Usage:

About

Releases 1

Packages

Languages

gauravj49/gjsrmap

Folders and files

Latest commit

History

Repository files navigation

GJSrMap: The smallRNA mapping pipeline

Overview:

Description

Dependencies:

Usage:

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages