RNAseq README

RNAseq README

Flowchart

Pipeline overview

Starting with fastQ files as input, this pipeline has the following stages:

Pre-alignment QC with fastp (adapter-trimming + QC) and multiQC report collation
Alignment with STAR
Post-alignment QC and analysis with:

Infer library strandedness with RSeQC inferstrandedness.py
Outputs Transcript Integrity number (TIN) with RSeQC tin.py for evenness of coverage assessment
Outputs wig and normalised bigwig files for genome browser visualisations with RSeQC bam2wig and UCSC wig2BigWig binary

Quantification with FeatureCounts from Subread at the exon-level, grouped by gene_id into metafeatures.
[Conditional] If there are >=2 sample conditions and >=3 samples, we carry out the R secondary analysis which performs:

Differential gene expression across all pairwise comparisons
Overrepresentation analysis for pathway enrichment

Outputs

Name	Type	Extension	Description
fastpQcHtml	pre-alignment qc	.html files	fastp html reports
multiqcHtml	pre-alignment qc	.html file	multiqc html report
bamFiles	alignment results	.bam files	STAR sorted-by-coordinate bams
starLogs	post-alignment qc	log files	STAR log.final.out files
flagstat	post-alignment qc	log files	flagstat files
tin_summary	post-alignment qc	.txt files	rseqc tin summary files
tin_xls	post-alignment qc	.xls files	rseqc tin xls files
wigs	alignment results	.wig files	wig files
normalised bigwigs	alignment results	.bigwig files	bigwig files
countMatrix	quantification results	.txt	featureCounts count matrix
countsParsed	quantification results	.txt	Output of parse_counts.py which provides gene names
countsSummary	quantification results	.txt	featureCounts summary file
downstreamResDir	R secondary analysis results	.zip	R analysis results files

Docker images

Naming convention [Documentation]

dockerBase = 026171442599.dkr.ecr.ap-southeast-1.amazonaws.com/
dockerPrefix = species + genomeVersion
- E.g. “human” + “grch38”
dockerUri = dockerBase + dockerPrefix

Modules and images

For other species, replace docker image according to naming convention

Index	Module	Docker image name	Scripts/apps
1	countArrayUniqueItems.wdl	ubuntu	None
2	fastp.wdl	qc (apps_refless)	None
3	multiQc.wdl	qc (apps_refless)	None
4	pairsToR1R2.wdl	None	None
5	star.wdl	humangrch38rnaseq (fat Docker)	(ref_files/species): STAR indices, annotation.gtf
6	samtools.wdl	samtools (apps_refless)	None
7	rseqc.wdl	humangrch38rseqc (apps)	(ref_files/species): annotation.bed, houseKeepingGenes.bed, chrNameLength.txt; (ref_files/apps): wigToBigWig; (apps): processInferExperimentScript.py
8	featureCounts.wdl	humangrch38featureCounts (apps)	(ref_files/species): annotation.gtf; (apps) parse_counts.py
9	downstreamRNAseq.wdl	downstreamRNAseq (apps_refless)	(apps): main.R

Gitlab repositories

Main: RNAseq repository
Dockerfiles and configurations
Modules (checkout rnaseq branch)

Biodebian organisation

See gedac documentation

Submitting jobs to the cromwell server on biodebian

The options.json file is a dummy file required for running the cromshell submit subcommand. Currently, cromwell has the following IP: 172.18.149.93:7098.

cromshell submit ./main.wdl ./tests/tests_biodebian/rnaseq_mouse_grcm39.json ./options.json ./modules.zip

Name		Name	Last commit message	Last commit date
Latest commit History 46 Commits
modules @ 127d4af		modules @ 127d4af
modules_rnaseq		modules_rnaseq
tests		tests
.gitignore		.gitignore
.gitmodules		.gitmodules
README.md		README.md
build.json		build.json
form_template.json		form_template.json
input_structure.json		input_structure.json
main.wdl		main.wdl
options.json		options.json
output_structure.json		output_structure.json
testConfig.json		testConfig.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RNAseq README

Flowchart

Pipeline overview

Outputs

Docker images

Naming convention [Documentation]

Modules and images

Gitlab repositories

Biodebian organisation

Submitting jobs to the cromwell server on biodebian

Blog posts on GeDaC webpage

About

Releases

Packages

Languages

kane9530/bulkRNAseq-WDL-gedac

Folders and files

Latest commit

History

Repository files navigation

RNAseq README

Flowchart

Pipeline overview

Outputs

Docker images

Naming convention [Documentation]

Modules and images

Gitlab repositories

Biodebian organisation

Submitting jobs to the cromwell server on biodebian

Blog posts on GeDaC webpage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages