DNA Nanopore Genetic Disease Study Pipelines

Tooling to run the primary, secondary and tertiary pipelines for the DNA Nanopore Genetic Disease Study.

Requirements

Linux
make
Docker
50 GB disk space for the chromosome 11 sample
256 GB memory (?)
32+ cores (?)
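
The requirements above can be checked before starting. The following is a minimal sketch, not part of the repository: the function names missing_tools and report_resources are hypothetical, and it assumes GNU coreutils (nproc, df --output) and /proc/meminfo, which are standard on Linux. It only reports what it finds, since the memory and core figures above are themselves marked as uncertain.

```shell
# Preflight sketch (hypothetical helpers, not part of the pipeline).

missing_tools() {
    # Print the name of every tool in "$@" that is not found on PATH.
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

report_resources() {
    # Show core count, total memory and free disk space for the current directory.
    echo "cores: $(nproc)"
    echo "memory: $(awk '/MemTotal/ {printf "%.0f GB", $2 / 1024 / 1024}' /proc/meminfo)"
    echo "free disk: $(df -h --output=avail . | tail -n 1 | tr -d ' ')"
}

# Example usage:
#   missing_tools make docker    # prints nothing when both are installed
#   report_resources
```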

Quick Start

Clone this repo, create the samples and references directories, then download the references and a chromosome 11 sample and run the sniffles variant caller with annotations:

git clone https://github.com/ucsc-upd/pipelines.git
cd pipelines
mkdir -p samples references
make samples/na12878-chr11/na12878-chr11.sniffles.ann.vcf

NOTE: The samples and references directories can be symbolic links (e.g. to a scratch location or into a shared file system).
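
As an illustration of the symbolic-link setup, the sketch below creates both directories under a storage root and links them into the checkout. The function name link_data_dirs and the /scratch/$USER/dngds path are hypothetical. It is meant to be run from the pipelines directory in place of the mkdir step, and it leaves any existing real directory untouched.

```shell
# Hypothetical helper: keep samples/ and references/ on another file system.

link_data_dirs() {
    # Usage: link_data_dirs <storage-root>
    # Creates <storage-root>/samples and <storage-root>/references, then
    # symlinks them into the current directory.
    local root="$1" dir
    for dir in samples references; do
        if [ -d "$dir" ] && [ ! -L "$dir" ]; then
            # A real directory already exists here; do not replace it.
            echo "skipping $dir: already a real directory" >&2
            continue
        fi
        mkdir -p "$root/$dir"
        ln -sfn "$root/$dir" "$dir"
    done
}

# Example usage (path is an assumption):
#   link_data_dirs "/scratch/$USER/dngds"
```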

This will take approximately 30 minutes using 32 cores and generate the following output in samples/na12878-chr11:

1.3K Sep  8 11:00 minimap2.log
4.7G Sep  8 11:10 na12878-chr11.bam
3.6G Sep  8 10:25 na12878-chr11.fq.gz
  73 Sep  8 10:25 na12878-chr11.fq.gz.md5
5.8G Dec 21  2016 na12878-chr11.original.bam
 11G Sep  8 11:00 na12878-chr11.sam
3.1M Oct 22 12:51 na12878-chr11.sniffles.ann.vcf
835K Sep  8 11:29 na12878-chr11.sniffles.vcf
4.8G Sep  8 11:24 na12878-chr11.sorted.bam

Structural variant report

make samples/na12878-chr11/na12878-chr11.sv-report.html

Additional Samples

To process additional samples, place their FASTQ in samples/<id>/<id>.fq.gz and call make for any specific target. For example:

make samples/<id>/<id>.sniffles.vcf
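
The staging step can be sketched as a small helper that copies a FASTQ into the expected location and prints the matching make target. The function name stage_sample, the sample id patient1 and the input path are hypothetical. Copying rather than linking is an arbitrary choice here.

```shell
# Hypothetical helper: stage a FASTQ under samples/<id>/<id>.fq.gz.

stage_sample() {
    # Usage: stage_sample <id> <path/to/reads.fq.gz>
    # Copies the reads into place and prints the sniffles target for that sample.
    local id="$1" fastq="$2"
    mkdir -p "samples/$id"
    cp "$fastq" "samples/$id/$id.fq.gz"
    echo "samples/$id/$id.sniffles.vcf"
}

# Example usage (id and path are assumptions):
#   make "$(stage_sample patient1 /data/patient1.fq.gz)"
```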

Other Targets

We also use SVIM to call SVs from nanopore reads. These calls are created as part of making the SV report, but you can also create them directly with:

make samples/na12878-chr11/na12878-chr11.svim.vcf
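
To build every nanopore target named in this README for one sample, the target paths can be generated from the sample id. This is a sketch with a hypothetical function name, nanopore_targets. It only assembles the paths shown above and does not inspect the Makefile.

```shell
# Hypothetical helper: list the nanopore make targets for one sample id.

nanopore_targets() {
    # Usage: nanopore_targets <id>
    # Prints one target path per line, using the suffixes shown in this README.
    local id="$1" suffix
    for suffix in sniffles.vcf sniffles.ann.vcf svim.vcf sv-report.html; do
        echo "samples/$id/$id.$suffix"
    done
}

# Example usage:
#   make $(nanopore_targets na12878-chr11)
```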

Whole-genome sequencing with short reads

The commands below assume that two FASTQ files exist: PATH/TO/FILE_R1.fq.gz and PATH/TO/FILE_R2.fq.gz. New files will be created in the same folder (PATH/TO in this example).

To just align the reads and run GATK post-alignment best practices:

make PATH/TO/FILE.sorted.RG.MD.BQSR.bam

To clean up once the final BAM has been double-checked, remove intermediate BAMs and files with:

make PATH/TO/FILE.sorted.RG.MD.BQSR.bam.clean_temp

To call structural variants with smoove:

make PATH/TO/FILE.sorted.RG.MD.BQSR.smoove.vcf.gz
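
The short-read target names follow from the path of the R1 file, so they can be derived rather than typed by hand. The sketch below uses a hypothetical function name, short_read_targets, and assumes the R1 file name ends in _R1.fq.gz as in the example above. The clean_temp target is deliberately left out, because cleanup should happen only after the BAM has been checked.

```shell
# Hypothetical helper: derive the short-read make targets from the R1 FASTQ path.

short_read_targets() {
    # Usage: short_read_targets <PATH/TO/FILE_R1.fq.gz>
    # Strips the _R1.fq.gz suffix and prints the BAM and smoove targets.
    local prefix="${1%_R1.fq.gz}"
    echo "$prefix.sorted.RG.MD.BQSR.bam"
    echo "$prefix.sorted.RG.MD.BQSR.smoove.vcf.gz"
}

# Example usage:
#   make $(short_read_targets PATH/TO/FILE_R1.fq.gz)
```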
