Nextflow-pipeline

The pipeline analyzes exome-seq data generated by either Regeneron’s own pipeline (SPB) or the Functionally Equivalent (FE) pipeline from UK Biobank. First, the raw input data (.bed, .bim, .fam) are filtered and converted to VCF files by two plink2 processes. The third process annotates the variants (VCF file) with the Ensembl Variant Effect Predictor (VEP). The fourth process takes the annotated result and processes it with an in-house tool called SEAK. The tools used by all four processes are containerized in the Docker image.
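As a rough orientation, the first three stages can be pictured as the command lines below. This is only a sketch under stated assumptions: the function name `print_stage_commands`, the output file names, and the `--geno 0.1` filter are hypothetical placeholders, not values taken from main.nf, and the SEAK invocation is left out because it is not described here. The function only prints the commands, so it runs without plink2 or VEP installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch: print the kind of commands the first three stages
# might wrap. Nothing here is copied from main.nf.

print_stage_commands() {
    local prefix="$1"                 # PLINK fileset prefix
    local filter_opts="--geno 0.1"    # placeholder filter, NOT the pipeline's real setting

    # Process 1: filter the PLINK fileset.
    echo "plink2 --bfile ${prefix} ${filter_opts} --make-bed --out ${prefix}.filtered"
    # Process 2: export the filtered fileset as VCF.
    echo "plink2 --bfile ${prefix}.filtered --export vcf --out ${prefix}.filtered"
    # Process 3: annotate the VCF with VEP using a local cache.
    echo "vep --input_file ${prefix}.filtered.vcf --output_file ${prefix}.vep.txt --cache --offline --assembly GRCh38"
    # Process 4 (SEAK) would consume the VEP output; omitted on purpose.
}

print_stage_commands ukb_FE_50k_exome_seq
```

In the actual pipeline these steps are Nextflow processes running inside the Docker image, so the real options live in main.nf and nextflow.config.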

Installation

git clone https://github.com/HealthML/Nextflow-pipeline.git
cd Nextflow-pipeline
git checkout dev_nf

Nextflow

make install

Docker image installation

To build the Docker image that contains the tools for all processes, run the following Makefile target from the container directory.

cd container
make docker-build

How to run the pipeline

To run the pipeline on data generated by Regeneron’s own pipeline (SPB) or the Functionally Equivalent (FE) pipeline from UK Biobank using VEP's cache references, pass the sample set with --samples. For example, to run the samples from the FE pipeline, use the following command in the terminal.

./nextflow run main.nf -resume --samples ukb_FE_50k_exome_seq

The pipeline automatically downloads the hg38 FASTA file. However, for the current pipeline I am using a local reference genome (.fa), an annotation file (.gtf), and their corresponding index files (.fai and .tbi). To run the pipeline with these references, use the following command in the terminal.

./nextflow run main.nf --ref_fa /home/Alva.Rani/UKbiobank/derived/projects/kernels_VEP/Homo_sapiens.GRCh38.dna.primary_assembly.fa --gtf /home/Alva.Rani/data/reference/Homo_sapiens.GRCh38.97.gtf.gz --gtf_tbi /home/Alva.Rani/data/reference/Homo_sapiens.GRCh38.97.gtf.gz.tbi

If you have access to the VM server and the folders mentioned above, the index for the reference genome is already available there.
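Before launching with local references, it can help to confirm that the files and their indices are actually readable from your machine. The following is a minimal sketch, not part of the repository: the helper `check_refs` and the environment variables `REF_FA` and `GTF` are hypothetical, and it assumes the .fai index sits next to the FASTA as `<fasta>.fai` and the .tbi index next to the GTF as `<gtf>.tbi`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper: succeed only if every given file
# exists and is non-empty; report the missing ones on stderr.
check_refs() {
    local missing=0
    local f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# REF_FA and GTF are placeholders; set them to your own paths.
# The pipeline is launched only when both are set and all files are present.
if [ -n "${REF_FA:-}" ] && [ -n "${GTF:-}" ]; then
    check_refs "$REF_FA" "$REF_FA.fai" "$GTF" "$GTF.tbi" \
        && ./nextflow run main.nf --ref_fa "$REF_FA" --gtf "$GTF" --gtf_tbi "$GTF.tbi"
fi
```

If either variable is unset, the script does nothing, so it is safe to source it and call `check_refs` by hand.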

Otherwise, you can run the whole pipeline with the following one-liner:

./nextflow run main.nf
