A pipeline to detect variants in SARS-CoV-2 sequencing data

Installation

Bioconda is required to install the tool.

Use the following commands to create and activate the environment:

conda env update --file https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/environment.yml
conda activate sra2variant

If the above fails because of a network issue, try the alternative environment file:

conda env update --file https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/environment2.yml
conda activate sra2variant
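
To confirm that the environment is active, one option is to check that the commands used in the rest of this README resolve on PATH. The following is a minimal sketch; the helper name check_tools is hypothetical and not part of sra2variant.

```shell
# Hypothetical helper: report which of the given commands are missing from PATH.
# Prints one "missing: NAME" line per absent command and returns non-zero if any is absent.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, check_tools prefetch sra2variant-WGS-PE sra2variant-ARTIC-PE prints nothing when all three commands are available.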

Quickstart

WGS PE

First, a reference genome in FASTA format is needed. The following commands download the genome and store it as ./reference/NC_045512.2.fasta.

mkdir ./reference
wget https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/sra2variant/data/NC_045512.2.fasta
mv NC_045512.2.fasta ./reference
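
A truncated or failed download can leave an empty or partial file, so it may be worth inspecting the reference before running the pipeline. The sketch below counts records and bases with awk; the helper name fasta_summary is hypothetical and not part of sra2variant.

```shell
# Hypothetical helper: print the number of sequence records and total bases in a FASTA file.
fasta_summary() {
    awk '
        # A header line starts a new record.
        /^>/ { records++; next }
        # Any other line is sequence: strip whitespace, then count the residues.
        { gsub(/[ \t\r]/, ""); bases += length($0) }
        END { printf "records=%d bases=%d\n", records, bases }
    ' "$1"
}
```

Running fasta_summary ./reference/NC_045512.2.fasta should report a single record; the NC_045512.2 genome is 29,903 bases long.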

Download the read files in SRA format and store them in a separate directory. Here, two SRA files are stored in the ./wgs_reads directory.

mkdir ./wgs_reads
prefetch -o ./wgs_reads/SRR14119630.sra SRR14119630
prefetch -o ./wgs_reads/SRR14119629.sra SRR14119629
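
With more than a few accessions, the same prefetch call can be wrapped in a loop. This is a sketch only: fetch_accessions and the PREFETCH override variable are hypothetical names introduced for illustration, and the prefetch invocation is the one shown above.

```shell
# Hypothetical helper: fetch each accession into OUTDIR as OUTDIR/ACCESSION.sra,
# mirroring the prefetch calls above.
# PREFETCH can be overridden (for example with "echo") to do a dry run.
fetch_accessions() {
    outdir=$1
    shift
    mkdir -p "$outdir"
    for acc in "$@"; do
        # Stop at the first failed download so the problem is easy to spot.
        "${PREFETCH:-prefetch}" -o "$outdir/$acc.sra" "$acc" || return 1
    done
}
```

For example, fetch_accessions ./wgs_reads SRR14119630 SRR14119629 reproduces the two downloads above.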

Run the pipeline for WGS paired-end reads. In this example, we use the reference genome ./reference/NC_045512.2.fasta to analyze all SRA files in the ./wgs_reads directory.

sra2variant-WGS-PE -r ./reference/NC_045512.2.fasta -i ./wgs_reads
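
Because the pipeline processes a whole directory, it can help to list the input files beforehand. The sketch below assumes that the relevant files sit directly inside the input directory and end in .sra; how sra2variant itself discovers its inputs is not stated here, and list_sra is a hypothetical helper.

```shell
# Hypothetical helper: list the .sra files directly inside a directory, sorted by name.
# Assumption: input files carry the .sra extension and are not nested in subdirectories.
list_sra() {
    find "$1" -maxdepth 1 -type f -name '*.sra' | sort
}
```

For example, list_sra ./wgs_reads should print the two files downloaded above.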

ARTIC PE

First, a reference genome in FASTA format, the ARTIC primers in BED format, and an amplicon assignment in TSV format are needed. The following commands download the files and store them in ./reference.

mkdir ./reference

wget https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/sra2variant/data/NC_045512.2.fasta
mv NC_045512.2.fasta ./reference

wget https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/sra2variant/data/ARTIC_nCoV-2019_v3.bed
mv ARTIC_nCoV-2019_v3.bed ./reference

wget https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/sra2variant/data/ARTIC_amplicon_info_v3.tsv
mv ARTIC_amplicon_info_v3.tsv ./reference
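
The three downloads follow the same pattern, so they can be expressed as a loop. This is a sketch under stated assumptions: data_url, fetch_data, and the FETCH override variable are hypothetical names, and wget -P is used to write straight into the target directory instead of moving each file afterwards.

```shell
# Hypothetical helper: print the raw GitHub URL for a file in the repository's data directory.
data_url() {
    echo "https://raw.githubusercontent.com/wuaipinglab/sra2variant/main/sra2variant/data/$1"
}

# Hypothetical helper: download each named data file into DESTDIR.
# FETCH can be overridden (for example with "echo") to do a dry run.
fetch_data() {
    destdir=$1
    shift
    mkdir -p "$destdir"
    for name in "$@"; do
        # wget -P saves the file under the given directory prefix.
        "${FETCH:-wget}" -P "$destdir" "$(data_url "$name")" || return 1
    done
}
```

For example, fetch_data ./reference NC_045512.2.fasta ARTIC_nCoV-2019_v3.bed ARTIC_amplicon_info_v3.tsv fetches the same three files as the commands above.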

Download the read files in SRA format and store them in a separate directory. Here, two SRA files are stored in the ./artic_reads directory.

mkdir ./artic_reads
prefetch -o ./artic_reads/SRR14388832.sra SRR14388832
prefetch -o ./artic_reads/SRR14398873.sra SRR14398873

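Run the pipeline for ARTIC paired-end reads. In this example, we use the reference genome ./reference/NC_045512.2.fasta, the primer file ./reference/ARTIC_nCoV-2019_v3.bed, and the amplicon assignment ./reference/ARTIC_amplicon_info_v3.tsv to analyze all SRA files in the ./artic_reads directory.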

sra2variant-ARTIC-PE -r ./reference/NC_045512.2.fasta \
                     -p ./reference/ARTIC_nCoV-2019_v3.bed \
                     -a ./reference/ARTIC_amplicon_info_v3.tsv \
                     -i ./artic_reads/
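
Primer BED files name the reference sequence in their first column, and a mismatch with the FASTA record name is a common source of trouble in amplicon workflows. Whether sra2variant requires the names to match is an assumption, not something stated in this README; the sketch below only reports the difference. check_bed_names is a hypothetical helper.

```shell
# Hypothetical helper: compare the sequence names in a FASTA reference with the
# chromosome column (column 1) of a BED file. Succeeds only if every BED name
# appears as a FASTA record ID.
check_bed_names() {
    fasta=$1
    bed=$2
    # FASTA record ID: header text after ">" up to the first whitespace.
    ids=$(awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$fasta" | sort -u)
    rc=0
    # Skip comment, track, and browser lines as well as empty lines in the BED file.
    for chrom in $(awk '!/^(#|track|browser)/ && NF { print $1 }' "$bed" | sort -u); do
        if ! printf '%s\n' "$ids" | grep -qxF "$chrom"; then
            echo "not in reference: $chrom"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, check_bed_names ./reference/NC_045512.2.fasta ./reference/ARTIC_nCoV-2019_v3.bed prints nothing when the names agree.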

Other pipelines are under development.
