Calculate genome-wide synonymous nucleotide diversity


This program calculates nucleotide diversity across genome-wide 4-fold synonymous SNPs. Compared with using all sites, restricting the calculation to 4-fold synonymous sites minimizes the relative effect of sequencing errors on the calculated diversity levels.
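As a rough illustration of the quantity being computed, the sketch below sums per-SNP expected heterozygosity and divides by the number of analyzed sites. It assumes biallelic SNPs and the standard unbiased estimator 2j(n-j)/(n(n-1)); the estimator actually used by the scripts is not stated here, and the function names `site_heterozygosity` and `nucleotide_diversity` are hypothetical.

```python
from typing import Iterable, Tuple


def site_heterozygosity(alt_count: int, n_chrom: int) -> float:
    """Expected heterozygosity at one biallelic site.

    Assumption: unbiased estimator 2*j*(n-j) / (n*(n-1)), where j is the
    alternate-allele count and n the number of sampled chromosomes.
    """
    if n_chrom < 2:
        raise ValueError("need at least two sampled chromosomes")
    if not 0 <= alt_count <= n_chrom:
        raise ValueError("alt_count must lie between 0 and n_chrom")
    ref_count = n_chrom - alt_count
    return 2.0 * alt_count * ref_count / (n_chrom * (n_chrom - 1))


def nucleotide_diversity(snps: Iterable[Tuple[int, int]], n_sites: int) -> float:
    """Sum per-SNP heterozygosity and divide by all analyzed sites.

    `snps` yields (alt_count, n_chrom) pairs; `n_sites` counts every
    analyzed site, including invariant ones.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return sum(site_heterozygosity(a, n) for a, n in snps) / n_sites
```

Note that the denominator is the total number of analyzed 4-fold sites, not the number of SNPs; invariant sites contribute zero heterozygosity but still count toward the total.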

Environment

Follow the instructions in the main documentation to set up the conda environment 'WGS_analysis'.

How to run this program

Configure file paths and parameters in this shell script and run it in bash.

In most cases, you do not have to modify the core Python scripts, which count the total number of analyzed sites and extract 4-fold synonymous sites within each chromosome/contig, or the parallelizing shell script, which calculates within-contig cumulative heterozygosity in parallel and then merges the results into genome-wide nucleotide diversity.
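The merge step described above can be sketched as follows. This is only an illustration under stated assumptions: each contig is represented by a list of per-SNP heterozygosity values plus its count of analyzed sites, and a thread pool stands in for the shell-level parallelism. The names `contig_summary` and `genome_wide_pi` are hypothetical and do not come from the repository.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def contig_summary(per_snp_het: List[float], n_sites: int) -> Tuple[float, int]:
    """Return (cumulative heterozygosity, analyzed sites) for one contig."""
    return sum(per_snp_het), n_sites


def genome_wide_pi(contigs: Dict[str, Tuple[List[float], int]],
                   workers: int = 4) -> float:
    """Summarize contigs in parallel, then merge into one genome-wide value."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        summaries = list(pool.map(lambda item: contig_summary(*item),
                                  contigs.values()))
    total_het = sum(het for het, _ in summaries)
    total_sites = sum(sites for _, sites in summaries)
    if total_sites == 0:
        raise ValueError("no analyzed sites across contigs")
    # Divide pooled heterozygosity by pooled sites, so long contigs
    # weigh more than short ones.
    return total_het / total_sites
```

Pooling the sums before dividing differs from averaging per-contig diversity values: the latter would give a short contig the same weight as a whole chromosome.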

Input & output

Check variables in this shell script to see input and output files.

Detailed steps

  1. Generate VCF and mpileup files of 4-fold synonymous SNPs and filtered sites from the previously generated VCF and mpileup files;
  2. Calculate nucleotide diversity from the generated files.
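
For readers unfamiliar with 4-fold synonymous sites in step 1, the sketch below shows one way to locate them in a coding sequence. It assumes the standard nuclear genetic code and an in-frame, forward-strand CDS string; it is not the repository's extraction script, and `fourfold_positions` is a hypothetical name.

```python
from typing import List

# Codon prefixes whose third position is 4-fold degenerate in the standard
# genetic code: Leu (CT), Val (GT), Ser (TC), Pro (CC), Thr (AC), Ala (GC),
# Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_positions(cds: str) -> List[int]:
    """Return 0-based positions of 4-fold degenerate sites in an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    positions = []
    for start in range(0, usable, 3):
        codon = cds[start:start + 3]
        # Skip codons with an ambiguous base at the third position.
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            positions.append(start + 2)
    return positions
```

For example, in `ATGGCTAAACTGTAA` the codons `GCT` (Ala) and `CTG` (Leu) each contribute their third base, at positions 5 and 11.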