This program calculates nucleotide diversity across genome-wide 4-fold synonymous SNPs. Restricting the analysis to these sites, rather than using all sites, minimizes the relative effect of sequencing errors on the calculated diversity levels.
Follow the instructions in the main documentation to set up the conda environment 'WGS_analysis'.
Configure file paths and parameters in this shell script and run it in bash.
In most cases, you do not have to modify the core Python scripts, which count the total number of analyzed sites and extract 4-fold synonymous sites within each chromosome/contig, or the parallel shell script, which calculates the cumulative within-contig heterozygosity in parallel and then merges the results into genome-wide nucleotide diversity.
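For readers unfamiliar with how 4-fold synonymous sites are identified, the sketch below illustrates the general idea. It is not the code from the core Python scripts. It assumes the standard genetic code and an in-frame, forward-strand coding sequence; the names `FOURFOLD_PREFIXES` and `fourfold_site_offsets` are hypothetical and were chosen for this example only.

```python
# Codon prefixes (first two bases) for which any third base encodes the
# same amino acid under the standard genetic code:
# Leu (CT), Val (GT), Ser (TC), Pro (CC), Thr (AC), Ala (GC), Arg (CG), Gly (GG).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_site_offsets(cds):
    """Return 0-based offsets of 4-fold degenerate third-codon positions.

    Assumption: `cds` is an in-frame coding sequence on the forward strand.
    Trailing bases that do not form a complete codon are ignored.
    """
    cds = cds.upper()
    usable_length = len(cds) - len(cds) % 3
    offsets = []
    for start in range(0, usable_length, 3):
        codon = cds[start:start + 3]
        # Skip codons with ambiguous bases (e.g. N) at the third position.
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            offsets.append(start + 2)
    return offsets


if __name__ == "__main__":
    # Codons: ATG (Met), GCT (Ala), TTT (Phe), GGA (Gly)
    print(fourfold_site_offsets("ATGGCTTTTGGA"))
```

In a real annotation-driven workflow, the offsets would still need to be mapped back to chromosome/contig coordinates and strand, which this sketch does not attempt.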
Check variables in this shell script to see input and output files.
- Generate vcf files and mpileup files of 4-fold synonymous SNPs and filtered sites from previously generated vcf and mpileup files;
- Calculate nucleotide diversity from files generated.
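The last step can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline's own code: it assumes per-site heterozygosity is the usual unbiased estimate computed from allele counts, and that genome-wide diversity is the summed heterozygosity divided by the total number of analyzed sites. The function names and the input layout (lists of allele counts per SNP) are hypothetical.

```python
def site_heterozygosity(allele_counts):
    """Unbiased expected heterozygosity at one site from its allele counts.

    h = n / (n - 1) * (1 - sum(p_i ** 2)), where n is the number of sampled
    chromosomes and p_i is the frequency of allele i.
    """
    n = sum(allele_counts)
    if n < 2:
        return 0.0  # Not enough samples to estimate heterozygosity.
    sum_sq = sum((count / n) ** 2 for count in allele_counts)
    return n / (n - 1) * (1.0 - sum_sq)


def contig_summary(snp_allele_counts, n_analyzed_sites):
    """Return (cumulative heterozygosity, analyzed site count) for one contig."""
    cumulative = sum(site_heterozygosity(c) for c in snp_allele_counts)
    return cumulative, n_analyzed_sites


def genome_wide_pi(contig_summaries):
    """Merge per-contig summaries into a single nucleotide diversity value."""
    total_het = sum(het for het, _ in contig_summaries)
    total_sites = sum(sites for _, sites in contig_summaries)
    if total_sites == 0:
        raise ValueError("no analyzed sites")
    return total_het / total_sites


if __name__ == "__main__":
    # Two hypothetical contigs: allele counts per SNP, plus analyzed site totals.
    contig_a = contig_summary([[5, 5], [9, 1]], 1000)
    contig_b = contig_summary([[8, 2]], 500)
    print(genome_wide_pi([contig_a, contig_b]))
```

Keeping only a pair of running totals per contig is what makes the per-contig work easy to run in parallel: the merge step just adds the numerators and denominators.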