Calculate synonymous between-population sequence distance (D_XY)

This program calculates pairwise between-population sequence distance (D_XY) across genome-wide 4-fold synonymous SNPs.

Environment

Follow the instructions in the main documentation to set up the conda environment 'WGS_analysis'.

Code by step

Generate VCF and mpileup files of 4-fold synonymous SNPs and analyzed sites from the previously generated VCF and mpileup files: configure the file paths and parameters in run_extract_4_syn_sites.sh and run it in bash (this step uses the same code as synonymous F_ST).
Calculate D_XY from the files generated in the previous step, using the same scripts as for genome-wide D_XY: configure the file paths and parameters in run_calc_syn_dxy.sh and run it in bash.

In most cases, you do not have to modify the core Python script, which counts the total number of analyzed sites and calculates D_XY within each chromosome/contig, or the parallelizing shell script, which calculates within-contig D_XY in parallel and then merges the results into genome-wide synonymous D_XY.

Similarly, you do not have to modify the core Python script that extracts 4-fold synonymous sites, or the parallelizing shell script that extracts sites within each chromosome/contig in parallel.
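The merge of per-contig results into a genome-wide value can be sketched as follows. This is a minimal illustration, not the repository's code: the function name merge_contig_dxy and the input layout (one pair per contig holding the summed per-site D_XY and the number of analyzed sites) are assumptions made for the example, and the numbers are invented.

```python
def merge_contig_dxy(per_contig):
    """Combine per-contig results into one genome-wide D_XY.

    per_contig: iterable of (sum_of_per_site_dxy, n_analyzed_sites) pairs.
    This layout is hypothetical; the actual scripts may store results differently.
    """
    total_diff = 0.0
    total_sites = 0
    for diff_sum, n_sites in per_contig:
        total_diff += diff_sum
        total_sites += n_sites
    if total_sites == 0:
        raise ValueError("no analyzed sites in any contig")
    # Dividing the pooled sum by the pooled site count weights each contig
    # by the number of sites it contributes.
    return total_diff / total_sites


# Invented numbers for two contigs.
contigs = [(1.5, 1000), (0.5, 1000)]
print(merge_contig_dxy(contigs))
```

Pooling the sums before dividing, rather than averaging the per-contig D_XY values, keeps short contigs from having the same influence as long ones.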

Input & output

Check the variables in run_extract_4_syn_sites.sh and run_calc_syn_dxy.sh to see the input and output files.

Notes

D_XY is an absolute measure of population differentiation, and is independent of levels of within-population diversity.
Though the formula is applicable to tri- or tetra-allelic sites, the calculation is limited to the two most frequent alleles for consistency with previous calculations of nucleotide diversity.
The two most frequent alleles do not have to be the same among samples, e.g., genotype A/T for sample1 and A/G for sample2, where D_XY between sample1 and sample2 should be calculated as AF(1A) x AF(2G) + AF(1T) x AF(2A) + AF(1T) x AF(2G).
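The per-site calculation in the example above can be sketched as a sum over all pairs of differing alleles. This is an illustration under stated assumptions, not the repository's implementation: the function name site_dxy and the dictionary representation of allele frequencies are hypothetical, and the frequencies assume heterozygous genotypes (0.5 per allele).

```python
def site_dxy(freqs_x, freqs_y):
    """Per-site D_XY: probability that an allele drawn from X differs from one drawn from Y.

    freqs_x, freqs_y: dicts mapping allele -> frequency (hypothetical layout),
    each already restricted to that sample's two most frequent alleles.
    """
    return sum(
        fx * fy
        for allele_x, fx in freqs_x.items()
        for allele_y, fy in freqs_y.items()
        if allele_x != allele_y  # only pairs of different alleles contribute
    )


# Genotype A/T for sample1 and A/G for sample2.
sample1 = {"A": 0.5, "T": 0.5}
sample2 = {"A": 0.5, "G": 0.5}

# AF(1A) x AF(2G) + AF(1T) x AF(2A) + AF(1T) x AF(2G)
print(site_dxy(sample1, sample2))  # 0.75
```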
