Taurine pangenome uncovers a segmental duplication upstream of KIT associated with breed-defining depigmentation in white-headed cattle
This project focuses on the white-headed coat color phenotype observed in multiple cattle breeds, including Hereford and Simmental, and on identifying the large structural variant upstream of KIT associated with it.
By mapping additional public short-read data, covering other breeds with either the white-headed or the color-headed phenotype, directly to the graph, we could further validate that the alleles of the structural variant segregate with head depigmentation.
The main steps include:
- Pangenome graph construction per chromosome with pggb
- Download short-read data from public databases (fastq_dl)
- Align the short reads to the ARS-UCD1.2 bovine reference genome (strobealign) and extract those mapped to the region of interest (samtools)
- Align the short reads to the graph with vg giraffe and calculate the coverage per node in the graph (gafpack)
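
To make the last step more concrete, the sketch below illustrates what "coverage per node" means for graph alignments in GAF format. It is not gafpack's implementation: it only counts how many alignments traverse each node, reading the path from column 6 of each GAF record, and ignores alignment length and any scaling. The reads, node IDs and the `node_coverage` helper are hypothetical and only serve as a toy example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy GAF records (tab-separated); only column 6 (the path) is used here.
# Read names and node IDs are made up for illustration.
gaf=$(mktemp)
trap 'rm -f "$gaf"' EXIT
printf 'read1\t100\t0\t100\t+\t>1>2>3\t300\t10\t110\t100\t100\t60\n' >  "$gaf"
printf 'read2\t100\t0\t100\t+\t<3<2\t200\t5\t105\t100\t100\t60\n'    >> "$gaf"
printf 'read3\t100\t0\t100\t+\t>2>4\t200\t0\t100\t100\t100\t60\n'    >> "$gaf"

# Count, for every node, how many alignments traverse it.
# Output: node ID and count, tab-separated, sorted by node ID.
node_coverage() {
  awk -F'\t' '
    {
      # Split the path on the orientation markers ">" and "<";
      # the first piece is empty because the path starts with a marker.
      n = split($6, nodes, /[<>]/)
      for (i = 2; i <= n; i++) count[nodes[i]]++
    }
    END { for (id in count) print id "\t" count[id] }
  ' "$1" | sort -k1,1n
}

node_coverage "$gaf"
```

In this toy input, node 2 is traversed by all three reads and node 3 by two of them, regardless of read orientation.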
The coverages can then be visualised in Bandage to see sample coverage across the pangenome.
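
As a rough sketch of how coverages might be prepared for Bandage, the snippet below converts a per-node coverage table into a CSV with a header row and the node name in the first column, which is, to our understanding, the layout Bandage expects when loading CSV data. It assumes the coverages are available as a two-column tab-separated table (node ID, coverage), which may differ from the actual gafpack output layout. The `coverage_to_csv` helper and the sample label are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Turn a "node<TAB>coverage" table into a CSV with a header row.
# $1: input TSV, $2: column name for the coverage values (e.g. a sample label).
coverage_to_csv() {
  awk -F'\t' -v sample="$2" '
    BEGIN { print "Node," sample }
    { print $1 "," $2 }
  ' "$1"
}

# Toy input with made-up node IDs and coverages.
tsv=$(mktemp)
trap 'rm -f "$tsv"' EXIT
printf '1\t1\n2\t3\n3\t2\n' > "$tsv"

coverage_to_csv "$tsv" sample1
```

The node names in the first column need to match the node names in the graph file loaded into Bandage.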
A diagram of the workflow rule graph can be generated with:
snakemake -s Snakefile --rulegraph --configfile .test/config/test.yaml | dot -Tsvg > workflow.svg