In this project, we aimed to identify genes involved in the adaptation of wild plants to different geographical niches such as desert, mountainous and coastal ranges. We applied a population genomics approach to quantify genetic diversity, infer demographic patterns and identify genomic regions under positive selection in nine distinct Solanum chilense populations.
The first folder, Accompanying_text_files, contains all the text files used in the different analyses and to make the figures. The other two folders contain all the bash and R scripts used for analysing and visualizing Illumina whole-genome sequence data from 99 wild tomato plants.
The following analyses were performed with the sequence data:
- FastQC - Quality check of raw sequence data
- Single Nucleotide Polymorphism calling - SNP calling using GATK against a PacBio reference genome scaffolded with Hi-C
- GATK quality check - Quality check of called SNPs using bash one-liners
- SNP annotation - SNP annotation with SnpEff
- Breadth and Coverage of sequence data - Estimating average coverage and breadth of each BAM file after aligning to the reference genome
- Get various statistics - Extract statistics such as heterozygosity, number of singletons and length of each scaffold
- Admixture - Admixture analysis using linkage-pruned SNP data to identify genetic clusters
- Site frequency spectrum and theta - SAF and theta estimation using ANGSD
- LD decay - Linkage-disequilibrium decay analysis in each population
- Phylogenetics - Maximum likelihood tree, neighbour-joining tree, SplitsTree analysis
- Population genetics - Estimating nucleotide diversity (pi), Tajima's D, pairwise Fst and Dxy (absolute measure of divergence)
- Treemix - Treemix analysis to evaluate gene flow and migration
- Demographic analysis - Demographic analysis of changes in effective population size over time using PSMC and MSMC2
- Selection scan - Selection scan analysis using SweeD and RAiSD
- Finding genes with Bedtools - Extract gene information based on a gene annotation file and significant genomic regions under selection
- GWAS - A demo script to perform GWAS using flowering time in S. chilense
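
As an illustration of the "Breadth and Coverage" step listed above, here is a minimal sketch of how average coverage and breadth can be derived from per-base depth. It is not the script shipped in this repository. It assumes a three-column, tab-separated input (scaffold, position, depth), which is the layout written by `samtools depth -a`. The function name `coverage_summary` and the file name `sample.bam` are hypothetical, and the toy input stands in for real data.

```bash
#!/usr/bin/env bash
# Summarise per-base depth read from stdin.
# Input: tab-separated scaffold, position, depth (layout of `samtools depth -a`).
# Optional argument: minimum depth for a position to count as covered (default 1).
coverage_summary() {
    awk -F'\t' -v min_depth="${1:-1}" '
        {
            n++                                # positions seen
            total += $3                        # summed depth
            if ($3 >= min_depth) covered++     # positions meeting the threshold
        }
        END {
            if (n == 0) { print "no positions read" > "/dev/stderr"; exit 1 }
            printf "positions=%d mean_depth=%.2f breadth=%.4f\n", n, total / n, covered / n
        }'
}

# Toy input standing in for: samtools depth -a sample.bam (hypothetical file)
printf 'scaf1\t1\t0\nscaf1\t2\t10\nscaf1\t3\t20\nscaf1\t4\t10\n' | coverage_summary
# prints: positions=4 mean_depth=10.00 breadth=0.7500
```

Breadth here is the fraction of reported positions with depth at or above the threshold, so the input should include zero-depth positions (the `-a` option) for the denominator to reflect the full reference length.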
The following software was used:
- FastQC
- GATK
- bwa
- picard
- samtools
- vcftools
- bcftools
- plink
- Admixture
- ANGSD
- fasttree
- PGDSpider
- PopLDdecay
- Pixy
- SnpEff
- Treemix
- PSMC
- MSMC2
- RAiSD
- SweeD
- Bedtools
- vcf2gwas
- R
Author:
- Anik Dutta