These are additional workflows that we use with phyluce to accomplish a variety of repetitive tasks (e.g., read mapping, contig correction, etc.).
- Check out the code somewhere:

  ```bash
  git clone https://github.com/faircloth-lab/phyluce-workflows
  ```
- Navigate to that location and install dependencies from the conda environment file:

  ```bash
  cd <path to wherever you cloned>
  conda env create -f environment.yml
  ```
- Activate the conda environment:

  ```bash
  conda activate phyluce-workflows
  ```
- Navigate to the directory that contains the workflow you want to run, edit the config file appropriately, and run snakemake:

  ```bash
  cd mapping
  # <edit config file or run using example>
  # change cores to suit your system
  snakemake --cores 1
  ```
Right now, the remaining workflows are built on top of the `mapping` workflow, meaning that you need to run `mapping` first, regardless of which other workflows you run (see the sketch below).
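For example, a minimal sketch of the run order, assuming each workflow lives in its own directory named after the workflow (the `contig-correction` directory name here is an assumption; check the repository for the actual names):

```bash
# the mapping workflow must run first; the others consume its BAMs
cd mapping
snakemake --cores 1

# then run any downstream workflow, e.g. contig correction
# (directory name is an assumption; use the actual name in the repo)
cd ../contig-correction
snakemake --cores 1
```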
- Mapping: Map raw reads to species-specific contigs. Uses bwa, samtools, pandas, and a custom script to map reads, perform duplicate detection and marking, and compute coverage across contigs using a few metrics (a rough sketch of the equivalent manual steps appears after this list).
- Contig-correction: Using pre-existing BAM files (perhaps from Mapping), use bcftools and depth-of-coverage information to call SNPs in contigs, remove bad calls, and output a consensus of the results. The BAM files may also have been run through mapDamage. Filters for removal are `--IndelGap 5 --SnpGap 5 --exclude 'QUAL<20 | DP<5 | AN>2'`, plus any sequences that are reduced to < 50 bp after filtering (see the bcftools sketch after this list).
- Phasing: Using pre-existing BAM files (perhaps from Mapping), use samtools to phase SNPs. The workflow phases SNPs with samtools, produces `0.BAM` and `1.BAM` files for each haplotype, then converts those to FASTA data representing each haplotype using pilon. Along with the FASTA files, pilon outputs a `changes` file and a `vcf` file for each haplotype. This probably still needs a little work to deal with the low-coverage FASTAs that are produced (see the samtools/pilon sketch after this list).
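For reference, here is a minimal sketch of the manual equivalent of the mapping steps; the filenames (`contigs.fasta`, `reads_R1.fq.gz`, `reads_R2.fq.gz`, `sample.*`) are hypothetical, and the actual workflow uses a custom script, so details will differ:

```bash
# index the species-specific contigs
bwa index contigs.fasta

# map reads, add mate-score tags for duplicate marking, coordinate-sort
bwa mem -t 4 contigs.fasta reads_R1.fq.gz reads_R2.fq.gz \
    | samtools fixmate -m - - \
    | samtools sort -o sample.sorted.bam -

# detect and mark duplicates, then index the result
samtools markdup sample.sorted.bam sample.markdup.bam
samtools index sample.markdup.bam

# per-contig coverage metrics (breadth, mean depth, mean quality)
samtools coverage sample.markdup.bam > sample.coverage.tsv
```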
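Similarly, a minimal sketch of the contig-correction idea, using the filter expressions listed above; filenames are hypothetical, the length filter (< 50 bp) is applied afterwards and is not shown, and the workflow's actual bcftools invocations may differ:

```bash
# call variants in the contigs from an existing BAM
bcftools mpileup -f contigs.fasta sample.markdup.bam \
    | bcftools call -mv -Oz -o sample.calls.vcf.gz

# remove bad calls with the filters listed above
bcftools filter --IndelGap 5 --SnpGap 5 \
    --exclude 'QUAL<20 | DP<5 | AN>2' \
    -Oz -o sample.filtered.vcf.gz sample.calls.vcf.gz
bcftools index sample.filtered.vcf.gz

# output the consensus of the filtered calls
bcftools consensus -f contigs.fasta sample.filtered.vcf.gz > sample.consensus.fasta
```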
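Finally, a minimal sketch of the phasing steps; `samtools phase -b PREFIX` writes `PREFIX.0.bam` and `PREFIX.1.bam`, and the pilon flags shown (`--vcf`, `--changes`) produce the per-haplotype VCF and changes files mentioned above. Filenames are hypothetical:

```bash
# split an existing BAM into two haplotype BAMs:
# sample.phased.0.bam and sample.phased.1.bam
samtools phase -b sample.phased sample.markdup.bam

# convert each haplotype BAM to FASTA with pilon, which also
# writes a .changes file and a .vcf file per haplotype
for hap in 0 1; do
    samtools index sample.phased.${hap}.bam
    pilon --genome contigs.fasta --bam sample.phased.${hap}.bam \
          --output sample.${hap} --outdir pilon-${hap} --vcf --changes
done
```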