This package is intended to help overcome limitations when estimating a model's evolutionary history. It takes user-provided files and parameters and creates several random topologies for the user to then test using fastsimcoal28.
- Clone the CoalMiner repo onto your machine:
git clone https://github.com/raywray/CoalMiner.git
- Move into the CoalMiner directory:
cd CoalMiner
- Create a conda environment with all necessary packages (Anaconda or Miniconda must already be installed) using the following command:
conda env create -f environment.yml
- Activate the conda environment:
conda activate coalminer_env
CoalMiner requires only 2 types of input files: an SFS and a .yml file with user parameters.
Users can create their SFSs from .vcf files using several packages, including the PPP pipeline (recommended; see footnote 1). These files MUST be placed directly in the CoalMiner directory, which can be accomplished with the following command:
cp [prefix]_joint*.obs CoalMiner/
# Where [prefix] is your chosen internal prefix for your files
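Before running CoalMiner, it can help to confirm that the SFS files ended up where they are expected. The following is a minimal sketch and not part of CoalMiner itself: the helper name `find_joint_sfs_files` is hypothetical, and it assumes only the `[prefix]_joint*.obs` naming pattern used in the command above.

```python
from pathlib import Path


def find_joint_sfs_files(directory, prefix):
    """Return sorted paths matching {prefix}_joint*.obs directly inside `directory`."""
    # A glob pattern without "**" does not recurse, so only files placed
    # directly in the directory (not in subdirectories) are matched.
    return sorted(Path(directory).glob(f"{prefix}_joint*.obs"))


if __name__ == "__main__":
    # "hom_sap" is the prefix used by the bundled example files.
    matches = find_joint_sfs_files(".", "hom_sap")
    if not matches:
        print("No matching SFS files found in the current directory.")
    for path in matches:
        print(path.name)
```

Run from inside the CoalMiner directory, this lists the SFS files that match the chosen prefix, or reports that none were found.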
The .yml file must contain the following information:
- INPUT_PREFIX: the output prefix the user would like to use
- NUM_POPS: the number of populations being tested
- SAMPLE_SIZES: as a bulleted list, the number of sampled individuals from each population
Additionally, prior distributions, ranges and types must be provided for (under the MODEL_PARAMS section):
- mutation rates: mutation_rate_dist
- effective population sizes: effective_pop_size_dist
- migration rates: migration_dist
- time in generations: time_dist
- and an optional value for the maximum number of generations between events (default=1000): max_time_between_events
Optional Parameters:
- OUTPUT_DIR: path for output
- NUM_RANDOM_MODELS: the number of random topologies to generate (default=100)
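The required keys listed above can be checked before launching a run. The sketch below is not CoalMiner's own validation. It assumes the .yml file has already been parsed into a plain dictionary, that MODEL_PARAMS is a top-level mapping, and that SAMPLE_SIZES should have one entry per population. The function name `check_user_params` is hypothetical, and the internal format of each distribution entry is deliberately left unchecked (see example_input_files/ for the real format).

```python
REQUIRED_KEYS = ("INPUT_PREFIX", "NUM_POPS", "SAMPLE_SIZES", "MODEL_PARAMS")
REQUIRED_MODEL_PARAMS = (
    "mutation_rate_dist",
    "effective_pop_size_dist",
    "migration_dist",
    "time_dist",
)


def check_user_params(params):
    """Return a list of problems found; an empty list means none were found."""
    problems = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in params]

    # Only look inside MODEL_PARAMS when it is present and is a mapping.
    model_params = params.get("MODEL_PARAMS")
    if isinstance(model_params, dict):
        for key in REQUIRED_MODEL_PARAMS:
            if key not in model_params:
                problems.append(f"missing MODEL_PARAMS entry: {key}")

    # Assumption: SAMPLE_SIZES lists one sample size per population.
    sizes = params.get("SAMPLE_SIZES")
    num_pops = params.get("NUM_POPS")
    if sizes is not None and num_pops is not None and len(sizes) != num_pops:
        problems.append(f"SAMPLE_SIZES has {len(sizes)} entries but NUM_POPS is {num_pops}")

    return problems


if __name__ == "__main__":
    # Illustrative values only; the distribution entries are left as empty placeholders.
    example = {
        "INPUT_PREFIX": "hom_sap",
        "NUM_POPS": 3,
        "SAMPLE_SIZES": [10, 10],
        "MODEL_PARAMS": {name: {} for name in REQUIRED_MODEL_PARAMS},
    }
    for problem in check_user_params(example):
        print(problem)
```

The optional keys (OUTPUT_DIR, NUM_RANDOM_MODELS, max_time_between_events) are not checked here because they have defaults or may be omitted.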
Example input .yml files can be found in the example_input_files/ directory.
After the input files have been created, run CoalMiner with the following command:
python3 [path_to_coalminer.py] [path_to_user_input_yml]
For example:
python3 /Users/foo/Projects/CoalMiner/coalminer.py /Users/foo/Projects/coalminer_input.yml
CoalMiner generates random .est and .tpl files and saves them in directories titled {prefix}_random_model_1, {prefix}_random_model_2, etc., in the output directory. It also copies the provided SFS files into the respective model directories.
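The output layout described above can be inspected after a run. This is a rough sketch, not part of CoalMiner. The helper names `expected_model_dirs` and `summarize_model_dir` are hypothetical, and because the exact file names inside each model directory are not specified here, the files are only counted by extension.

```python
from pathlib import Path


def expected_model_dirs(output_dir, prefix, num_models):
    """Build the paths {prefix}_random_model_1 .. {prefix}_random_model_N."""
    return [Path(output_dir) / f"{prefix}_random_model_{i}" for i in range(1, num_models + 1)]


def summarize_model_dir(model_dir):
    """Count the .est, .tpl, and .obs files directly inside one model directory."""
    model_dir = Path(model_dir)
    return {suffix: len(list(model_dir.glob(f"*{suffix}"))) for suffix in (".est", ".tpl", ".obs")}


if __name__ == "__main__":
    # Illustrative arguments: adjust the output directory, prefix, and count to match your run.
    for model_dir in expected_model_dirs("output", "hom_sap", 3):
        if model_dir.is_dir():
            print(model_dir.name, summarize_model_dir(model_dir))
        else:
            print(model_dir.name, "not found")
```

A directory that is reported as not found, or that shows a zero count for one of the extensions, is a quick sign that the run did not complete as expected.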
All example files can be found in the example_input_files directory. These files are used in the video tutorial. Run the following commands to see how the example files work (assuming you have navigated into the CoalMiner directory):
# 1. Copy the Homo sapiens example SFS files into the CoalMiner directory
cp example_input_files/hom_sap_joint*.obs .
# 2. Run coalminer
python3 coalminer.py example_input_files/hom_sap_3_pop_model.yml
Footnotes
1. fastsimcoal will only run with specific SFS suffix names. See the OBSERVED SFS FILE NAMES section of the fastsimcoal manual.