Estimating flu clade and mutation frequencies
branch | URL |
---|---|
release | https://flu-frequencies.vercel.app/ |
master | https://master-flu-frequencies.vercel.app/ |
This README is for the data analysis pipeline. For the web interface, see web/README.md.
Running the pipeline through the Nextstrain CLI is currently not working because polars is missing from the Nextstrain managed environments (conda/Docker).
To install the Nextstrain CLI on Linux:
curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/linux | bash
On macOS:
curl -fsSL --proto '=https' https://nextstrain.org/cli/installer/mac | bash
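If one setup script has to serve both platforms, the installer URL can be picked from the output of `uname -s`. The following is only a sketch: the helper name `installer_url` is hypothetical, and it covers just the two platforms listed above.

```shell
# Hypothetical helper: map a kernel name (as printed by `uname -s`)
# to the matching Nextstrain CLI installer URL from this README.
installer_url() {
  case "$1" in
    Linux)  echo "https://nextstrain.org/cli/installer/linux" ;;
    Darwin) echo "https://nextstrain.org/cli/installer/mac" ;;
    *)      echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}
```

The installer could then be fetched with `curl -fsSL --proto '=https' "$(installer_url "$(uname -s)")" | bash`.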
You can set up the Nextstrain CLI to use Docker or a Nextstrain managed conda environment (which is completely independent of any other conda environments you may have).
Using Docker:
nextstrain setup --set-default docker
Using managed conda environment:
nextstrain setup --set-default conda
Run analysis:
nextstrain build . --profile profiles/flu
Alternatively, create a dedicated conda environment from environment.yml:
mamba env create -f environment.yml
Activate the environment:
conda activate flu_frequencies
Run for flu using:
snakemake --profile profiles/flu
Run for SARS-CoV-2 using:
snakemake --profile profiles/SC2
Copy the snakemake workflow results to data_web/inputs, ensuring that the correct filenames are used, e.g.:
cp results/h3n2/continent-country-frequencies.csv data_web/inputs/flu-h3n2.csv
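When several result files need copying, a small helper can keep the filenames consistent. This is a minimal sketch, not part of the repository: the function name `copy_result` is hypothetical, and it assumes that other subtypes follow the same `results/<subtype>/` layout and `flu-<subtype>.csv` naming as the h3n2 example, which this README only confirms for h3n2.

```shell
# Hypothetical helper: copy one subtype's frequency table into data_web/inputs
# under the flu-<subtype>.csv name used in the h3n2 example.
# Paths are relative, so run it from the repository root.
copy_result() {
  subtype="$1"
  src="results/${subtype}/continent-country-frequencies.csv"
  dest="data_web/inputs/flu-${subtype}.csv"
  # Fail early instead of silently leaving an input file missing.
  if [ ! -f "$src" ]; then
    echo "missing workflow output: $src" >&2
    return 1
  fi
  mkdir -p "$(dirname "$dest")"
  cp "$src" "$dest"
}
```

Calling `copy_result h3n2` performs the same copy as the command above, but stops with an error if the workflow has not produced the file yet.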
Then process the CSV files into JSON:
python scripts/web_convert.py --input-pathogens-json data_web/inputs/pathogens.json --output-dir web/public/data
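After the conversion, it can be worth confirming that something was actually written. The sketch below assumes the converter produces files ending in `.json` somewhere under the output directory; the exact layout is not described here, so the search is recursive. The function name `count_json` is hypothetical.

```shell
# Hypothetical check: print the number of .json files under a directory
# (searched recursively) and return non-zero if there are none.
count_json() {
  dir="$1"
  # tr strips the padding that some wc implementations add around the count.
  n="$(find "$dir" -type f -name '*.json' 2>/dev/null | wc -l | tr -d ' ')"
  echo "$n"
  [ "$n" -gt 0 ]
}
```

For the command above, `count_json web/public/data` would print the file count and fail when the directory is empty or missing.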
- Provide mamba environment file for simpler setup
- Agree on formatters to use (snakefmt and black?)