flepimop.org documentation updates for information pertaining to new users #460

Merged
merged 11 commits into from
Feb 7, 2025
2 changes: 1 addition & 1 deletion batch/hpc_init.sh
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,7 @@ read FLEPI_RUN_INDEX
cat << EOM
> The HPC init script has successfully finished.
If you are testing if this worked, say installing for the first time, you can use the inference example from the \`flepimop_sample\` repository:
If you are testing if this worked, say installing for the first time, you can use the inference example from the \`flepiMoP/examples/tutorials\` directory:
\`\`\`bash
cd \$PROJECT_PATH
flepimop-inference-main -c \$CONFIG_PATH -j 1 -n 1 -k 1
Expand Down
2 changes: 1 addition & 1 deletion documentation/gitbook/gempyor/output-files.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ These files contain the values of the variables for both the infection and (if i

Within the `model_output` directory in the project's directory, the files will be organized into folders named for the file types: `seir`, `spar`, `snpi`, `hpar`, `hnpi`, `seed`, `init`, or `llik` (see descriptions below). Within each file type folder, files will further be organized by the simulation name (`setup_name` in config), the modifier scenario names - if scenarios exist for either `seir` or `outcome` parameters (specified with `seir_modifiers::scenarios` and `outcome_modifiers::scenarios` in config), and the `run_id` (the date and time of the simulation, by default). For example:

<pre><code><strong>flepimop_sample
<pre><code><strong>flepiMoP/examples/tutorials
</strong>├── model_output
│   ├── {setup_name}_{seir_modifier_scenario}_{outcome_modifier_scenario}
│   │   └── run_id
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,28 +6,23 @@ description: Short tutorial on running locally using an "Anaconda" environment.

### Access model files

As is the case for any run, first see the [Before any run](../before-any-run.md) section to ensure you have access to the correct files needed to run. On your local machine, determine the file paths to:
Follow all the steps in the [Before any run](before-any-run.md) section to ensure you have access to the correct files needed to run your model with flepiMoP.

* the directory containing the flepimop code (likely the folder you cloned from Github), which we'll call `FLEPI_PATH`
* the directory containing your project code including input configuration file and population structure (again likely from Github), which we'll call `DATA_PATH`
Take note of the location of the directory on your local computer where you cloned the flepiMoP model code (which we'll call `FLEPI_PATH`).

{% hint style="info" %}
For example, if you clone your Github repositories into a local folder called Github and are using the flepimop\_sample as a project repository, your directory names could be\
For example, if you cloned your Github repositories into a local folder called `Github` and are using `flepiMoP/examples/tutorials` as a project repository, your directory names could be\
\
_**On Mac:**_

\<dir1> = /Users/YourName/Github/flepiMoP
/Users/YourName/Github/flepiMoP

\<dir2> = /Users/YourName/Github/flepimop\_sample\
/Users/YourName/Github/flepiMoP/examples/tutorials
\
_**On Windows:**_\
\<dir1> = C:\Users\YourName\Github\flepiMoP
C:\Users\YourName\Github\flepiMoP

\<dir2> = C:\Users\YourName\Github\flepimop\_sample\\

(hint: if you navigate to a directory like `C:\Users\YourName\Github` using `cd C:\Users\YourName\Github`, modify the above `<dir1>` paths to be `.\flepiMoP` and `.\flepimop_sample)`

:warning: Note again that these are best cloned **flat.**
C:\Users\YourName\Github\flepiMoP\examples\tutorials
{% endhint %}

## 🧱 Setup (do this once)
Expand Down Expand Up @@ -80,62 +75,45 @@ In this `conda` environment, commands with R and python will uses this environme

### Define environment variables

First, you'll need to fill in some variables that are used by the model. This can be done in a script (an example is provided at the end of this page). For your first time, it's better to run each command individually to be sure it exits successfully.
Since you'll be navigating frequently between the folder that contains your project code and the folder that contains the core flepiMoP model code, it's helpful to define shortcuts for these file paths. You can do this by creating environmental variables that you can then quickly call instead of writing out the whole file path.

First, in `myparentfolder` populate the folder name variables for the paths to the flepimop code folder and the project folder:
If you're on a **Mac** or a Linux/Unix-based operating system, define the `FLEPI_PATH` and `PROJECT_PATH` environmental variables to be your directory locations, for example:

```bash
export FLEPI_PATH=$(pwd)/flepiMoP
export DATA_PATH=$(pwd)/flepimop_sample
export FLEPI_PATH=/Users/YourName/Github/flepiMoP
export PROJECT_PATH=/Users/YourName/Github/flepiMoP/examples/tutorials
```

Go into the code directory (making sure it is up to date on your favorite branch) and do the installation required of the repository:
or, if you have already navigated to your flepiMoP directory

```bash
cd $FLEPI_PATH # move to the flepimop directory
Rscript build/local_install.R # Install R packages
pip install --no-deps -e flepimop/gempyor_pkg/ # Install Python package gempyor
export FLEPI_PATH=$(pwd)
export PROJECT_PATH=$(pwd)/examples/tutorials
```

Each installation step may take a few minutes to run.

{% hint style="info" %}
Note: These installations take place in your conda environment and not the local operating system. They must be made once while in your environment and need not be done for every time you run a model, provided they have been installed once. You will need an active internet connection for installing the R packages (since some are hosted online), but not for other steps of running the model.
{% endhint %}

<details>

<summary>Help! I have errors in installation</summary>

If you get an error because no cran mirror is selected, just create in your home directory a `.Rprofile` file:
You can check that the variables have been set by either typing `env` to see all defined environmental variables, or typing `echo $FLEPI_PATH` to see the value of `FLEPI_PATH`.
Review comment (Contributor): probably shouldn't suggest that people run `env`; it is going to spam them with confusing output.


{% code title="~/.Rprofile" lineNumbers="true" %}
```r
local({r <- getOption("repos")
r["CRAN"] <- "http://cran.r-project.org"
options(repos=r)
})
```
{% endcode %}
If you're on a **Windows** machine

Perhaps this should be added to the top of the local\_install.R script #todo
<pre class="language-bash"><code class="lang-bash"><strong>set FLEPI_PATH=C:\Users\YourName\Github\flepiMoP
</strong>set PROJECT_PATH=C:\Users\YourName\Github\flepiMoP\examples\tutorials
</code></pre>

When running `local_install.R` the first time, you may get an error:
or, if you have already navigated to your flepiMoP directory

<pre><code><strong>ERROR: dependency ‘report.generation’ is not available for package ‘inference’
</strong><strong>[...]
</strong><strong>installation of package ‘./R/pkgs//inference’ had non-zero exit status
</strong></code></pre>
<pre class="language-bash"><code class="lang-bash"><strong>set FLEPI_PATH=%CD%
</strong>set PROJECT_PATH=%CD%\examples\tutorials
</code></pre>

and the second time it'll finish successfully (no non-zero exit status at the end). That's because there is a circular dependency in this file (inference requires report.generation which is built after) and will hopefully get fixed.
You can check that the variables have been set by either typing `set` to see all defined environmental variables, or typing `echo %FLEPI_PATH%` to see the value of `FLEPI_PATH`.

For subsequent runs, once is enough because the package is already installed once.

</details>
{% hint style="info" %}
If you choose not to define environment variables, remember to use the full or relative path names for navigating to the right files or folders in future steps.
{% endhint %}
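
As a quick sanity check on Mac or Linux, the two variables can be tested before moving on. This is a minimal sketch, assuming a POSIX-style shell; `check_dir_var` is a hypothetical helper written for this example and is not part of flepiMoP.

```bash
# Hypothetical helper (not part of flepiMoP): report whether the named
# environment variable is set and points to an existing directory.
check_dir_var() {
    name="$1"
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\"\${$name:-}\""
    if [ -z "$value" ]; then
        echo "$name is not set"
        return 1
    elif [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value"
        return 1
    fi
    echo "$name=$value"
}

# Report on both variables; keep going even if the first check fails.
for var in FLEPI_PATH PROJECT_PATH; do
    check_dir_var "$var" || true
done
```

If either line reports a problem, re-run the matching `export` command above in the same terminal session.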

Other environmental variables can be set at any point in process of setting up your model run. These options are listed in ... ADD ENVAR PAGE
Other environmental variables can be set at any point in the process of setting up your model run. These options are listed in ... **ADD ENVAR PAGE**

Review comment (Contributor): I don't mind a live TODO, but it should be in a comment, rather than actively displayed. Probably a better version is to convert this to an item on an explicit issue, e.g. the new environmental variables one. Basically an explicit note "in file XYZ, section ABC, ..."


For example, some frequently used environmental variables which we recommend setting are:
For example, some frequently used environmental variables we recommend setting are:

{% code overflow="wrap" %}
```bash
Expand All @@ -153,19 +131,19 @@ The next step depends on what sort of simulation you want to run: One that inclu
In either case, navigate to the project folder and make sure to delete any old model output files that are there.

```bash
cd $DATA_PATH # goes to your project repository
cd $PROJECT_PATH # goes to your project repository
rm -r model_output/ # delete the outputs of past run if there are
```
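
Because `rm -r` acts on whatever directory you happen to be in, a guarded version of the cleanup step can be safer. The following is only a sketch under that assumption; `clean_model_output` is a hypothetical helper (not a flepiMoP command), demonstrated here on a throwaway temporary directory rather than a real project.

```bash
# Hypothetical helper (not part of flepiMoP): remove model_output/ only
# when the given project directory actually exists.
clean_model_output() {
    project_dir="${1:-}"
    if [ -z "$project_dir" ] || [ ! -d "$project_dir" ]; then
        echo "Refusing to clean: '$project_dir' is not an existing directory" >&2
        return 1
    fi
    rm -rf -- "$project_dir/model_output"
}

# Demonstration on a temporary directory standing in for a project folder.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/model_output/seir"
clean_model_output "$demo_dir"
ls -A "$demo_dir"   # model_output/ is gone, so nothing is listed
rmdir "$demo_dir"
```

In a real session the call would be `clean_model_output "$PROJECT_PATH"`, which does nothing destructive if the variable is unset or mistyped.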

#### Inference run

An inference run requires a configuration file that has an `inference` section. Stay in the `$DATA_PATH` folder, and run the inference script, providing the name of the configuration file you want to run (ex. `config.yml`). In the example data folder (flepimop\_sample), try out the example config XXX.
An inference run requires a configuration file that has an `inference` section. Stay in the `$PROJECT_PATH` folder, and run the inference script, providing the name of the configuration file you want to run (ex. `config.yml`).

```bash
flepimop-inference-main.R -c config.yml
```

This will run the model and create [a lot of output files](../../gempyor/output-files.md) in `$DATA_PATH/model_output/`.
This will run the model and create [a lot of output files](../../gempyor/output-files.md) in `$PROJECT_PATH/model_output/`.

The last few lines visible on the command prompt should be:

Expand All @@ -191,7 +169,7 @@ where:

#### Non-inference run

Stay in the `$DATA_PATH` folder, and run a simulation directly from forward-simulation Python package `gempyor`. To do this, call `flepimop simulate` providing the name of the configuration file you want to run (ex. `config.yml`). An example config is provided in `flepimop_sample/config_sample_2pop_interventions.yml.`
Stay in the `$PROJECT_PATH` folder, and run a simulation directly from the forward-simulation Python package `gempyor`. To do this, call `flepimop simulate`, providing the name of the configuration file you want to run (e.g. `config.yml`). An example config is provided in `$PROJECT_PATH/config_sample_2pop_interventions.yml`.

```
flepimop simulate config.yml
Expand All @@ -203,23 +181,4 @@ It is currently required that all configuration files have an `interventions` se

You can also try to knit the Rmd file in `flepiMoP/flepimop/gempyor_pkg/docs` which will show you how to analyze these files.

### Do it all with a script

The following script does all the above commands in an easy script. Save it in `myparentfolder` as `quick_setup.sh`. Then, just go to `myparentfolder` and type `source quick_setup_flu.sh` and it'll do everything for you!

{% code title="quick_setup_flu.sh" lineNumbers="true" %}
```bash
export FLEPI_PATH=$(pwd)/flepiMoP
export DATA_PATH=$(pwd)/flepimop_sample

cd $FLEPI_PATH
Rscript build/local_install.R
pip install --no-deps -e gempyor_pkg/ # before: python setup.py develop --no-deps

cd $DATA_PATH
rm -rf model_output
export CONFIG_PATH=config.yml # set your configuration file path

flepimop-inference-main -j 1 -n 1 -k 1
```
{% endcode %}
Original file line number Diff line number Diff line change
Expand Up @@ -56,8 +56,7 @@ $ ./flepiMoP/build/hpc_install_or_update.sh <cluster-name>

These steps to initialize the environment need to run on a per run or as needed basis.

Change directory to where a full clone of the `flepiMoP` repository was placed (it will state the location in the output of the script above). And then run the `hpc_init.sh` script, substituting `<cluster-name>` with either `rockfish` or `longleaf`. This script will assume the same defaults as the script before for where the `flepiMoP` clone is and the name of the conda environment. This script will also ask about a project directory and config, if this is your first time initializing `flepiMoP` it might be helpful to clone [the `flepimop_sample` GitHub repository](https://github.com/HopkinsIDD/flepimop\_sample) to the same directory to use as a test.

Change directory to where a full clone of the `flepiMoP` repository was placed (the output of the script above states the location). Then run the `hpc_init.sh` script, substituting `<cluster-name>` with either `rockfish` or `longleaf`. This script assumes the same defaults as the previous script for the location of the `flepiMoP` clone and the name of the conda environment. It will also ask about a project directory and config. If this is your first time initializing `flepiMoP`, it might be helpful to use the configs in the `flepiMoP/examples/tutorials` directory as a test.
```
$ source batch/hpc_init.sh <cluster-name>
```
Expand All @@ -82,7 +81,7 @@ If you'd like to have more control, you can specify the arguments manually:
$ python $FLEPI_PATH/batch/inference_job_launcher.py --slurm \
-c $CONFIG_PATH \
-p $FLEPI_PATH \
--data-path $DATA_PATH \
--data-path $PROJECT_PATH \
--upload-to-s3 True \
--id $FLEPI_RUN_INDEX \
--fs-folder /scratch4/primary-user/flepimop-runs \
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ description: >-
See the [Before any run](../before-any-run.md) section to ensure you have access to the correct files needed to run. On your local machine, determine the file paths to:

* the directory containing the flepimop code (likely the folder you cloned from Github), which we'll call `<dir1>`
* the directory containing your project code including input configuration file and population structure (again likely from Github), which we'll call `<dir2>`
* the directory containing your project code including input configuration file and population structure, which we'll call `<dir2>`

{% hint style="info" %}
For example, if you clone your Github repositories into a local folder called Github and are using the flepimop\_sample as a project repository, your directory names could be\
Expand All @@ -20,16 +20,12 @@ _**On Mac:** ;

\<dir1> = /Users/YourName/Github/flepiMoP

\<dir2> = /Users/YourName/Github/flepimop\_sample\
\<dir2> = /Users/YourName/Github/flepiMoP/examples/tutorials
\
_**On Windows:**_ \
\<dir1> = C:\Users\YourName\Github\flepiMoP

\<dir2> = C:\Users\YourName\Github\flepimop\_sample\


(hint: if you navigate to a directory like `C:\Users\YourName\Github` using `cd C:\Users\YourName\Github`, modify the above `<dir1>` paths to be `.\flepiMoP` and `.\flepimop_sample)`\

\<dir2> = C:\Users\YourName\Github\flepiMoP\examples\tutorials

Note that Docker file and directory names are case sensitive
{% endhint %}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ The meanings of the columns are:
For inference runs, `...` _flepiMoP_ produces one file per parallel slot, for both global and chimeric outputs...

```
flepimop_sample
flepiMoP/examples/tutorials
├── model_output
│   ├── seir
│   ├── spar
Expand Down
4 changes: 2 additions & 2 deletions examples/tutorials/README.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
# flepimop_sample
# flepiMoP/examples/tutorials

This repository mirrors the contents in the **examples/tutorial** folder of the FlepiMoP repository ([link](https://github.com/HopkinsIDD/flepiMoP/tree/main/examples/tutorials)). It can be used to try out running simple projects using `flepimop` code and as a template for new projects.
This directory can be used to try out running simple projects using `flepimop` code and as a template for new projects. It mirrors the now-deprecated example repository `flepimop_sample`.
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ options(readr.num_columns = 0)

# There are multiple ways to specify options when flepimop-inference-main is run, which take the following precedence:
# 1) (optional) options called along with the script at the command line (ie > flepimop-inference-main -c my_config.yml)
# 2) (optional) environmental variables set by the user (ie user could set > export CONFIG_PATH = ~/flepimop_sample/my_config.yml to not have t specify it each time the script is run)
# 2) (optional) environmental variables set by the user (ie user could set > export CONFIG_PATH=~/examples/tutorials/my_config.yml to not have to specify it each time the script is run)
# If neither are specified, then a default value is used, given by the second argument of Sys.getenv() commands below.
# *3) For some options, a default doesn't exist, and the value specified in the config will be used if the option is not specified at the command line or by an environmental variable (iterations_per_slot, slots)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ required_packages <- c("dplyr", "magrittr", "xts", "zoo", "stringr")

# There are multiple ways to specify options when flepimop-inference-slot is run, which take the following precedence:
# 1) (optional) options called along with the script at the command line (ie > flepimop-inference-slot -c my_config.yml)
# 2) (optional) environmental variables set by the user (ie user could set > export CONFIG_PATH = ~/flepimop_sample/my_config.yml to not have t specify it each time the script is run)
# 2) (optional) environmental variables set by the user (ie user could set > export CONFIG_PATH=~/examples/tutorials/my_config.yml to not have to specify it each time the script is run)
# If neither are specified, then a default value is used, given by the second argument of Sys.getenv() commands below.
# *3) For some options, a default doesn't exist, and the value specified in the config will be used if the option is not specified at the command line or by an environmental variable (iterations_per_slot, slots)

Expand Down