Feature/sbachmei/mic 5229 update concepts (#231)
* Add blurb about specify draws and seeds
* add table of cli options; minor other updates
stevebachmeier committed Sep 3, 2024
1 parent f8b1bad commit 2899ab8
Showing 5 changed files with 103 additions and 36 deletions.
5 changes: 5 additions & 0 deletions CHANGELOG.rst
@@ -1,3 +1,8 @@
**2.0.2 - 08/30/24**

- Strengthen the API documentation
- Update existing documentation to include new psimulate options

**2.0.1 - 08/21/24**

- Use script to install dependencies in CI
24 changes: 23 additions & 1 deletion docs/source/branch.rst
@@ -126,10 +126,32 @@ An example may make this clearer, so consider the following model specification.
It combines the two configuration keys we just learned about. Taken separately, the ``input_draw_count`` mapping would
lead to 100 simulations on 100 draws of input data while the ``random_seed_count`` mapping would lead to ten
simulations on with identical input data but a different seed for the random number generation. With both specified,
simulations with identical input data but a different seed for the random number generation. With both specified,
the result is 1,000 total simulations, one for each member of the Cartesian product of those sets. That is,
we would run ten simulations with the ten random seeds for each of the 100 input data draws.
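
The count above can be checked with a short Python sketch. The draw and seed values here are hypothetical placeholders (Vivarium picks the real ones unless they are specified), and this is only an illustration of the Cartesian product, not the code ``psimulate`` uses to build its job list.

```python
from itertools import product

# Hypothetical draw and seed identifiers, for illustration only; the actual
# values are chosen by Vivarium unless they are specified explicitly.
input_draws = list(range(100))  # corresponds to input_draw_count: 100
random_seeds = list(range(10))  # corresponds to random_seed_count: 10

# One simulation per (draw, seed) pair in the Cartesian product.
simulations = list(product(input_draws, random_seeds))

print(len(simulations))
```

Each element of ``simulations`` is one ``(input_draw, random_seed)`` pair, so every draw appears once with each of the ten seeds.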

Specifying Specific Draws and Seeds
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
By default, Vivarium chooses draws and seeds randomly. However, you can specify the draws and/or seeds you want to
use by providing a list of integers. For example, to run a simulation using input draws 4 and 8 and random seeds 15, 16,
23, and 42, you can use the following branch configuration:

.. code-block:: yaml
   :caption: specific_draws_and_seeds.yaml

   input_draw_count: 2
   random_seed_count: 4
   input_draws: [4, 8]
   random_seeds: [15, 16, 23, 42]

It is valid to specify both ``input_draws`` and ``random_seeds`` (as shown above) or only one of them.

.. note::

   The length of ``input_draws``, if provided, must match the value of ``input_draw_count``. Similarly, the length of
   ``random_seeds``, if provided, must match the value of ``random_seed_count``.
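
The length rule can be expressed as a small pre-flight check. This is a minimal sketch, not the validation that ``vivarium-cluster-tools`` performs: the function name ``check_branch_counts`` is hypothetical, the branch configuration is assumed to be already loaded into a plain dictionary, and treating a missing count key as a mismatch is an assumption made here for simplicity.

```python
def check_branch_counts(branch: dict) -> None:
    """Raise ValueError if an explicit list disagrees with its count key.

    Hypothetical helper for illustration; not part of vivarium-cluster-tools.
    """
    pairs = [
        ("input_draws", "input_draw_count"),
        ("random_seeds", "random_seed_count"),
    ]
    for list_key, count_key in pairs:
        values = branch.get(list_key)
        if values is None:
            # The explicit list is optional, so there is nothing to check.
            continue
        # Assumption: a missing count key is treated as a mismatch.
        if len(values) != branch.get(count_key):
            raise ValueError(
                f"{list_key} has {len(values)} entries but "
                f"{count_key} is {branch.get(count_key)}"
            )


# The configuration from specific_draws_and_seeds.yaml passes the check.
check_branch_counts(
    {
        "input_draw_count": 2,
        "random_seed_count": 4,
        "input_draws": [4, 8],
        "random_seeds": [15, 16, 23, 42],
    }
)
```

A configuration such as ``input_draw_count: 3`` with ``input_draws: [4, 8]`` would raise ``ValueError`` in this sketch.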

Configuration Variations
------------------------

96 changes: 69 additions & 27 deletions docs/source/distributed_runner.rst
@@ -6,54 +6,96 @@ Running simulations in parallel
===============================

Once you successfully create a simulation specification and branch file it is
time to use the distributed runner. Recall, we ran a single simulation with
a model specification file in this way
time to use the distributed runner. Recall that we run a single simulation with
a model specification file in this way:

.. code-block:: console
simulate run /path/to/your/model/specification
simulate run <PATH-TO-MODEL-SPECIFICATION-YAML>
Very similar to this, ``vivarium-cluster-tools`` includes a command for simulating in parallel
Very similar to this single-simulation ``simulate`` command, ``vivarium-cluster-tools`` includes a
``psimulate`` command for running multiple simulations in parallel:

.. code-block:: console
psimulate run /path/to/your/model/specification /path/to/your/branch
psimulate run <PATH-TO-MODEL-SPECIFICATION-YAML> <PATH-TO-BRANCH-SPECIFICATION-YAML>
By default, output will be saved in ``/mnt/team/simulation_science/costeffectiveness/results``. If you want to save the
results somewhere else you can specify your output directory as an optional argument
In addition to the model specification and branch file paths, you *must* provide an
output directory with the ``-o`` flag and the cluster project to run under with the ``-P`` flag.

.. code-block:: console
psimulate run /path/to/your/model/specification /path/to/your/branch -o /path/to/output
Another optional argument is the cluster project under which to run the simulations. By default, the cluster project
used is ``proj_cost_effect``. To use a different project, specify it with the ``-P`` flag

.. code-block:: console
psimulate run /path/to/your/model/specification /path/to/your/branch -P proj_csu
Currently, the projects that simulation science has access to are ``proj_cost_effect``, ``proj_cost_effect_diarrhea``,
``proj_cost_effect_dcpn``, ``proj_cost_effect_conic``, and ``proj_csu``. Only these projects may be used.

If your ``psimulate run`` has failed to complete you can restart the failed jobs by specifying which output directory
includes the partially completed jobs using ``restart``
psimulate run <PATH-TO-MODEL-SPECIFICATION-YAML> <PATH-TO-BRANCH-SPECIFICATION-YAML> -o <PATH-TO-OUTPUT-DIRECTORY> -P <PROJECT>
``psimulate run`` also provides various *optional* flags which you can use to configure options for the run. These are:

.. list-table:: **Available** ``psimulate run`` **options**
   :header-rows: 1
   :widths: 30, 40

   * - Option
     - Description
   * - | **-\-artifact_path** or **-i**
     - | The path to a directory containing the artifact data file that the
       | model requires. This is only required if the model specification
       | file does not contain the artifact path or you want to override it.
   * - | **-\-pdb**
     - | If an error occurs, drop into the python debugger.
   * - | **-\-verbose** or **-v**
     - | Report each time step as it occurs during the run.
   * - | **-\-backup-freq**
     - | The frequency with which to save a backup of the simulation state to disk.
   * - | **-\-no-batch**
     - | Do not write results in batches; write them as they come in.
   * - | **-\-redis**
     - | Number of redis databases to use.
   * - | **-\-max-workers** or **-w**
     - | The maximum number of workers to run concurrently.
   * - | **-\-hardware** or **-h**
     - | A comma-separated list of the specific cluster hardware to run on.
       | Refer to the --help for currently-supported options.
   * - | **-\-peak-memory** or **-m**
     - | The maximum amount of memory to request per worker (in GB).
   * - | **-\-max-runtime** or **-r**
     - | The maximum amount of time to request per worker (hh:mm:ss). Note that
       | the session you are launching the ``psimulate run`` from must also
       | be able to live at least as long as this value (and this does not account
       | for the time jobs may spend in PENDING).
   * - | **-\-queue** or **-q**
     - | The queue to submit jobs to.
   * - | **-\-help**
     - | Print a help message and exit.

You can see a description of any of the available commands by using the **-\-help** flag, e.g. ``psimulate --help``
or ``psimulate run --help``.

Restarting a Simulation
-----------------------

If your ``psimulate run`` has jobs that failed to complete, you can restart them using ``psimulate restart``.
You must specify the results directory that includes the partially completed jobs as well as the project
you want to use for the restart.

.. code-block:: console
psimulate restart /path/to/the/previous/results/
psimulate restart <PATH-TO-PREVIOUS-RESULTS-DIRECTORY> -P <PROJECT>
For ``psimulate restart`` you can also choose a project with optional flag ``-P``.
Many of the same optional flags exist for ``psimulate restart`` as for ``psimulate run``. You can see a description of
these by running ``psimulate restart --help``.

Expanding a Simulation
----------------------

If you wish to expand a previous ``psimulate run`` by adding additional input draws and/or random seeds, you can do so
using ``expand``.
If you wish to expand an existing simulation by running new simulations with additional input draws and/or random seeds,
you can do so using ``psimulate expand``. Just like for ``psimulate restart``, you must specify the results directory
that includes the results that you'd like to expand as well as a project. Further, you must specify the number of
additional draws and/or seeds you'd like to add to the simulation.

.. code-block:: console
psimulate expand /path/to/the/previous/results/ --add-draws 10 --add-seeds 5
psimulate expand <PATH-TO-PREVIOUS-RESULTS-DIRECTORY> -P <PROJECT> --add-draws 10 --add-seeds 5
You can use one or both of ``--add-draws`` and ``--add-seeds`` to expand your simulation. Any previous results will not
be overwritten, but any additional simulations resulting from the new input draws and/or random seeds will be run.
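
To see which simulations an expansion adds, the following Python sketch compares the old and new sets of (draw, seed) pairs. It assumes the expanded run covers the full Cartesian product of the enlarged draw and seed sets; the function name ``new_simulations`` and all draw and seed values are hypothetical, and this is not how ``psimulate expand`` is implemented.

```python
from itertools import product


def new_simulations(old_draws, old_seeds, added_draws, added_seeds):
    """Return the (draw, seed) pairs that were not part of the previous run.

    Hypothetical helper for illustration; not part of vivarium-cluster-tools.
    """
    previous = set(product(old_draws, old_seeds))
    expanded = product(old_draws + added_draws, old_seeds + added_seeds)
    # Pairs that already have results are skipped, so nothing is overwritten.
    return [pair for pair in expanded if pair not in previous]


# Previous run: 2 draws x 4 seeds = 8 simulations.
# Expansion with one extra draw and one extra seed (values are made up).
to_run = new_simulations([4, 8], [15, 16, 23, 42], [0], [7])
print(len(to_run))
```

Under this assumption, adding a seed creates new simulations for every existing draw as well as for the new draws, which is why the number of added simulations can exceed the product of ``--add-draws`` and ``--add-seeds``.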

``psimulate expand`` also supports choosing a project via the option flag ``-P``.
As before, use ``psimulate expand --help`` to see a description of the available options.
8 changes: 4 additions & 4 deletions src/vivarium_cluster_tools/psimulate/cluster/cli_options.py
@@ -73,8 +73,8 @@ def with_queue_and_max_runtime(func):
default=PEAK_MEMORY_DEFAULT,
show_default=True,
help=(
"The estimated maximum memory usage in GB of an individual simulate job. "
"The simulations will be run with this as a limit."
"The memory request in GB of each individual simulation job. "
"The simulations will be killed if they exceed this limit."
),
)

@@ -165,10 +165,10 @@ def _validate_runtime_and_queue(runtime_string: str, queue: str):
default=MAX_RUNTIME_DEFAULT,
show_default=True,
help=(
f"The estimated maximum runtime ({_RUNTIME_FORMAT}) of the simulation jobs. "
f"The runtime request ({_RUNTIME_FORMAT}) of each individual simulation job. "
"The maximum supported runtime is 3 days. Keep in mind that the "
"session you are launching from must be able to live at least as long "
"as the simulation jobs, and that runtimes by node vary wildly."
"as the simulation jobs and that runtimes by node vary wildly."
),
callback=_queue_and_runtime_callback,
)
6 changes: 2 additions & 4 deletions src/vivarium_cluster_tools/psimulate/redis_dbs/cli_options.py
@@ -6,6 +6,7 @@
Command line options for configuring job/result queue Redis DBs in psimulate runs.
"""

import click

from vivarium_cluster_tools.psimulate.redis_dbs.launcher import (
@@ -29,8 +30,5 @@
type=click.IntRange(min=1),
default=8000,
show_default=True,
help=(
"The maximum number of workers (and therefore jobs) to run "
"concurrently. Defaults to the total number of jobs."
),
help=("The maximum number of workers (and therefore jobs) to run concurrently."),
)
