
Commit

Merge pull request #16 from boettiger-lab/preprint-preparation
Preprint preparation
cboettig authored Dec 17, 2024
2 parents 3c4f98a + 61981df commit 7e1150b
Showing 10 changed files with 134 additions and 17 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -159,4 +159,6 @@ cython_debug/
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

.DS_Store

saved_agents/
113 changes: 96 additions & 17 deletions README.md
@@ -1,32 +1,111 @@
# rl4fisheries

Models:

- `asm_env.py`: provides `AsmEnv()`. This encodes our population dynamics model, coupled with an observation process, and a harvest process with a corresponding utility model. These processes can all be modified using the `config` argument. Their defaults are defined in `asm_fns.py`. By default, observations are stock biomass and mean weight.
- `asm_esc.py`: provides `AsmEscEnv()`, which inherits from `AsmEnv` and differs from it in one respect: actions in `AsmEscEnv()` represent escapement levels rather than fishing intensities.
- `ams_cr_like.py`: provides `AsmCRLike()`. In this environment, mean weight is observed and the action is to set parameters `(x1, x2, y2)` for a biomass-based harvest control rule of the type `CautionaryRule` (specified below).

Strategies evaluated with Bayesian Optimization:

- `agents.cautionary_rule.CautionaryRule`: piece-wise linear harvest-control rule specified by three parameters `(x1, x2, y2)`. Example plot (TBD).
- `agents.msy.Msy`: constant mortality harvest control rule. Specified by one parameter `mortality`.
- `agents.const_esc.ConstEsc`: constant escapement harvest control rule. Specified by one parameter `escapement`.
RL and Bayesian optimization methodologies for harvest control rule optimization in fisheries.
Includes:
- A gymnasium environment for a Walleye population dynamics model
- Policy functions for different commonly-tested policies (including those in the paper)
- Scripts to optimize RL policies and non-RL policies
- Notebooks to reproduce paper figures
- Templates to train new RL policies on our Walleye environment

## Installation

Clone this repo, then:
To install this source code, you need to have git, Python, and pip installed.
To quickly check whether these are installed, open a terminal and run the following commands:
```bash
git version
pip --version
python -V
```
If any of these commands is not recognized by the terminal, refer to
[here](https://github.com/git-guides/install-git)
for git installation instructions,
[here](https://realpython.com/installing-python/)
for Python installation instructions, and/or
[here](https://pip.pypa.io/en/stable/installation/)
for pip installation instructions.

To install this source code, run
```bash
git clone https://github.com/boettiger-lab/rl4fisheries.git
cd rl4fisheries
pip install .
pip install -e .
```
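
As a quick check that the installation worked, the following minimal sketch runs a short random rollout of the Walleye environment (it assumes `AsmEnv` is importable from `rl4fisheries.envs.asm_env`; adjust the import if the package exposes it elsewhere):
```python
# Minimal smoke test: a short random rollout of the Walleye environment.
# The import path below is an assumption -- adjust it if AsmEnv is exposed elsewhere.
from rl4fisheries.envs.asm_env import AsmEnv

env = AsmEnv()                       # default config: observe biomass and mean weight
obs, info = env.reset(seed=0)

total_reward = 0.0
for _ in range(10):
    action = env.action_space.sample()               # random fishing intensity
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    if terminated or truncated:
        break

print("cumulative utility over 10 random steps:", total_reward)
```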

## RL training:
## Optimized policies

The optimized policies presented in the paper---both RL policies and non-RL policies such as the precautionary policy---are saved in a public Hugging Face
[repository](https://huggingface.co/boettiger-lab/rl4eco/tree/main/sb3/rl4fisheries/results).
RL policies are saved as zip files named `PPO-AsmEnv-(...)-UMx-(...).zip` since the RL algorithm PPO was used to optimize them.
Here *UM* stands for *utility model* and `x=1, 2, or 3` designates which utility model the policy was optimized for.
Precautionary policies are named `cr-UMx.pkl` (CR stands for "cautionary rule", an acronym we used during the research phase of this collaboration).
Similarly, constant escapement policies are saved as `esc-UMx.pkl` and FMSY policies are saved as `msy-UMx.pkl`.
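
As a hedged sketch of how one of these saved policies could be loaded, the snippet below downloads a zip from the Hugging Face repo and loads it with Stable-Baselines3; the filename is a placeholder, not an actual file name:
```python
# Sketch: download a saved RL policy from the Hugging Face repo and load it with SB3.
# The filename below is a placeholder -- substitute one of the actual
# PPO-AsmEnv-(...)-UMx-(...).zip files listed in the results folder.
from huggingface_hub import hf_hub_download
from stable_baselines3 import PPO

path = hf_hub_download(
    repo_id="boettiger-lab/rl4eco",
    filename="sb3/rl4fisheries/results/PPO-AsmEnv-placeholder.zip",  # placeholder name
)
model = PPO.load(path)

# model.predict(obs) then maps an observation to a harvest action.
```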

## Reproducing paper figures

The Jupyter notebooks found at `rl4fisheries/notebooks/for_results` may be used to recreate the figures found in the paper.
Note that the data for the plots is re-generated each time a notebook is run, so, e.g., the time-series plots will differ from run to run.

Simply run
To reproduce these figures on your own machine you need to have Jupyter installed. Alternatively, you can navigate to
```https://github.com/boettiger-lab/rl4fisheries```
and click on `Code > Codespaces > Create codespace on main` to open the notebooks in a GitHub codespace.

## Optimizing RL policies

To optimize an RL policy from scratch, use the command
```bash
python scripts/train.py -f path/to/config/file.yml
```
The trained model is automatically pushed to Hugging Face (requires an HF token).
The config files used for our results are found in `hyperpars/for_results/`.
You can use the following template config file:
```bash
python scripts/train.py -f hyperpars/RL-template.yml
```

The config files we used for the policies in our paper are found at `hyperpars/for_results/`.
For example,
[this](https://github.com/boettiger-lab/rl4fisheries/blob/main/hyperpars/for_results/ppo_biomass_UM1.yml)
config file was used to train 1-Obs. RL in Scenario 1 (utility = total harvest).
The trained model is automatically pushed to Hugging Face if a Hugging Face token is provided.
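
For reference, a rough programmatic equivalent of such a training run is sketched below; it is illustrative only (not `scripts/train.py`), the hyperparameters are placeholders, and the `AsmEnv` import path is an assumption:
```python
# Illustrative sketch of training PPO on AsmEnv with Stable-Baselines3.
# Not scripts/train.py: hyperparameters and the import path are placeholders.
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

from rl4fisheries.envs.asm_env import AsmEnv  # assumed import path

vec_env = make_vec_env(AsmEnv, n_envs=4)      # parallel copies of the environment
model = PPO("MlpPolicy", vec_env, verbose=1)
model.learn(total_timesteps=100_000)          # the paper's runs use many more steps
model.save("ppo_asm_env_demo")
```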

## Source code structure

```
rl4fisheries
|
|-- hyperpars
| |
| |-- configuration yaml files
|
|-- notebooks
| |
| |-- Jupyter notebooks
|
|-- src/rl4fisheries
| |
| |-- agents
| | |
| | |-- interfaces for policies such as Precautionary Policy, FMSY, Constant Escapement
| |
| |-- envs
| | |
| | |-- Gymnasium environments used in our study.
| | (specifically, asm_env.py is used for our paper).
| |
| |-- utils
| |
| |-- ray.py: RL training within Ray framework (not used in paper)
| |
| |-- sb3.py: RL training within Stable Baselines framework (used in paper)
| |
| |-- simulation.py: helper functions to simulate the system dynamics using a policy
|
|-- tests
| |
| |-- continuous integration tests to ensure code quality in pull requests
|
|-- noxfile.py: file used to run continuous integration tests
|
|-- pyproject.toml: file used in the installation of this source code
|
|-- README.md
```
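
The policy-function agents listed under `agents` can also be rolled out directly against the environment. The sketch below is illustrative only: the constructor argument follows the one-parameter description of the FMSY rule earlier in this diff, and the SB3-style `predict(observation)` method is an assumption to verify against `src/rl4fisheries/agents`:
```python
# Illustrative rollout of AsmEnv under a fixed constant-mortality (FMSY-style) policy.
# Both the Msy(mortality=...) constructor and the predict() signature are assumptions;
# check the actual interfaces in src/rl4fisheries/agents before relying on this.
from rl4fisheries.envs.asm_env import AsmEnv
from rl4fisheries.agents.msy import Msy

env = AsmEnv()
agent = Msy(mortality=0.1)          # constant-mortality harvest control rule

obs, info = env.reset(seed=0)
rewards = []
for _ in range(100):
    action, _ = agent.predict(obs)  # assumed SB3-style signature
    obs, reward, terminated, truncated, info = env.step(action)
    rewards.append(reward)
    if terminated or truncated:
        break

print("episode utility:", sum(rewards))
```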
36 changes: 36 additions & 0 deletions hyperpars/RL-template.yml
@@ -0,0 +1,36 @@
algo: "PPO"
total_timesteps: 6000000
algo_config:
tensorboard_log: "../../../logs"
#
# use a feedforward neural net with three layers of 64, 32, and 16 neurons
policy: 'MlpPolicy'
use_sde: True
policy_kwargs: "dict(net_arch=[64, 32, 16])"
#
# you can add hyperparameter values here, e.g. by uncommenting the following row:
# learning_rate: 0.00015

# The environment simulating the population dynamics of Walleye
env_id: "AsmEnv"
config:
# configurations that specify the specifics of the environment:
#
# use one observation (vulnerable biomass)
observation_fn_id: 'observe_1o'
n_observs: 1
#
# use the "default" utility function:
harvest_fn_name: "default"
upow: 1

# helps paralellize training:
n_envs: 12

# save and upload models to hugging-face (needs hugging-face token)
repo: "boettiger-lab/rl4eco"
save_path: "../from-template/"
id: "from-template"

# misc, needed to use custom network structures (as in algo_config: policy_kwargs).
additional_imports: ["torch"]
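
To illustrate how fields like these map onto a Stable-Baselines3 run, here is a rough, hypothetical reader for such a config; it is not `scripts/train.py`, and the `AsmEnv` import path is an assumption:
```python
# Hypothetical reader for a config like hyperpars/RL-template.yml.
# Shows one way the yaml fields could map onto an SB3 PPO run; not the actual train script.
import yaml
import torch  # so that eval() of policy_kwargs could reference torch if a config needs it
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

from rl4fisheries.envs.asm_env import AsmEnv  # assumed import path

with open("hyperpars/RL-template.yml") as f:
    cfg = yaml.safe_load(f)

algo_cfg = dict(cfg.get("algo_config", {}))
if isinstance(algo_cfg.get("policy_kwargs"), str):
    # e.g. "dict(net_arch=[64, 32, 16])" -> {'net_arch': [64, 32, 16]}
    algo_cfg["policy_kwargs"] = eval(algo_cfg["policy_kwargs"])

vec_env = make_vec_env(
    AsmEnv,
    n_envs=cfg.get("n_envs", 1),
    env_kwargs={"config": cfg.get("config", {})},
)
policy = algo_cfg.pop("policy", "MlpPolicy")

model = PPO(policy, vec_env, **algo_cfg)
model.learn(total_timesteps=cfg["total_timesteps"])
```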
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
