Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First results #14

Merged
merged 29 commits into from
Jul 8, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ Models:
- `asm_esc.py`: provides `AsmEscEnv()` which inherits from `AsmEnv` and has one difference to it: actions in `AsmEscEnv()` represent escapement levels rather than fishing intensities.
- `ams_cr_like.py`: provides `AsmCRLike()`. In this environment, mean weight is observed and the action is to set parameters `(x1, x2, y2)` for a biomass-based harvest control rule of the type `CautionaryRule` (specified below).

Strategies evaluated with MSE:
Strategies evaluated with Bayesian Optimization:

- `agents.cautionary_rule.CautionaryRule`: piece-wise linear harvest-control rule specified by three parameters `(x1, x2, y2)`. Example plot (TBD).
- `agents.msy.Msy`: constant mortality harvest control rule. Specified by one parameter `mortality`.
Expand Down
42 changes: 22 additions & 20 deletions hyperpars/ppo-asm.yml
Original file line number Diff line number Diff line change
@@ -1,40 +1,42 @@
# algo
algo: "PPO"
total_timesteps: 4000000
total_timesteps: 40000000
algo_config:
tensorboard_log: "../../../logs"
#
policy: 'MlpPolicy'
batch_size: 512
gamma: 0.9999
learning_rate: !!float 7.77e-05
ent_coef: 0.00429
clip_range: 0.1
gae_lambda: 0.9
max_grad_norm: 5
vf_coef: 0.19
# learning_rate: 0.00015
policy_kwargs: "dict(net_arch=[256, 128, 128, 32])"
#
# batch_size: 512
# gamma: 0.9999
# learning_rate: !!float 7.77e-05
# ent_coef: 0.00429
# clip_range: 0.1
# gae_lambda: 0.9
# max_grad_norm: 5
# vf_coef: 0.19
# policy_kwargs: "dict(log_std_init=-3.29, ortho_init=False, net_arch=[256, 128])"
policy_kwargs: "dict(net_arch=[300, 200])"
# policy_kwargs: "dict(net_arch=[256, 128])"
use_sde: True
clip_range: 0.1
# clip_range: 0.1

# env
env_id: "AsmEnv"
config:
observation_fn_id: 'observe_1o'
n_observs: 1
harvest_fn_name: "trophy"
n_trophy_ages: 10
# upow: 0.6
# use_custom_harv_vul: True
# use_custom_surv_vul: True
observation_fn_id: 'observe_2o'
n_observs: 2
#
harvest_fn_name: "default"
# n_trophy_ages: 10
upow: 1
n_envs: 12

# io
repo: "cboettig/rl-ecology"
save_path: "../saved_agents/results/"
save_path: "../saved_agents/long_tests/"

# misc
id: "results-trophy-nage-10-1obs-hyperpars-larger-net"
id: "UM1-256-128-128-32-long"
# id: "short-test"
additional_imports: ["torch"]
43 changes: 30 additions & 13 deletions notebooks/for_results/0_tests.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 1,
"id": "86470282-d84b-49ff-a7e9-5647580b3d65",
"metadata": {},
"outputs": [],
Expand All @@ -26,7 +26,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 10,
"id": "2c293e58-45b2-41e1-881b-a85d9230aba0",
"metadata": {},
"outputs": [
Expand All @@ -47,45 +47,62 @@
" (policy_net): Sequential(\n",
" (0): Linear(in_features=2, out_features=64, bias=True)\n",
" (1): Tanh()\n",
" (2): Linear(in_features=64, out_features=64, bias=True)\n",
" (2): Linear(in_features=64, out_features=32, bias=True)\n",
" (3): Tanh()\n",
" (4): Linear(in_features=32, out_features=16, bias=True)\n",
" (5): Tanh()\n",
" )\n",
" (value_net): Sequential(\n",
" (0): Linear(in_features=2, out_features=64, bias=True)\n",
" (1): Tanh()\n",
" (2): Linear(in_features=64, out_features=64, bias=True)\n",
" (2): Linear(in_features=64, out_features=32, bias=True)\n",
" (3): Tanh()\n",
" (4): Linear(in_features=32, out_features=16, bias=True)\n",
" (5): Tanh()\n",
" )\n",
" )\n",
" (action_net): Linear(in_features=64, out_features=1, bias=True)\n",
" (value_net): Linear(in_features=64, out_features=1, bias=True)\n",
" (action_net): Linear(in_features=16, out_features=1, bias=True)\n",
" (value_net): Linear(in_features=16, out_features=1, bias=True)\n",
")"
]
},
"execution_count": 13,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('../../hyperpars/for_results/ppo_both_UM2.yml') as stream:\n",
" options = yaml.safe_load(stream)\n",
"\n",
"PPO(env=AsmEnv(config=options['config']),**options['algo_config']).policy"
"options['algo_config']['policy_kwargs'] = eval(options['algo_config']['policy_kwargs'])\n",
"PPO(env=AsmEnv(config=options['config']), **options['algo_config']).policy"
]
},
{
"cell_type": "markdown",
"id": "79d44b0b-25bf-4fe7-8ad1-c9e2ba181e54",
"cell_type": "code",
"execution_count": 11,
"id": "4e1695dd-212e-489c-9a0b-4459600e02b4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'net_arch': [64, 32, 16]}"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"They all seem to be of the form $\\{in\\}\\rightarrow 64\\rightarrow 64 \\rightarrow\\{out\\}$ by default."
"options['algo_config']['policy_kwargs']"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8fc31253-c12b-4777-abcb-2344477475ef",
"id": "6d79b258-1c1e-4b60-a276-97ddd2659634",
"metadata": {},
"outputs": [],
"source": []
Expand Down
218 changes: 49 additions & 169 deletions notebooks/for_results/1_fp_skopt.ipynb

Large diffs are not rendered by default.

Loading
Loading