[Idea] Implement success rate in logging #175

Ivan-267 · 2024-02-17T08:06:56Z

Proposal:

In addition to reward, an overview of the current success rate can be useful in many envs. It can be a very important metric in envs that have a clear goal (e.g. successfully landed for the 3DLander env, successfully parked for the 3DCarParking env, etc.).

It seems we could support this with SB3 by implementing:

https://stable-baselines3.readthedocs.io/en/master/common/logger.html#rollout

success_rate: Mean success rate during training (averaged over stats_window_size episodes, 100 by default), you must pass an extra argument to the Monitor wrapper to log that value (info_keywords=("is_success",)) and provide info["is_success"]=True/False on the final step of the episode
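
To make the quoted description concrete, here is a minimal sketch of how a rolling success rate over the last N episodes could be computed from the final-step info dict. This is not SB3's code: `SuccessRateTracker` and its method names are hypothetical, and only the `is_success` key and the window-size idea come from the documentation quoted above.

```python
from collections import deque


class SuccessRateTracker:
    """Rolling mean of info["is_success"] over the most recent episodes.

    Hypothetical helper for illustration; not part of SB3 or the plugin.
    """

    def __init__(self, window_size: int = 100) -> None:
        # Plays the role of stats_window_size: only the newest episodes count.
        self._results = deque(maxlen=window_size)

    def record_episode_end(self, info: dict) -> None:
        # Only episodes whose final-step info carries the flag are recorded.
        if "is_success" in info:
            self._results.append(bool(info["is_success"]))

    @property
    def success_rate(self):
        # None until at least one episode has reported a result.
        if not self._results:
            return None
        return sum(self._results) / len(self._results)


tracker = SuccessRateTracker(window_size=4)
for succeeded in (True, False, True, True):
    tracker.record_episode_end({"is_success": succeeded})
print(tracker.success_rate)  # 3 successes out of 4 episodes -> 0.75
```

Episodes that never set the flag are simply skipped here, which is one possible way to leave older envs out of the statistic.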

Needs to be considered:

It would be great if this can be added in a way that doesn't affect previous envs (e.g. they either always report true, always report false, or don't show this statistic).

Info sending/receiving: Some modifications would be needed to the plugin and Python env code to send / receive info, optimally preserving compatibility with older envs that don't send info. Once we enable info sending, we can later also set the truncated/terminated flags.

Usage / plugin side changes:

func end_episode(final_reward = 0, success = true):
	reward += final_reward
	done = true
	needs_reset = true
	episode_successful = success

(Just a potential usage example. The end_episode method is implemented in the env code, not the plugin, although we could consider simplifying the process with something like edbeeching/godot_rl_agents_plugin#20. However, that does break compatibility with existing envs.)

For compatibility, possibly the simplest way would be to always report episode success as true by default, unless set by the user.
Optionally, we could also add a boolean arg to the sb3 example script that sets the monitor to report this stat or not.
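
On the Python side, the two compatibility ideas above could look roughly like the sketch below. Both function names are hypothetical, and the assumption that older envs deliver no info at all (represented as `None`) is mine, not something the current message format guarantees. The tuple returned by `monitor_info_keywords` is meant to be passed as the `info_keywords` argument of SB3's Monitor wrapper.

```python
def normalize_info(raw_info, default_success: bool = True) -> dict:
    """Return an info dict that always contains an "is_success" key.

    raw_info is assumed to be None (or empty) for older envs that send no
    info; in that case the episode is reported as successful by default.
    """
    info = dict(raw_info) if raw_info else {}
    info.setdefault("is_success", default_success)
    # Coerce to a plain bool in case the env sent 0/1.
    info["is_success"] = bool(info["is_success"])
    return info


def monitor_info_keywords(log_success: bool) -> tuple:
    """Build the info_keywords tuple from a (hypothetical) script flag."""
    return ("is_success",) if log_success else ()


print(normalize_info(None))                   # older env, no info sent
print(normalize_info({"is_success": False}))  # env reports a failure
print(monitor_info_keywords(True))
```

With the flag off, the tuple is empty and the monitor would not look for the key at all, so envs that never report success are unaffected.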

GianiStatie · 2024-10-16T09:57:40Z

I can help implement this, since I also wanted to expose the info dictionary to Python, such that we can later populate it with more useful info.

If I understood the documentation correctly (the environment info dict must contain an is_success key to compute that value), we just need to add the "is_success" key to the Godot info. I think we can do this based on whether the environment ended before or after the X simulation steps you set in the environment.

Ivan-267 · 2024-10-16T12:47:33Z

Thanks, we can work on implementing this together. For now, a few things to consider are that some of these will be SB3 specific (the success flag), but we can highlight that with a comment somewhere in the plugin and make it optional, and other stuff like custom info should be framework agnostic (no issues there). There's an additional thing I wanted to implement at some point, and that's truncation. These are separate issues, but of course make sense to consider when we implement the changes.

For success: I think we should let the user decide the criteria for success. An episode that reaches the restart timer is not necessarily unsuccessful (e.g. one env's goal could be to collect as many items as possible before the env restarts, there's not necessarily a clear success here, but the user could define an arbitrary success threshold for tracking if wanted). What do you think?
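
As an illustration of a user-defined criterion, the sketch below treats an episode that ran until the restart timer as successful if enough items were collected. In practice this logic would live in the Godot env code; it is written in Python here only for illustration. The stat name `items_collected`, the `timed_out` key, and both function names are hypothetical.

```python
from typing import Callable


def make_threshold_criterion(min_items: int) -> Callable[[dict], bool]:
    """Return a success check: at least min_items collected in the episode."""

    def is_success(episode_stats: dict) -> bool:
        return episode_stats.get("items_collected", 0) >= min_items

    return is_success


def build_final_info(episode_stats: dict, timed_out: bool,
                     is_success: Callable[[dict], bool]) -> dict:
    """Build the final-step info; success is decided by the user's criterion,
    independently of whether the episode hit the restart timer."""
    return {"is_success": is_success(episode_stats), "timed_out": timed_out}


criterion = make_threshold_criterion(min_items=5)
# Both episodes ran until the restart timer; only the first met the threshold.
print(build_final_info({"items_collected": 7}, True, criterion))
print(build_final_info({"items_collected": 2}, True, criterion))
```

Keeping the timeout flag separate from the success flag also leaves room for reporting truncation later without changing how success is defined.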

GianiStatie · 2024-10-16T13:12:26Z

Sounds good. I'll start working on a draft sometime this week and I'll ping you so we can ping-pong some ideas.

Regarding the other idea:
When you say truncation, you mean terminating all agents once one of them is done?

GianiStatie · 2024-10-20T13:48:02Z

I've created 2 MRs for the first iteration:

on the plugin side: feat: adding info - is_success godot_rl_agents_plugin#46
on godot_rl_agents side: feat: adding info - is_success #208

I've tested it locally with a 2D environment I'm working on and so far so good

Ivan-267 added the enhancement New feature or request label Feb 27, 2024

This was referenced Oct 20, 2024

feat: adding info - is_success edbeeching/godot_rl_agents_plugin#46

Merged

feat: adding info - is_success #208

Merged

GianiStatie self-assigned this Oct 21, 2024

GianiStatie closed this as completed Oct 24, 2024
