[Idea] Implement success rate in logging #175
Comments
I can help implement this, since I also wanted to expose the info dictionary to Python, such that we can later populate it with more useful info. If I got it correctly from the documentation ( |
Thanks, we can work on implementing this together. For now, a few things to consider: some of these will be SB3-specific (the success flag), but we can highlight that with a comment somewhere in the plugin and make it optional, while other things like custom info should be framework-agnostic (no issues there). There's an additional thing I wanted to implement at some point, and that's truncation. These are separate issues, but they make sense to consider when we implement the changes. For success: I think we should let the user decide the criteria for success. An episode that reaches the restart timer is not necessarily unsuccessful (e.g. one env's goal could be to collect as many items as possible before the env restarts; there is no clear success condition there, but the user could define an arbitrary success threshold for tracking if wanted). What do you think?
Sounds good. I'll start working on a draft sometime this week and I'll ping you so we can ping-pong some ideas. Regarding the other idea: |
I've created 2 MRs for the first iteration:
I've tested it locally with a 2D environment I'm working on and, so far, so good.
Proposal:
In addition to reward, an overview of the current success rate can be useful in many envs. It can be a very important metric in envs that have a clear goal (e.g. successfully landed for the 3DLander env, successfully parked for the 3DCarParking env, etc.).
It seems we could support this with SB3 by implementing:
https://stable-baselines3.readthedocs.io/en/master/common/logger.html#rollout
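For context, the linked SB3 logger docs describe `rollout/success_rate` as the mean of `info["is_success"]` over recent episodes, reported on the final step of each episode and logged when the `Monitor` wrapper is given `info_keywords=("is_success",)`. The sketch below mimics that computation with the standard library only, to show what the statistic would mean for our envs. The class name `SuccessRateTracker` and its `window_size` argument are illustrative assumptions, not SB3 API.

```python
from collections import deque


class SuccessRateTracker:
    """Rolling mean of episode success flags (illustrative, not SB3 code)."""

    def __init__(self, window_size: int = 100):
        # Only the most recent `window_size` finished episodes are kept.
        self._flags = deque(maxlen=window_size)

    def on_step(self, done: bool, info: dict) -> None:
        # Record a flag only on the final step of an episode, and only
        # if the env actually reported one.
        if done and "is_success" in info:
            self._flags.append(bool(info["is_success"]))

    @property
    def success_rate(self):
        # None means "no data yet", so the stat can simply be skipped.
        if not self._flags:
            return None
        return sum(self._flags) / len(self._flags)


tracker = SuccessRateTracker(window_size=4)
tracker.on_step(done=False, info={})                    # mid-episode, ignored
tracker.on_step(done=True, info={"is_success": True})
tracker.on_step(done=True, info={"is_success": False})
tracker.on_step(done=True, info={})                     # older env, no flag
print(tracker.success_rate)
```

Episodes from envs that never set the flag are simply left out of the average here, which is one way to avoid showing a misleading statistic for older envs.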
Needs to be considered:
It would be great if this can be added in a way that doesn't affect previous envs (e.g. they either always report true, always report false, or don't show this statistic).
Info sending/receiving: Some modifications would be needed to the plugin and Python env code to send / receive info, optimally preserving compatibility with older envs that don't send info. Once we enable info sending, we can later also set the truncated/terminated flags.
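A minimal sketch of what the receiving side could look like on the Python end, assuming the env sends one JSON message per step. The message keys (`reward`, `done`, `terminated`, `truncated`, `info`) and the function name `parse_step_message` are hypothetical and only meant to illustrate the compatibility fallback, not the actual plugin protocol.

```python
import json


def parse_step_message(raw: str, n_agents: int) -> dict:
    """Decode one step message, tolerating envs that send no `info`."""
    msg = json.loads(raw)
    # Older envs omit "info"; fall back to one empty dict per agent.
    infos = msg.get("info") or [{} for _ in range(n_agents)]
    # Older envs only send "done"; treat it as terminated and assume
    # no truncation unless the env says otherwise.
    done = msg.get("done", [False] * n_agents)
    terminated = msg.get("terminated", done)
    truncated = msg.get("truncated", [False] * n_agents)
    return {
        "reward": msg["reward"],
        "terminated": terminated,
        "truncated": truncated,
        "info": infos,
    }


# Message from an older env: no info, only a done flag.
old = parse_step_message('{"reward": [1.0], "done": [true]}', n_agents=1)

# Message from an updated env: explicit flags plus info.
new = parse_step_message(
    '{"reward": [0.5], "terminated": [false], "truncated": [true],'
    ' "info": [{"is_success": false}]}',
    n_agents=1,
)
```

With this kind of fallback, older envs keep working unchanged, and the truncated/terminated split can be introduced later without another protocol change on the Python side.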
Usage / plugin side changes:
(Just a potential usage example, the end episode method is implemented in the env code, not plugin, although we can consider simplifying the process with something like edbeeching/godot_rl_agents_plugin#20, however, that does break compatibility with existing envs)
For compatibility, possibly the simplest way would be to always report episode success as true by default, unless set by the user.
Optionally, we could also add a boolean arg to the sb3 example script that sets the monitor to report this stat or not.
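The two compatibility ideas above could be sketched roughly as follows. The helper `episode_success` and the flag name `--log_success_rate` are hypothetical; the only SB3 detail assumed is that `Monitor` accepts an `info_keywords` tuple, which is where the resulting value would be passed.

```python
import argparse


def episode_success(info: dict, default: bool = True) -> bool:
    # Envs that never set the flag report the default (True here).
    return bool(info.get("is_success", default))


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical flag name; off by default so existing runs are unchanged.
    parser.add_argument(
        "--log_success_rate",
        action="store_true",
        help="report episode success rate in the monitor",
    )
    return parser


args = build_parser().parse_args(["--log_success_rate"])
# Tuple that could be passed to SB3's Monitor as info_keywords (an
# assumption about how the example script would wire it up).
info_keywords = ("is_success",) if args.log_success_rate else ()
```

Keeping the flag opt-in means envs that always report the default `True` do not clutter the logs with a constant 100% success rate unless the user asks for it.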