-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug]: Rendering with EvalCallback does not render the initial or final state #1692
Comments
Hello, |
The evalcallback already specially constructs an environment for evaluation, and this environment should act slightly differently than typical VecEnvs (for example, stop stepping after reaching a terminal state and do not reset) . Therefore my current plan to fix is to subclass I'll have to create a check in the step function to make sure that step is never called with a done environment |
Side note, there's another discussion to be had about parallelizing the eval environment, aka passing a |
The two are related,
if you want that behavior, you should use a |
The two methods that should be overridden by the Pseudocode
|
What makes you think that?
so if you want to do something like that, gym wrapper will help you achieve what you want, no need for a new |
common/evaluation.py:62 Therefore |
Ah, I guess a user could pass a
Are you proposing a EvalGymWrapper after the environment has been vectorized? if not isinstance(env, VecEnv):
env = DummyVecEnv([lambda: env]) # type: ignore[list-item, return-value]
env = EvalWrapper(env) |
yes
No, I'm just saying that if a user wants to render the very last step, he/she can wrap the Gym env before creating a VecEnv with a Gym wrapper that calls render before and after the step, in order to render the final step. Rendering the first step will be solved by moving the call to |
Shouldn't rendering all the steps and states be the default behavior of the |
I see how my previous implementation wouldn't work if the function is passed a One change could to be to enforce that the caller pass a base We can continue to support I believe this implementation would allow first and last frames to be rendered, preserve multi-environment evaluation, and reduce complexity for the user since they not longer have to construct a vec_env themselves. What do you think? |
This might be better done with a wrapper. Still working on a PR... |
I was missing the last render.
In the eval callback, I included a conditional render:
|
🐛 Bug
EvalCallback
does not render initial state of the episode. This is an issue for environments which have dynamic/random initial states, as it makes it unclear to the user what the state looked like before the agent performed its first step.EvalCallback
does not render the last step of the episode, instead rendering the reset state. This robs the user of seeing the last (often crucial) step and is confusing for new users.This behavior is because the environment reset is performed within the
DummyVecEnv.step
function. I have a local implementation that fixes this issue and clearly renders both initial and final states, but I'd first like to confirm that this behavior should indeed be treated as a bug worthy of a feature/fix.To Reproduce
Relevant log output / Error message
Notice the first state
[0, 0]
is never rendered and neither is the last step, only the reset frame (in this case, also[0, 0]
)For a new user only looking at the rendered messages, this would be very confusing. Here's another example where it seems like the agent changed two values at once.
System Info
No response
Checklist
The text was updated successfully, but these errors were encountered: