The identity connection here follows the same design as the residual blocks in ResNets. The residual path provides richer, better-conditioned gradients when the network is deep. Since the dynamics network is unrolled recurrently for 5 steps, the gradient path for the final unroll step can be much deeper (over 10 layers). Consequently, we add the identity connection here.
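The gradient argument can be illustrated with a scalar toy model (this is a hypothetical sketch, not the repo's actual code): if each unroll step has per-step derivative `w`, the gradient of the final state with respect to the initial state is a product of per-step Jacobians, and the identity connection changes each factor from `w` to `1 + w`.

```python
# Toy scalar "dynamics" model: next_state = f(state), where f has
# derivative w. Unrolled K steps, the gradient of the final state
# w.r.t. the initial state is the product of per-step Jacobians.
# (Hypothetical illustration, not the repo's actual code.)

w = 0.1   # per-step derivative of the learned mapping
K = 5     # number of recurrent unroll steps

# Without identity connection: s_{t+1} = f(s_t), Jacobian = w per step.
grad_plain = w ** K

# With identity connection: s_{t+1} = s_t + f(s_t), Jacobian = 1 + w.
grad_residual = (1.0 + w) ** K

print(grad_plain)     # vanishes toward zero
print(grad_residual)  # stays on the order of 1
```

With a small `w`, the plain product vanishes exponentially in the unroll depth, while the residual product stays near 1, which is the usual ResNet argument applied to the recurrent unroll.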
As for empirical results, we find that the identity connection yields a better reward model. We collected some datasets and trained the model to predict the reward via supervised learning on them; the model with the identity connection achieves a lower test error on reward prediction.
Thank you very much for open-sourcing your code.
I'm a little confused about the reason for the identity connection on the state encoding in DynamicsNetwork in model.py:
Why is the identity connection added on the state encoding rather than the action encoding, and what is its empirical impact on the Atari results?
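To make sure I'm reading it right, the pattern I mean looks roughly like the following sketch (function and variable names here are my own, not the repo's exact code):

```python
import numpy as np

# Hedged sketch of the pattern in question: the dynamics network
# concatenates the state encoding with an action plane, applies a
# learned mapping, then adds the *state* encoding back as an
# identity connection. (Names and shapes are assumptions.)

def dynamics_step(state, action_plane, mapping):
    x = np.concatenate([state, action_plane], axis=0)  # stack channels
    out = mapping(x)                                   # learned mapping
    return out + state  # identity connection on the state encoding only

rng = np.random.default_rng(0)
state = rng.standard_normal((4, 3, 3))   # (C, H, W) state encoding
action = np.ones((1, 3, 3))              # broadcast action plane
W = rng.standard_normal((4, 5)) * 0.01   # tiny linear stand-in for convs

mapping = lambda x: np.tensordot(W, x, axes=([1], [0]))  # -> (4, H, W)

next_state = dynamics_step(state, action, mapping)
print(next_state.shape)  # (4, 3, 3)
```

One thing the sketch suggests: the state encoding has the same shape as the output, so it can be added directly, whereas the action plane does not.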
Looking forward to your reply!