Hello! And thank you so much for this wonderful resource :) :)
I am currently working on Montezuma's Revenge, and have been trying to use your awesome codebase to better understand the baselines that have been reported to work on it (e.g. Rainbow). I really enjoy your codebase because it is written in PyTorch rather than TensorFlow or JAX.
However, I have been unable to reproduce the result reported in the paper, where Rainbow reaches > 400 reward on Montezuma's Revenge: I have not been able to get > 0 reward for any seed.
I have been running:
python -u main.py --replay-frequency 1 --architecture canonical --game montezuma_revenge --reward-clip 1 --max-episode-length 1000000 --replay-frequency 16 --target-update int(3.2e4) --learn-start int(100e3)
Have you gotten Rainbow to work on Montezuma's Revenge (i.e. > 0 reward), and if so, what hyperparameters did you use? Thank you so much in advance for your kind help! :)
sunchipsster1 changed the title from "Montezuma's revenge - have you tried it? :)" to "Montezuma's revenge - has this been tried using this codebase?" on Nov 20, 2022
When I released v1.3, as stated, I was unable to achieve any reward on Montezuma's Revenge (the only other result I couldn't match was on H.E.R.O.). However, there have been a few changes to the codebase since then, which will hopefully allow learning to happen.
I noticed that you are running with several hyperparameters that differ from the original paper. All you should need is python main.py --game montezuma_revenge (with different seeds), so I would recommend trying that with a few seeds.
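As a rough sketch of that suggestion, the snippet below builds one default-hyperparameter command per seed. It assumes the entry point is main.py (as in the command quoted earlier in the thread) and that the script accepts a --seed flag; the flag name and the seed values are assumptions for illustration, not something confirmed in this thread.

```python
import shlex


def build_commands(seeds, game="montezuma_revenge"):
    """Return one argv list per seed, leaving every other hyperparameter at its default."""
    return [
        # "--seed" is an assumed flag name; check the script's --help output first.
        ["python", "main.py", "--game", game, "--seed", str(seed)]
        for seed in seeds
    ]


# Hypothetical seed values, chosen only for illustration.
for argv in build_commands([1, 2, 3]):
    # Print each command so it can be inspected before anything is launched.
    print(shlex.join(argv))
```

To actually launch the runs, each list could be passed to subprocess.run(argv, check=True), one after another or on separate machines.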