Dreamer V3 Performance #218
@LYK-love, I will show you three experiments that we compared with the results described in the Dreamer V3 paper (https://arxiv.org/abs/2301.04104).
The reward we obtained in crafter with these configs. The paper claims to have achieved a reward of
We used these configs for training (+
MsPacman Test Reward
2020.0 (seed 5)
1070.0 (seed 1024)
2050.0 (seed 42)
1940.0 (seed 1337)
2630.0 (seed 8)
1760.0 (seed 2)
We used these configs for training (+
Boxing Test Reward
96.0 (seed 5)
92.0 (seed 1024)
96.0 (seed 42)
90.0 (seed 1337)
94.0 (seed 8)
96.0 (seed 2)
Let me know if you have other questions regarding the performance of Dreamer V3. |
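The per-seed test rewards listed above can be summarized with a few lines of Python. This is only a minimal sketch: the numbers are copied from the two lists in this comment, while the `summarize` helper and the dictionary layout are hypothetical names introduced here for illustration and are not part of sheeprl.

```python
from statistics import mean, stdev

# Per-seed test rewards copied from the lists above (seed -> reward).
test_rewards = {
    "MsPacman": {5: 2020.0, 1024: 1070.0, 42: 2050.0, 1337: 1940.0, 8: 2630.0, 2: 1760.0},
    "Boxing": {5: 96.0, 1024: 92.0, 42: 96.0, 1337: 90.0, 8: 94.0, 2: 96.0},
}


def summarize(rewards_by_seed):
    """Hypothetical helper: aggregate one environment's per-seed rewards."""
    values = list(rewards_by_seed.values())
    return {
        "mean": mean(values),
        "std": stdev(values),  # sample standard deviation across seeds
        "min": min(values),
        "max": max(values),
    }


summary = {env: summarize(rewards) for env, rewards in test_rewards.items()}

for env, stats in summary.items():
    print(
        f"{env}: mean={stats['mean']:.1f} std={stats['std']:.1f} "
        f"min={stats['min']:.1f} max={stats['max']:.1f}"
    )
```

With these six seeds the mean test reward comes out to roughly 1911.7 for MsPacman and 94.0 for Boxing. The spread across seeds is worth reporting next to the mean, since a single seed (for example 1070.0 on MsPacman) can sit far from the rest.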
Great. Currently I have 8 GPUs, and I'm reproducing your results with:
# Boxing
python sheeprl.py exp=dreamer_v3_100k_boxing fabric.strategy=ddp fabric.devices=8 fabric.accelerator=cuda
# Crafter
python sheeprl.py exp=dreamer_v3_XL_crafter fabric.strategy=ddp fabric.devices=8 fabric.accelerator=cuda
# MsPacman
python sheeprl.py exp=dreamer_v3_100k_ms_pacman fabric.strategy=ddp fabric.devices=8 fabric.accelerator=cuda
I will comment here once I get the results. Meanwhile, I also want to reproduce the performance on other environments, such as Atari Video Pinball and Star Gunner. Have you reproduced them? |
Hi @LYK-love, we have never tried those two environments. |
Hello, I got the rewards, but I think I made some mistakes.
Crafter reward
This is my training reward for
I evaluated this trained agent with the checkpoint at 200,000 steps.
export CKPT="logs/runs/dreamer_v3/crafter_reward/2024-03-15_02-26-07_dreamer_v3_crafter_reward_5/version_0/checkpoint/ckpt_200000_0.ckpt"
sheeprl-eval checkpoint_path=$CKPT fabric.accelerator=gpu env.capture_video=True
I got evaluation reward
MsPacman 100K
This is my reward for Pacman. The run lasts 100,000 steps, which is the same number as in your config. However, the reward value is also lower, only
Meanwhile, at step 90,000 I did observe a reward of 1300, which is similar to
I evaluated this trained agent with the checkpoint at 100,000 steps, using 6 seeds. The commands are:
export CKPT="logs/runs/dreamer_v3/MsPacmanNoFrameskip-v4/2024-03-15_02-20-34_dreamer_v3_MsPacmanNoFrameskip-v4_5/version_0/checkpoint/ckpt_100000_0.ckpt"
seeds=(5 1024 42 1337 8 2)
for seed in "${seeds[@]}"; do
sheeprl-eval checkpoint_path=$CKPT fabric.accelerator=gpu env.capture_video=True seed=$seed
done
The evaluation rewards are
The average evaluation reward is
Boxing
This is my training reward for Boxing. The training has 100,000 steps, which is the same number as in your config. However, the reward is reported for 85,000 steps instead of 100,000 steps, and I don't know why. The reward value is lower as well, only
export CKPT="logs/runs/dreamer_v3/BoxingNoFrameskip-v4/2024-03-15_02-28-28_dreamer_v3_BoxingNoFrameskip-v4_5/version_0/checkpoint/ckpt_100000_0.ckpt"
seeds=(5 1024 42 1337 8 2)
for seed in "${seeds[@]}"; do
sheeprl-eval checkpoint_path=$CKPT fabric.accelerator=gpu env.capture_video=True seed=$seed
done
The evaluation rewards are
The average evaluation reward is
Conclusion
I have two questions:
|
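The per-seed evaluation loops in the comment above can also be expressed in Python, which makes it easier to collect results afterwards. This is a rough sketch, not sheeprl's own tooling: `build_eval_commands` is a hypothetical helper, the checkpoint path is a placeholder, and the only command-line overrides used are the ones that already appear in the shell loops (`checkpoint_path`, `fabric.accelerator`, `env.capture_video`, `seed`).

```python
import shlex


def build_eval_commands(ckpt, seeds, accelerator="gpu", capture_video=True):
    """Hypothetical helper: one sheeprl-eval argument list per seed."""
    return [
        [
            "sheeprl-eval",
            f"checkpoint_path={ckpt}",
            f"fabric.accelerator={accelerator}",
            f"env.capture_video={capture_video}",
            f"seed={seed}",
        ]
        for seed in seeds
    ]


# Placeholder path (assumption): substitute the real checkpoint from your run.
ckpt = "path/to/checkpoint/ckpt_100000_0.ckpt"
seeds = [5, 1024, 42, 1337, 8, 2]

commands = build_eval_commands(ckpt, seeds)

for cmd in commands:
    # Dry run: print the command that would be executed.
    print(shlex.join(cmd))
    # To actually launch the evaluation, uncomment the next two lines:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Passing each command as an argument list avoids shell quoting problems when the checkpoint path contains unusual characters, and keeping the run step commented out lets you inspect the commands before spending GPU time.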
Hi @LYK-love,
In the meantime, I advise you not to distribute the training, at least not until we fix this. |
Sure. I wonder what the commands for training are. When I use python sheeprl.py exp=dreamer_v3_XL_crafter fabric.accelerator=cuda I get an error:
Meanwhile, I didn't get any error when running: python sheeprl.py exp=dreamer_v3_100k_boxing fabric.accelerator=cuda |
Can you share your environment? There may be a problem with the
Let me know, thanks. |
Hi @LYK-love, the grey line is Dreamer V3 trained on a single GPU, whereas the orange line is Dreamer V3 trained on 2 GPUs. You can find the configs here. |
Hi @LYK-love, this is an experiment that I've run on Ms-PacMan: #261 (comment). It has been run with the torch.compile model, but it contains all the improvements @michele-milesi listed here |
@LYK-love I'm closing this because of inactivity and because it seems to have been resolved. Reopen it if you have more evidence on your side. |
Hi @LYK-love, I am bringing your question on the performance of Dreamer v3 back here so that we can continue the conversation.