Previously, I trained a good RL policy in a driving environment using the original Dreamer v3 repo:
https://github.com/danijar/dreamerv3
Recently, I have been trying to reproduce it with sheeprl's Dreamer V3, but the results are much worse. So I want to make sure the sheeprl config corresponds to the original Dreamer config.
replay_ratio & train_ratio
In sheeprl, does replay_ratio = 0.5 mean that every env collects 2 transitions before one batch is sampled from the buffer for training? In the original Dreamer v3, train_ratio is 32 by default and is used to compute kwargs['samples_per_insert'], which comes out to 0.5. Is this samples_per_insert equivalent to replay_ratio?
The reference is make_replay() in main.py:
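For reference, the relationship I am assuming between the two ratios can be sketched as plain arithmetic (variable names are mine, not repo code; that samples_per_insert is derived using the default batch_length of 64 is an assumption to check against your config):

```python
# Sketch (not repo code) of how the two ratios might line up.
# Assumption: dreamerv3 derives samples_per_insert from train_ratio and the
# default batch_length of 64, so 32 / 64 = 0.5.
train_ratio = 32
batch_length = 64
samples_per_insert = train_ratio / batch_length

# sheeprl's replay_ratio counts gradient steps per collected env transition,
# so 0.5 means one training step for every 2 collected transitions.
replay_ratio = 0.5

print(samples_per_insert == replay_ratio)
```

Under these assumptions the two values coincide numerically, but whether sheeprl and dreamerv3 also count env steps the same way (e.g. per-env vs. summed over all envs) is exactly what I'd like confirmed.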
buffer.size & replay.size
Is the buffer size in sheeprl the same as the replay size in the original Dreamer v3?
model size
In the original Dreamer v3 I use the 200M model size:
In sheeprl the size parameters are different, for example dreamer_v3_XL:
May I ask how they correspond to each other? My current understanding is deter ---> recurrent_state_size, hidden ---> hidden_size, depth ---> cnn_channels_multiplier, units ---> dense_units, classes ---> ?
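Writing my working hypothesis out explicitly (the right-hand names are my reading of the sheeprl config, not an authoritative map; "classes" ---> "discrete_size" in particular is a guess to verify):

```python
# Hypothetical mapping from danijar/dreamerv3 size keys to sheeprl parameters.
# All of this is my interpretation of the two configs and should be checked;
# "discrete_size" for "classes" is a guess.
size_map = {
    "deter":   "recurrent_state_size",    # width of the deterministic (GRU) state
    "hidden":  "hidden_size",             # hidden width of the RSSM MLPs
    "depth":   "cnn_channels_multiplier", # base channel count of the CNN encoder/decoder
    "units":   "dense_units",             # width of the dense heads (actor, critic, ...)
    "classes": "discrete_size",           # guess: classes per categorical latent
}
print(size_map["classes"])
```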
fabric.devices
This is purely a question about sheeprl's fabric.devices. If I have two GPUs, should I set fabric.devices=2? I tried setting devices=2, but fabric.world_size is still 1. What do world_size and devices mean?
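For what it's worth, my understanding of how Fabric arrives at world_size, as a simplified illustration (not Fabric internals):

```python
# Simplified illustration (not Fabric code): world_size is the total number
# of spawned processes, i.e. nodes * devices per node, and each process gets
# a global_rank in [0, world_size). If world_size stays 1 with devices=2,
# my guess is the extra process was never launched (e.g. the run did not go
# through fabric.launch(), or the accelerator resolved to a single device).
num_nodes = 1
devices = 2  # e.g. fabric.devices=2 on a 2-GPU machine
world_size = num_nodes * devices
print(world_size)  # 2
```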
Hi @chrisgao99, thanks for asking.
The author updated the algorithm a few months ago. Before those changes our repo was aligned with the original one; now I have to check which changes he made.
I will let you know as soon as possible.