I think image_size should be 108 when doing translate. But even after I change image_size to 108, there is still a large gap compared with the result in the paper. Have you reproduced the experiment using translate?
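To illustrate what I mean, here is a rough sketch (my own illustration, not necessarily how the repo's augmentation code implements it) of what translate is usually understood to do: the rendered frame (e.g., 100x100) is pasted at a random offset inside a larger output canvas (e.g., 108x108), so the output image_size has to be at least as large as pre_transform_image_size.

import numpy as np

def random_translate(imgs, out_size=108):
    # Place each (C, H, W) frame at a random offset inside an
    # out_size x out_size zero canvas. Requires out_size >= H, W.
    n, c, h, w = imgs.shape
    assert out_size >= h and out_size >= w, "canvas must be at least as large as the input"
    out = np.zeros((n, c, out_size, out_size), dtype=imgs.dtype)
    for i in range(n):
        top = np.random.randint(0, out_size - h + 1)
        left = np.random.randint(0, out_size - w + 1)
        out[i, :, top:top + h, left:left + w] = imgs[i]
    return out

# Example: a batch of 100x100 frame-stacked observations -> 108x108 outputs
obs = np.random.randint(0, 255, size=(32, 9, 100, 100), dtype=np.uint8)
print(random_translate(obs).shape)  # (32, 9, 108, 108)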
Dear author,
Could you please provide a complete command for RAD on DMC (for example, for CartPole-SwingUp)?
I cannot reproduce the CartPole-SwingUp results from the paper by running the command in script/run.sh.
The command in run.sh does not seem to match the hyperparameters listed in the paper (for example, batch_size is 512 in the paper but 128 in run.sh). I changed them, but I still cannot match the paper's results.
The commands I ran for these experiments are listed below:
SAC-pixel
It should attain reward ≈ 200 after 100k environment steps (i.e., 12.5k policy steps, since action_repeat = 8), but what I got is higher (around 250 or 300):
CUDA_VISIBLE_DEVICES=0 python train.py \
    --domain_name cartpole \
    --task_name swingup \
    --encoder_type pixel --work_dir ./tmp \
    --action_repeat 8 --num_eval_episodes 10 \
    --pre_transform_image_size 100 --image_size 84 \
    --agent rad_sac --frame_stack 3 --data_augs no_aug \
    --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 \
    --batch_size 512 --num_train_steps 12500 --latent_dim 50
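For reference, here is the step bookkeeping I assumed when choosing these flags (taking --num_train_steps and --eval_freq to count policy steps rather than environment steps):

# Relationship between environment steps and policy steps under action repeat
env_steps = 100_000
action_repeat = 8
policy_steps = env_steps // action_repeat  # 12_500, matching --num_train_steps 12500
evals = policy_steps // 2_500              # 5 evaluations if --eval_freq is in policy steps
print(policy_steps, evals)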
RAD (translate)
It should attain reward ≈ 828 after 100k environment steps (12.5k policy steps), but what I got is much lower (around 50):
CUDA_VISIBLE_DEVICES=0 python train.py \
    --domain_name cartpole \
    --task_name swingup \
    --encoder_type pixel --work_dir ./tmp \
    --action_repeat 8 --num_eval_episodes 10 \
    --pre_transform_image_size 100 --image_size 84 \
    --agent rad_sac --frame_stack 3 --data_augs translate \
    --seed 234567 --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 2500 \
    --batch_size 512 --num_train_steps 12500 --latent_dim 50
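Note that this command keeps --pre_transform_image_size 100 --image_size 84 from the crop setting. If, as suggested above, translate needs the output image to be larger than the rendered one (108 vs. 100), a quick check along these lines would flag the mismatch (check_aug_sizes is just a hypothetical helper I wrote, not something in the repo):

def check_aug_sizes(data_augs, pre_transform_image_size, image_size):
    # Hypothetical sanity check: crop shrinks the rendered frame,
    # translate pastes it into a larger canvas.
    if "crop" in data_augs and image_size >= pre_transform_image_size:
        raise ValueError("crop expects image_size < pre_transform_image_size")
    if "translate" in data_augs and image_size <= pre_transform_image_size:
        raise ValueError("translate expects image_size > pre_transform_image_size")

check_aug_sizes("translate", pre_transform_image_size=100, image_size=84)  # raises ValueError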
I sincerely look forward to your reply!