Reproducing Table 3 Diffuser numbers #27
Comments
Hi, sorry about the confusion. We normalized values to be between 0 and 100. The max reward in the environment is 3.0, so 1.6 corresponds to 1.6 / 3 * 100 = 53.3 reward.
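For anyone checking the conversion, here is a minimal sketch of the rescaling described above. The function name `normalize_reward` is hypothetical and not taken from the released code. Applying the same linear scaling to the 0.067 spread quoted in the issue is an assumption about how the table's +/- value was computed.

```python
MAX_REWARD = 3.0  # maximum environment reward, as stated in the reply above


def normalize_reward(raw, max_reward=MAX_REWARD):
    """Rescale a raw reward to the 0-100 range (hypothetical helper)."""
    return raw / max_reward * 100.0


# Raw numbers quoted in the issue: mean 1.6, spread 0.067.
mean = normalize_reward(1.6)
# Assumption: the spread is on the same raw scale, so it rescales linearly too.
std = normalize_reward(0.067)

print(f"{mean:.1f} +/- {std:.1f}")  # prints "53.3 +/- 2.2"
```

On this scale the result is directly comparable to the 58.7 +/- 2.5 entry in Table 3.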
Ah, that makes a lot of sense. The number is still a bit off from the ones reported in the paper, but closer now. Do the released pre-trained weights correspond to the final model that was used in constructing the table? Thanks in advance again for your help!
I think the pre-trained weights should be very close to the values in the table, but I restructured the underlying code quite a bit and retrained a model based on the new codebase. As a result, they won't be exactly the same (but should be very close). Let me know if there are any drastic differences.
The score of 53.3 reward was with the pre-trained weights, which is definitely outside the reported standard deviation. Perhaps this is due to the restructuring changes? If it's not too much trouble, would you be able to independently verify what reward you get from your own pretrained weights?
Hi, sure. I went and reran the exact code that was released with the pretrained model and obtained a performance of 55.67 with a standard deviation of 2.4. I'm now running the code with 1000 trials and will report that number once I have it.
Excellent! Thank you for re-running the models. I will close this issue after you report the number from the 1000 trials.
@joeybose Hi, I was trying to reproduce the results for walker2d-medium-replay-v2 in Table 2 by training new weights, but I couldn't. The results are very different from those of the pretrained weights. Just curious, have you tried training new weights to reproduce the results? If so, did it work? Thanks!
@Looomo I've only tried the Kuka experiments, as that was what I was most interested in. Unfortunately, I didn't try the MuJoCo experiments.
@joeybose Thanks, I'll try other datasets and other seeds.
Hi,
It's unclear how to reproduce the Diffuser column in Table 3 of the paper. For example, for the unconditional stacking experiment the reported table number is 58.7 +/- 2.5. When I try the provided script with the pretrained weights I get a reward mean of 1.6 +/- 0.067. How can I get roughly the same numbers as reported in the table? I'm sorry if I'm missing something obvious.