Hi, thanks for your work!
When I run CQL in the pen binary environment, its value function consistently diverges (I tried mixing ratios of 0.0 and 0.5, each with 5 random seeds). The critics produce very large value estimates, so the agent cannot make progress during online fine-tuning. Do you have any ideas or suggestions for fixing this overestimation issue? I see that a double critic is already being used. Thanks so much!
Best,
Hai