v0.20.0: Hotfix for PPO with un-normalized env, `net_arch` support for PPO, additional fixes
What's Changed
- Update PPO to support `net_arch`, and additional fixes by @araffin in #65
- Fixed entropy coefficient wrongly logged for SAC and derivatives.
- Fixed PPO `predict()` for environments that were not normalized (action spaces with limits != [-1, 1])
- PPO now logs the standard deviation
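
For context on the `predict()` fix: the notes do not describe the change itself, so the snippet below is only a generic sketch of what handling action limits other than [-1, 1] usually involves, namely mapping between a normalized action and the environment's own bounds. The helper names `unscale_action` and `scale_action` are hypothetical and are not taken from this release.

```python
def unscale_action(scaled_action, low, high):
    """Map each component from [-1, 1] to its own [low, high] range.

    Hypothetical helper for illustration; not the library's code.
    """
    return [
        lo + 0.5 * (a + 1.0) * (hi - lo)
        for a, lo, hi in zip(scaled_action, low, high)
    ]


def scale_action(action, low, high):
    """Inverse mapping: from [low, high] back to [-1, 1]."""
    return [
        2.0 * (a - lo) / (hi - lo) - 1.0
        for a, lo, hi in zip(action, low, high)
    ]


# Example bounds for a 3-dimensional action space (made up for illustration).
low = [0.0, -2.0, 10.0]
high = [1.0, 2.0, 20.0]

env_action = unscale_action([-1.0, 0.0, 1.0], low, high)
normalized_again = scale_action(env_action, low, high)
```

If the limits are already [-1, 1], both mappings are the identity, which is why a bug of this kind only shows up in un-normalized environments.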
Full Changelog: v0.19.0...v0.20.0