Normalize advantage function #6

rarilurelo · 2017-01-05T05:39:51Z

Hi, thanks for your implementation of TRPO.

In https://github.com/wojzaremba/trpo/blob/master/main.py#L128-L132 you normalize the advantage function.
I couldn't find any description of this operation in the paper (https://arxiv.org/abs/1502.05477).
Why did you do that?

wojzaremba · 2017-01-08T19:46:04Z

I found it in John Schulman's code. This normalization is biased, but it's sensible.
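
For reference, a minimal sketch of what this kind of advantage normalization usually looks like. It is not copied from main.py: the function name `normalize_advantages`, the `eps` value, and the sample batch are assumptions made for illustration. The batch of advantage estimates is shifted to zero mean and scaled to unit standard deviation before being used in the policy gradient.

```python
import math

def normalize_advantages(advantages, eps=1e-8):
    """Shift a batch of advantages to zero mean and unit standard deviation.

    Hypothetical helper for illustration; eps guards against division by
    zero when every advantage in the batch is identical.
    """
    n = len(advantages)
    mean = sum(advantages) / n
    # Population standard deviation over the batch.
    std = math.sqrt(sum((a - mean) ** 2 for a in advantages) / n)
    return [(a - mean) / (std + eps) for a in advantages]

# Made-up advantage estimates for one batch.
batch = [1.0, 2.0, 3.0, 4.0]
normalized = normalize_advantages(batch)
```

The likely source of the bias is that the mean and standard deviation are computed from the same sampled batch that the gradient is estimated on, so the shift and scale depend on the sampled actions. The relative ordering of the advantages is unchanged.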

rarilurelo · 2017-01-09T08:33:51Z
