Error in Priority Update for Prioritized Replay #13

qfettes · 2018-06-03T06:33:34Z

It looks like you're updating the priorities in the replay buffer according to the weighted and squared TD error.

loss  = (q_value - expected_q_value.detach()).pow(2) * weights
prios = loss + 1e-5
replay_buffer.update_priorities(indices, prios.data.cpu().numpy())

However, the algorithm in the original paper updates each priority using only the absolute value of the TD error, without the importance-sampling weights. I believe this is a mistake in your implementation.
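For illustration, here is a minimal sketch of the update I would expect. It uses NumPy so it runs on its own, and `per_loss_and_priorities` is a hypothetical helper, not something from this repo. It assumes that `weights` holds the importance-sampling weights and that the weighted loss is averaged over the batch. The `1e-5` offset is the one from the snippet above.

```python
import numpy as np


def per_loss_and_priorities(q_value, expected_q_value, weights, eps=1e-5):
    """Return (scalar loss, new priorities) for one sampled batch.

    Hypothetical helper for illustration; argument names mirror the
    snippet quoted above.
    """
    q_value = np.asarray(q_value, dtype=np.float64)
    expected_q_value = np.asarray(expected_q_value, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)

    td_error = q_value - expected_q_value
    # The importance-sampling weights scale the loss only
    # (assumption: the loss is averaged over the batch).
    loss = np.mean(weights * td_error ** 2)
    # Priorities use the unweighted magnitude of the TD error.
    prios = np.abs(td_error) + eps
    return loss, prios


loss, prios = per_loss_and_priorities(
    q_value=[1.0, 2.0, 0.5],
    expected_q_value=[1.5, 2.0, -0.5],
    weights=[0.2, 1.0, 0.5],
)
```

In the PyTorch code, this would amount to computing `prios` from `(q_value - expected_q_value.detach()).abs() + 1e-5` instead of from `loss`, so that changing `weights` no longer changes the stored priorities.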


mneira10 · 2019-09-26T22:01:12Z

Can confirm:

Transition priorities are updated with the magnitude of the TD error (lines 11-12 of Algorithm 1).
Paper for reference
