Faster Convergence #51

vwxyzjn · 2022-01-27T04:26:49Z

Training an agent still takes a long time. The particular experiment in #36 took 4d 9h 11m 14s to finish.

Looking at the reward chart, it appears the agent could reach 70% of the final performance in just 50M steps (about 10 hours into training).

We should try to optimize for a 10-hour compute budget.

vwxyzjn · 2022-01-27T04:34:11Z

I think the bottleneck is still largely on the NN side, so one thing worth trying is to reduce the NN size.

Alternatively, I noticed that learning rate annealing seems to really help the algorithm converge toward the end of training. So maybe we could also try using a smaller learning rate and just turning off annealing.
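To make the two options concrete, here is a minimal sketch of a linear annealing schedule next to a constant one. The function names, the 1-indexed `update` counter, and the example values are assumptions for illustration, not taken from this repository's training script.

```python
def linear_annealed_lr(update: int, num_updates: int, initial_lr: float) -> float:
    """Learning rate at the 1-indexed `update`, decayed linearly toward zero."""
    frac = 1.0 - (update - 1.0) / num_updates
    return frac * initial_lr


def constant_lr(update: int, num_updates: int, lr: float) -> float:
    """Annealing turned off: the same (typically smaller) rate at every update."""
    return lr


if __name__ == "__main__":
    num_updates = 10  # hypothetical, kept tiny so the schedule is easy to read
    for update in (1, 5, 10):
        annealed = linear_annealed_lr(update, num_updates, initial_lr=2.5e-4)
        fixed = constant_lr(update, num_updates, lr=1e-4)
        print(f"update {update:2d}: annealed={annealed:.2e} constant={fixed:.2e}")
```

With annealing, the rate only becomes small near the end of the run, so a shorter budget would need either a shorter schedule or a constant rate chosen to be small from the start.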

Maybe we could also tune the discount factor. We should also visualize the discounted returns, since that is what the agent actually optimizes for.
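For reference, a minimal sketch of how the discounted return could be computed from a single episode's rewards before plotting it. The helper name `discounted_returns` and the sample rewards are hypothetical; the actual logging code may differ.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]  # hypothetical rewards
    print(discounted_returns(episode_rewards, gamma=0.5))
```

The first entry is the discounted return of the whole episode, which is the quantity to chart alongside the undiscounted episodic reward.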

vwxyzjn · 2022-01-31T23:05:18Z

#56 tries to address this issue.
