[RL-baseline] Model v5, experiment #2 #44

ziritrion · 2021-04-02T09:59:20Z

Action set #1 was chosen for this experiment:
[0.0, 0.0, 0.0], # no action
[0.0, 0.8, 0.0], # throttle
[0.0, 0.0, 0.6], # brake
[-0.9, 0.0, 0.0], # hard left
[-0.5, 0.0, 0.0], # medium left
[-0.2, 0.0, 0.0], # slight left
[0.9, 0.0, 0.0], # hard right
[0.5, 0.0, 0.0], # medium right
[0.2, 0.0, 0.0], # slight right
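
For reference, a minimal sketch of how a discrete policy output could be mapped onto this action set. The names `ACTION_SET_1` and `action_from_index` are hypothetical and not taken from the repo, and the component order [steering, throttle, brake] is an assumption inferred from the comments above.

```python
# Hypothetical sketch: the action set as a lookup table indexed by the
# discrete action chosen by the policy.
# Assumed component order: [steering, throttle, brake].
ACTION_SET_1 = [
    [0.0, 0.0, 0.0],   # no action
    [0.0, 0.8, 0.0],   # throttle
    [0.0, 0.0, 0.6],   # brake
    [-0.9, 0.0, 0.0],  # left
    [-0.5, 0.0, 0.0],  # left
    [-0.2, 0.0, 0.0],  # left
    [0.9, 0.0, 0.0],   # right
    [0.5, 0.0, 0.0],   # right
    [0.2, 0.0, 0.0],   # right
]


def action_from_index(index):
    """Return a copy of the continuous action vector for a discrete index."""
    if not 0 <= index < len(ACTION_SET_1):
        raise IndexError(f"action index {index} is outside the action set")
    return list(ACTION_SET_1[index])
```

With this layout the policy head would need 9 outputs, one per row. Note that no row combines steering with throttle, so under this assumption the agent can only steer while coasting.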

Entropy and running reward (RR) both dropped to very low levels between the 8k and 20k episode marks, but began trending upward just before the 20k episode mark. It's likely that the running reward would return to positive values if the model were trained further. RR peaked at 282 just before the 7k episode mark and ended at -45.
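
As a rough illustration of the two metrics discussed above, here is a minimal sketch. It assumes the running reward is an exponential moving average of episode rewards and that entropy is the Shannon entropy of the policy's distribution over the 9 actions. The function names and the smoothing factor of 0.05 are assumptions for illustration, not values taken from the repo.

```python
import math


def update_running_reward(running_reward, episode_reward, smoothing=0.05):
    """Exponential moving average of episode rewards (assumed formula)."""
    return smoothing * episode_reward + (1.0 - smoothing) * running_reward


def policy_entropy(probs):
    """Shannon entropy (in nats) of a categorical action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A uniform policy over 9 actions has the maximum entropy, ln(9).
uniform = [1.0 / 9] * 9

# A policy collapsed onto a single action has entropy close to zero.
collapsed = [0.992] + [0.001] * 8

print(policy_entropy(uniform))
print(policy_entropy(collapsed))
```

Under these assumptions, very low entropy means the policy puts almost all of its probability on one action, which would be consistent with a car that never turns.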

TensorBoard captures below:

Sample video below. The car never turns.
https://user-images.githubusercontent.com/1465235/113405881-d6a82800-93aa-11eb-8a8a-4e32dd27b275.mp4

ziritrion added 2 commits March 31, 2021 12:14

Start of experiment 2 with moveset #1

9d61fe3

20k runs, running reward -45

c7f4725
