
Reinforcement Learning Project Decision Making in CarRacing Game using PPO and DAGGER


seanxu889/CS5180_RL


CS5180 Reinforcement Learning

Image-based Path Planning Method in CarRacing Game

Reinforcement Learning Project: Sequential Decision Making in CarRacing Game using Proximal Policy Optimization and Dataset Aggregation

Proximal Policy Optimization (PPO):

During the 2000-episode training process, we saved several models and evaluated the agent’s policy by observing how well the agent performed in the gym environment.
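The checkpoint evaluation described above can be sketched roughly as follows. This is only an illustration, not the project's code: `ToyTrackEnv`, `evaluate`, and the two stand-in policies are hypothetical names, and the toy environment merely imitates the classic Gym `reset`/`step` interface so the snippet runs without extra dependencies.

```python
import random
from statistics import mean


class ToyTrackEnv:
    """Hypothetical stand-in for a Gym-style environment (classic 4-tuple step API).

    The observation is a target steering value; the reward is highest when the
    chosen action matches it. This is NOT the CarRacing environment.
    """

    def __init__(self, horizon=50, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.target = 0.0

    def reset(self):
        self.t = 0
        self.target = self.rng.uniform(-1.0, 1.0)
        return self.target

    def step(self, action):
        reward = 1.0 - abs(action - self.target)
        self.t += 1
        self.target = self.rng.uniform(-1.0, 1.0)
        done = self.t >= self.horizon
        return self.target, reward, done, {}


def evaluate(policy, env, n_episodes=5):
    """Run the policy for a few episodes and return the mean episode return."""
    returns = []
    for _ in range(n_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done, _ = env.step(policy(obs))
            total += reward
        returns.append(total)
    return mean(returns)


# Hypothetical stand-ins for saved models: an untrained policy that steers
# randomly and a well-trained one that follows the observation.
_noise = random.Random(1)
checkpoints = {
    "early": lambda obs: _noise.uniform(-1.0, 1.0),
    "late": lambda obs: obs,
}

scores = {name: evaluate(policy, ToyTrackEnv(seed=0))
          for name, policy in checkpoints.items()}
for name, score in scores.items():
    print(name, round(score, 1))
```

In a real setup, each entry in `checkpoints` would be a model loaded from disk, and the environment would be the CarRacing Gym environment with rendering enabled so the driving behaviour can be watched.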

At the beginning of the training process (episode 90), the agent had no idea how to drive on the track. It drove slowly and steered randomly (the green bar at the bottom shows the steering angle). After around 1400 episodes of training, we could observe that the agent had gained some knowledge of how to take actions. It was able to pass some of the simple turns with relatively appropriate steering, gas, and braking values. On U-turns and S-turns, however, it still sometimes took the wrong actions and left the track. There were also times when the car cut straight across a turn through the green area.
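For reference, the three control values mentioned above can be pictured as one action vector. The sketch below assumes the continuous action layout of Gym's CarRacing environment (steering in [-1, 1], gas in [0, 1], brake in [0, 1]); the helper name `to_car_action` is hypothetical and not taken from the project.

```python
def to_car_action(raw):
    """Clip a raw (steer, gas, brake) triple into the assumed valid ranges."""
    steer, gas, brake = raw

    def clip(value, low, high):
        return max(low, min(high, value))

    # Assumed layout: steer in [-1, 1], gas in [0, 1], brake in [0, 1].
    return (clip(steer, -1.0, 1.0), clip(gas, 0.0, 1.0), clip(brake, 0.0, 1.0))


# An over-steering raw output with a negative gas value gets clipped.
action = to_car_action((1.7, -0.2, 0.4))
print(action)
```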

After 2000 episodes of training, the agent drove faster and more smoothly than before. It is exciting to see that the agent had learnt to pass most of the turns with relatively high precision and efficiency, including right-angle turns, U-turns, and combined turns.

image

However, the agent sometimes took turns too fast, skidded badly, and almost left the track, although it got back on the track by adjusting the steering and braking. We can observe the car skidding through a full circle in the U-turn because it enters the turn at a very high speed without enough braking. One possible reason is that the learnt policy is stochastic: actions are sampled from a distribution, so the agent can occasionally choose an inappropriate action.
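To illustrate why a stochastic policy can occasionally pick an inappropriate action, here is a minimal sketch that compares sampling from a policy distribution with simply taking its mean. The Gaussian form, the numbers, and the `select_action` helper are assumptions for illustration; the distribution actually used by the project may differ.

```python
import random


def select_action(mean, std, deterministic=False, rng=random):
    """Pick a steering command in [-1, 1] from an assumed Gaussian policy output."""
    if deterministic:
        action = mean                  # always take the most likely steering value
    else:
        action = rng.gauss(mean, std)  # sample, as a stochastic policy does
    return max(-1.0, min(1.0, action))


rng = random.Random(0)
sampled = [select_action(0.2, 0.5, rng=rng) for _ in range(1000)]
greedy = [select_action(0.2, 0.5, deterministic=True) for _ in range(1000)]

# Sampled actions spread widely around the mean; deterministic ones do not.
print("sampled range:", round(min(sampled), 2), "to", round(max(sampled), 2))
print("greedy range: ", min(greedy), "to", max(greedy))
```

Taking the mean at evaluation time is a common way to reduce this kind of variance, at the cost of no longer exploring.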

image

DAGGER:

image
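As a rough illustration of the Dataset Aggregation idea (roll out the current learner, have an expert label the states it visits, add them to the dataset, and retrain), here is a minimal sketch on a one-dimensional toy problem. Everything in it is assumed for illustration: the `expert` rule, the toy dynamics in `rollout`, and the linear learner are hypothetical and unrelated to the project's image-based setup.

```python
import random


def clip(value, low=-1.0, high=1.0):
    return max(low, min(high, value))


def expert(x):
    """Hypothetical expert: steer back toward the track centre (x = 0)."""
    return clip(-2.0 * x)


def rollout(policy, rng, steps=30):
    """Run a policy in the toy dynamics and return the states it visited."""
    x, visited = rng.uniform(-1.0, 1.0), []
    for _ in range(steps):
        visited.append(x)
        x = x + 0.3 * policy(x) + rng.gauss(0.0, 0.05)
    return visited


def fit_linear(dataset):
    """Least-squares fit of action = w * x on (state, expert action) pairs."""
    num = sum(x * a for x, a in dataset)
    den = sum(x * x for x, _ in dataset)
    return num / den if den > 0 else 0.0


def dagger(n_iters=5, seed=0):
    rng = random.Random(seed)
    # Start from expert demonstrations only (plain behaviour cloning).
    dataset = [(x, expert(x)) for x in rollout(expert, rng)]
    w = fit_linear(dataset)
    for _ in range(n_iters):
        # Roll out the current learner, not the expert.
        states = rollout(lambda x, w=w: clip(w * x), rng)
        # Ask the expert what it would have done in those states, then aggregate.
        dataset += [(x, expert(x)) for x in states]
        # Retrain on the whole aggregated dataset.
        w = fit_linear(dataset)
    return w, len(dataset)


w, n_samples = dagger()
print("learned slope:", round(w, 3), "dataset size:", n_samples)
```

The key design point is that the training states come from the learner's own rollouts, so the dataset covers the mistakes the learner actually makes rather than only the states an expert would visit.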
