MazeRL

The objective of this project is to train an agent to navigate a maze using Deep Q-Learning. This implementation includes agents that can handle either discrete or continuous actions.

(Images: greedy policy visualisations produced by the tools described below.)

Currently

Possible models the agent can use:

  • Deep Q-Learning
  • Deep Q-Learning with a Target Network
  • Double Deep Q-Learning
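The three variants above differ mainly in how the bootstrap target is computed. As a minimal sketch (not this repository's code), Double Deep Q-Learning decouples action selection from action evaluation: the online network picks the best next action, and the target network scores it, which reduces the overestimation bias of plain Deep Q-Learning. The `q_online`/`q_target` callables below are hypothetical stand-ins for the two networks.

```python
import numpy as np

def double_dqn_target(q_online, q_target, next_state, reward, done, gamma=0.99):
    """Double DQN bootstrap target: the online network selects the action,
    the target network evaluates it. q_online/q_target map a state to a
    vector of Q-values, one per discrete action."""
    best_action = int(np.argmax(q_online(next_state)))   # selection: online net
    target_value = q_target(next_state)[best_action]     # evaluation: target net
    # Terminal transitions bootstrap from zero.
    return reward + gamma * (1.0 - float(done)) * target_value
```

With a target network but no decoupling (the second variant listed), the `argmax` would also be taken over `q_target`'s output; plain Deep Q-Learning uses the online network for both steps.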

Actions:

  • For discrete actions, the agent chooses among four moves: up, down, left, and right. Each action moves the agent by a fixed stride.
  • For continuous actions, the agent only chooses the angle in which to move. The mean angle is sampled using the cross-entropy method.
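For the continuous case, the cross-entropy method can be sketched as follows (this is an illustration under assumed hyperparameters, not the repository's implementation): sample candidate angles from a Gaussian, keep the elite fraction ranked by a scoring function (here a hypothetical `score_fn`, e.g. the Q-value of moving at that angle), refit the mean and standard deviation to the elites, and repeat.

```python
import numpy as np

def cem_angle(score_fn, iterations=10, samples=50, elite_frac=0.2, rng=None):
    """Cross-entropy method over a single movement angle in [0, 2*pi).
    score_fn takes an array of angles and returns an array of scores."""
    rng = np.random.default_rng() if rng is None else rng
    mean, std = np.pi, np.pi            # broad initial distribution
    n_elite = max(1, int(samples * elite_frac))
    for _ in range(iterations):
        angles = rng.normal(mean, std, size=samples) % (2 * np.pi)
        # Keep the top-scoring elite fraction and refit the Gaussian to it.
        elites = angles[np.argsort(score_fn(angles))[-n_elite:]]
        mean, std = elites.mean(), elites.std() + 1e-6
    return mean
```

Note that averaging raw angles is only safe away from the 0/2π wrap-around; a full implementation would average angles circularly.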

Tools:

  • Greedy Policy Tool, which allows the user to visualise the current greedy policy the agent has learnt. Images of the tool can be seen above.
  • Action Visual Tool, which discretises the environment and shows the agent's preferred ordering of the discrete actions at each grid point. The most preferred action is shown in strong yellow and the least preferred in strong blue.
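The core computation behind a tool like the Action Visual Tool can be sketched as ranking the four discrete actions by Q-value at each point of a discretised grid (a hypothetical `q_values_fn` stands in for the trained network here; the colour mapping is then applied to these ranks):

```python
import numpy as np

def action_preference_grid(q_values_fn, xs, ys):
    """For each (x, y) grid point, rank the four discrete actions
    (e.g. up, down, left, right) from most to least preferred by Q-value.
    Returns an int array of shape (len(ys), len(xs), 4) of action indices."""
    ranks = np.zeros((len(ys), len(xs), 4), dtype=int)
    for i, y in enumerate(ys):
        for j, x in enumerate(xs):
            q = np.asarray(q_values_fn(x, y))
            ranks[i, j] = np.argsort(q)[::-1]  # best action first
    return ranks
```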

(Image: Action Visual Tool output.)

Going Further

In the future, I would like to introduce:

  • Policy-based methods.