Training an agent using tabular RL methods, namely TD learning, on gym-minigrid
gym-minigrid - https://github.com/maximecb/gym-minigrid
The agent has been trained using 3 algorithms:
- SARSA(0)
- SARSA(λ)
- Q-Learning

The agent was trained on 4 environments:
- 5x5 Empty Room
- 6x6 Empty Room
- 8x8 Empty Room
- Four Rooms
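As an illustrative sketch (not the repository's actual code), the core tabular TD updates behind these algorithms can be written as follows. The state/action encodings and the hyperparameter values are assumptions for demonstration; SARSA(λ) additionally maintains eligibility traces on top of the SARSA(0) rule shown here.

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s2, a2, alpha=0.5, gamma=0.9):
    """SARSA(0): bootstrap from the action actually taken in the next state."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])

def q_learning_update(Q, s, a, r, s2, actions, alpha=0.5, gamma=0.9):
    """Q-learning: bootstrap from the greedy action in the next state."""
    best_next = max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Tiny demonstration on a table that defaults to zero.
Q = defaultdict(float)
sarsa_update(Q, s=0, a=1, r=1.0, s2=2, a2=0)
print(Q[(0, 1)])  # 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5
```

The only difference between the two rules is the bootstrap target: SARSA uses the action the policy actually took, Q-learning uses the maximum over next actions, which is what makes it off-policy.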
For each run, plots of reward per episode and steps per episode were generated.
The agent controls the movement of a character in a grid world. Some tiles of the grid are walkable, while others cause the agent to fall into the water. Additionally, the agent's movement is stochastic and only partially depends on the chosen direction. The agent is rewarded for finding a walkable path to a goal tile.
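Because transitions are stochastic, such an agent typically balances exploration and exploitation with an ε-greedy policy. This is a generic sketch rather than the repository's implementation; the function and parameter names are assumptions.

```python
import random

def epsilon_greedy(Q, state, actions, epsilon=0.1, rng=random):
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    # Unseen (state, action) pairs default to 0; ties resolve to the first best.
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

Q = {(0, 0): 0.2, (0, 1): 0.7}
print(epsilon_greedy(Q, state=0, actions=[0, 1], epsilon=0.0))  # greedy -> 1
```

With epsilon=0 the choice is purely greedy; annealing epsilon toward 0 over episodes is a common schedule for tabular methods.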
Gym is a toolkit for developing and comparing reinforcement learning algorithms. It makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano.
The gym library is a collection of test problems — environments — that you can use to work out your reinforcement learning algorithms. These environments have a shared interface, allowing you to write general algorithms.
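That shared interface is essentially `reset()` and `step(action)`. The stub environment below is a minimal sketch of the classic Gym API shape (4-tuple `step` return), not a real environment; in an actual run it would be replaced by something like `gym.make('MiniGrid-Empty-5x5-v0')`, and the class and parameter names here are assumptions.

```python
class StubEnv:
    """Minimal environment exposing the classic Gym interface:
    reset() -> observation, step(action) -> (obs, reward, done, info)."""
    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        reward = 1.0 if done else 0.0  # reward only on reaching the goal
        return self.t, reward, done, {}

env = StubEnv()
obs = env.reset()
total_reward, done = 0.0, False
while not done:
    obs, reward, done, info = env.step(0)  # the agent's policy goes here
    total_reward += reward
print(total_reward)  # 1.0
```

Because every Gym environment follows this loop, the same SARSA or Q-learning driver code can run unchanged across all four environments listed above.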