In this repository I'll be programming the cool exercises from the book Reinforcement Learning: An Introduction by Sutton and Barto.
Solution to Exercise 4.5 of the book Reinforcement Learning by Sutton
Example output 5x5:
0 0 1 1 2 2
0 0 0 1 1 2
1 1 0 0 1 1
2 1 1 0 0 1
2 2 1 1 0 0
3 2 2 1 0 0
Solution to Exercise 4.9 of the book Reinforcement Learning by Sutton
Solution to Example 6.5 of the book Reinforcement Learning by Sutton
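For orientation, here is a minimal sketch of what a solution to 4.9 could look like, assuming the exercise is the gambler's problem programming exercise (value iteration over capital levels 1 to 99 with a goal of 100). The function name `gambler_value_iteration` and the parameters `p_h`, `goal` and `theta` are hypothetical and are not taken from this repository's code.

```python
def gambler_value_iteration(p_h=0.4, goal=100, theta=1e-10):
    """In-place value iteration for the gambler's problem (illustrative sketch).

    p_h   : probability that the coin comes up heads (the gambler wins the stake)
    goal  : capital at which the gambler wins
    theta : stop once the largest change in a sweep is below this threshold
    """
    # V[s] estimates the probability of reaching `goal` from capital s.
    V = [0.0] * (goal + 1)
    V[goal] = 1.0  # terminal winning state, stands in for the +1 reward
    while True:
        delta = 0.0
        for s in range(1, goal):
            # A stake may not exceed the capital or overshoot the goal.
            best = max(
                p_h * V[s + a] + (1 - p_h) * V[s - a]
                for a in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            return V
```

Calling `gambler_value_iteration(0.25)` or `gambler_value_iteration(0.55)` would give the value estimates for the two coin probabilities the book asks about; a greedy policy can then be read off by taking the best stake for each capital level.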
Output 4 actions:
Output 8 actions:
Output stochastic:
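The three variants above can share a single environment step function. The sketch below assumes the book's standard Windy Gridworld layout (7x10 grid, upward wind per column, start at row 3 column 0, goal at row 3 column 7) and assumes that "stochastic" means the wind strength in windy columns varies by one cell up or down with equal probability. All names (`step`, `ACTIONS_4`, `ACTIONS_8`, `WIND`) are hypothetical and not taken from this repository.

```python
import random

ROWS, COLS = 7, 10
WIND = [0, 0, 0, 1, 1, 1, 2, 2, 1, 0]  # upward push for each column
START, GOAL = (3, 0), (3, 7)           # (row, col), row 0 is the top row

ACTIONS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]                # up, down, left, right
ACTIONS_8 = ACTIONS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # plus the diagonals


def step(state, action, stochastic=False, rng=random):
    """Apply one move and return (next_state, reward, done)."""
    row, col = state
    d_row, d_col = action
    wind = WIND[col]  # wind is taken from the column the agent leaves
    if stochastic and wind > 0:
        # Assumed variant: wind is one less, unchanged, or one more.
        wind += rng.choice([-1, 0, 1])
    # Wind pushes upward, i.e. toward smaller row indices; clip to the grid.
    new_row = min(max(row + d_row - wind, 0), ROWS - 1)
    new_col = min(max(col + d_col, 0), COLS - 1)
    next_state = (new_row, new_col)
    return next_state, -1, next_state == GOAL
```

A learning loop such as Sarsa would then pick actions from `ACTIONS_4` or `ACTIONS_8` and call `step` with `stochastic=True` for the third variant.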
Solution to Example 6.6 of the book Reinforcement Learning by Sutton
Output: