Q-value iteration and on-policy vs. off-policy learning: the SARSA and Q-learning algorithms in the Stochastic Windy Grid environment

Anca-Mt/TabularRL-StochasticWindyGridWorld
