Reinforcement-Learning-Intro

mdp_dp_solver.py

Model-based:

Markov Decision Process Model, Policy Iteration, Policy Improvement, Value Iteration Algorithm, and Maze MDP Example
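
As a rough illustration of the value iteration idea listed above, here is a minimal sketch. It is not taken from `mdp_dp_solver.py`: the function name, the array layout (`P` with shape actions x states x states, `R` with shape states x actions), and the 3-state corridor standing in for the maze are all assumptions made for this example.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Illustrative value iteration (hypothetical signature).

    P: array (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    R: array (S, A), expected immediate reward for taking action a in state s.
    Returns the state values and a greedy policy (one action index per state).
    """

    def q_from(V):
        # Q[s, a] = R[s, a] + gamma * sum_t P[a, s, t] * V[t]
        return R + gamma * np.einsum("ast,t->sa", P, V)

    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        V_new = q_from(V).max(axis=1)  # Bellman optimality backup
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, q_from(V).argmax(axis=1)


# Assumed toy model: states 0, 1, 2 in a row; state 2 is absorbing.
# Action 0 moves left, action 1 moves right; entering state 2 pays 1.
P = np.zeros((2, 3, 3))
P[0, 0, 0] = P[0, 1, 0] = P[0, 2, 2] = 1.0  # left
P[1, 0, 1] = P[1, 1, 2] = P[1, 2, 2] = 1.0  # right
R = np.zeros((3, 2))
R[1, 1] = 1.0

V, policy = value_iteration(P, R)
print(V, policy)
```

With `gamma=0.9` the values of states 0, 1, and 2 in this toy model should come out as 0.9, 1.0, and 0, and the greedy action in states 0 and 1 is "right".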

monte_carlo.py

Model-free:

Monte Carlo method, epsilon-greedy policy exploration, and on-policy and off-policy learning
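
The sketch below shows one way on-policy, first-visit Monte Carlo control with epsilon-greedy exploration can look. It is not the code in `monte_carlo.py`: the `step` environment (a 3-state corridor), the function names, and the hyperparameters are assumptions chosen to keep the example self-contained.

```python
import random
from collections import defaultdict

N_ACTIONS = 2  # 0 = left, 1 = right
GOAL = 2       # terminal state of the assumed corridor


def step(state, action):
    """Hypothetical environment: reaching state 2 pays 1 and ends the episode."""
    next_state = max(state - 1, 0) if action == 0 else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def epsilon_greedy(q_values, epsilon, rng):
    """Random action with probability epsilon, else a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def mc_control(n_episodes=500, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(lambda: [0.0] * N_ACTIONS)
    counts = defaultdict(lambda: [0] * N_ACTIONS)
    for _ in range(n_episodes):
        # Generate one episode with the current epsilon-greedy policy.
        state, done, episode = 0, False, []
        while not done:
            action = epsilon_greedy(Q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            episode.append((state, action, reward))
            state = next_state
        # Record the first time step at which each (state, action) pair occurs.
        first_visit = {}
        for t, (s, a, _) in enumerate(episode):
            first_visit.setdefault((s, a), t)
        # Walk backwards, accumulating the discounted return G.
        G = 0.0
        for t in reversed(range(len(episode))):
            s, a, r = episode[t]
            G = r + gamma * G
            if first_visit[(s, a)] == t:
                counts[s][a] += 1
                Q[s][a] += (G - Q[s][a]) / counts[s][a]  # incremental mean
    return Q


Q = mc_control()
print({s: Q[s] for s in sorted(Q)})
```

Because the same epsilon-greedy policy both generates the episodes and is improved, this is the on-policy variant. An off-policy variant would sample episodes from a separate behaviour policy and reweight the returns, for example with importance sampling.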

temporal_difference.py

Model-free:

Temporal-difference policy evaluation, SARSA with greedy policy exploration, Q-learning, and SARSA(equation)
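
To make the difference between SARSA and Q-learning concrete, here is a small sketch of the two one-step update rules. The function names, the list-of-lists Q-table, and the step size are assumptions for illustration and are not taken from `temporal_difference.py`.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9, done=False):
    """On-policy TD update: bootstrap from the action actually chosen next."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9, done=False):
    """Off-policy TD update: bootstrap from the best action in the next state."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])


# Same transition (s=0, a=1, r=0, s_next=1) applied to two copies of a toy table.
Q_sarsa = [[0.0, 0.0], [0.5, 1.0]]
Q_qlearn = [[0.0, 0.0], [0.5, 1.0]]

sarsa_update(Q_sarsa, s=0, a=1, r=0.0, s_next=1, a_next=0)  # next action is the worse one
q_learning_update(Q_qlearn, s=0, a=1, r=0.0, s_next=1)

print(Q_sarsa[0][1], Q_qlearn[0][1])
```

In this example SARSA moves `Q[0][1]` toward 0.9 * 0.5 because the exploratory next action has value 0.5, while Q-learning moves it toward 0.9 * 1.0 regardless of which action is taken next, so the two results are roughly 0.225 and 0.45.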