This project explores decision-making algorithms in reinforcement learning. It includes implementations of several environment and algorithm pairings: Atari games with Deep Q-Networks (DQN), HalfCheetah with TD3 and PPO, and maze environments modeled as Markov Decision Processes (MDPs) and solved with policy iteration and value iteration.
- Atari + DQN: Implementation of a Deep Q-Network (DQN) for Atari environments.
- HalfCheetah + TD3, PPO: Implementations of Twin Delayed Deep Deterministic Policy Gradient (TD3) and Proximal Policy Optimization (PPO) for the HalfCheetah environment.
- MDP, Maze Environment: A maze environment formulated as a Markov Decision Process (MDP).
- Value Iteration + Policy Iteration: Demonstrates Value Iteration and Policy Iteration algorithms for MDPs.
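
To give a feel for the last two items, here is a minimal sketch of value iteration and policy iteration on a tiny deterministic grid maze. It is not the code from this repository: the maze layout, the reward of -1 per step, the discount factor, and every name (`MAZE`, `step`, `value_iteration`, `policy_iteration`) are assumptions made for illustration.

```python
# Hypothetical 3x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9          # assumed discount factor
STEP_REWARD = -1.0   # assumed cost for every move

# All non-wall cells are states.
STATES = [(r, c) for r, row in enumerate(MAZE)
          for c, ch in enumerate(row) if ch != "#"]


def is_goal(state):
    return MAZE[state[0]][state[1]] == "G"


def step(state, action):
    """Deterministic transition: move unless blocked by a wall or the border."""
    nr, nc = state[0] + action[0], state[1] + action[1]
    if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
        return (nr, nc)
    return state


def q_value(V, state, action):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    return STEP_REWARD + GAMMA * V[step(state, action)]


def value_iteration(tol=1e-10):
    """Repeatedly apply the Bellman optimality backup until values stop changing."""
    V = {s: 0.0 for s in STATES}  # the goal is terminal and keeps value 0
    while True:
        delta = 0.0
        for s in STATES:
            if is_goal(s):
                continue
            best = max(q_value(V, s, a) for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V


def policy_iteration(tol=1e-10):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    policy = {s: ACTIONS[0] for s in STATES if not is_goal(s)}
    while True:
        # Policy evaluation: iterate the Bellman expectation backup for the fixed policy.
        V = {s: 0.0 for s in STATES}
        while True:
            delta = 0.0
            for s, a in policy.items():
                new_v = q_value(V, s, a)
                delta = max(delta, abs(new_v - V[s]))
                V[s] = new_v
            if delta < tol:
                break
        # Policy improvement: switch only when another action is strictly better.
        stable = True
        for s in policy:
            best = max(ACTIONS, key=lambda a: q_value(V, s, a))
            if q_value(V, s, best) > q_value(V, s, policy[s]) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return V, policy


if __name__ == "__main__":
    v_vi = value_iteration()
    v_pi, pi = policy_iteration()
    print("V(start) from value iteration: ", v_vi[(0, 0)])
    print("V(start) from policy iteration:", v_pi[(0, 0)])
```

Both functions should arrive at the same optimal values for this maze. Value iteration folds the maximization into every backup, while policy iteration fully evaluates one policy before improving it, which is the main design difference the two algorithms illustrate.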