Build and test deep reinforcement learning (DRL) algorithms in different environments. Each folder in the archive contains all the files needed to run the notebooks that train an agent. The goal of this repository is to implement and experiment with different algorithms in order to learn and better understand the methods.
- [x] Deep Q-Network (DQN) for LunarLander-v2
- [x] Double Deep Q-Network (DDQN) for LunarLander-v2
- [x] Dueling Deep Q-Network (Dueling DQN) for LunarLander-v2
- [x] DDQN with Prioritized Experience Replay (PER) for LunarLander-v2
- [x] Dueling DDQN with PER for LunarLander-v2
- [x] Dueling DDQN with PER and N-step returns for LunarLander-v2
- [x] Rainbow DQN (Dueling DDQN with PER, N-step returns, and Noisy Nets) for LunarLander-v2
- [x] Rainbow DQN (Dueling DDQN with PER, N-step returns, and Noisy Nets) for Unity Banana Collector
- [x] REINFORCE for CartPole-v0
- [x] Proximal Policy Optimization (PPO) for LunarLander-v2
- [ ] Proximal Policy Optimization (PPO) for Unity Crawler (Multi-Agent)
- [x] Deep Deterministic Policy Gradient (DDPG) for Unity Reacher (Multi-Agent)
- [x] Soft Actor-Critic (SAC) for Continuous LunarLander-v2
- [x] Soft Actor-Critic (SAC) for Unity Reacher (Single-Agent)
- [x] AlphaZero for Connect4 (3x3, N=3)
- [x] Multi-Agent Deep Deterministic Policy Gradient (MADDPG) for Unity Tennis
- [ ] Multi-Agent Proximal Policy Optimization (MAPPO) for Unity Soccer
- [ ] QMIX for Overcooked Environment
- [ ] MuZero for Connect4
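
To make the difference between the first few variants in the list concrete, here is a minimal NumPy sketch of the bootstrapped targets used by DQN and Double DQN, plus an N-step discounted return. It is an illustration under stated assumptions, not the code from the notebooks: the function names, argument names, and default `gamma` are hypothetical, and the Q-values are passed in as plain arrays instead of coming from a network.

```python
import numpy as np


def dqn_target(rewards, dones, q_next_target, gamma=0.99):
    """DQN target: r + gamma * max_a Q_target(s', a), zeroed on terminal steps."""
    max_next = q_next_target.max(axis=1)
    return rewards + gamma * (1.0 - dones) * max_next


def double_dqn_target(rewards, dones, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    best_actions = q_next_online.argmax(axis=1)
    batch_index = np.arange(len(best_actions))
    evaluated = q_next_target[batch_index, best_actions]
    return rewards + gamma * (1.0 - dones) * evaluated


def n_step_return(rewards, gamma=0.99):
    """Discounted sum of a short reward sequence: sum_k gamma^k * r_k."""
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * np.asarray(rewards, dtype=float)))


# Hypothetical batch of two transitions; the second one is terminal.
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])
q_next_target = np.array([[1.0, 2.0], [3.0, 4.0]])
q_next_online = np.array([[5.0, 0.0], [0.0, 1.0]])

dqn_y = dqn_target(rewards, dones, q_next_target, gamma=0.5)
ddqn_y = double_dqn_target(rewards, dones, q_next_online, q_next_target, gamma=0.5)
three_step = n_step_return([1.0, 1.0, 1.0], gamma=0.5)
```

In this toy batch the two targets differ on the first transition because the online network prefers action 0 while the target network's maximum is at action 1. Decoupling action selection from evaluation in this way is what Double DQN uses to reduce overestimation.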
- Playing Atari with Deep Reinforcement Learning
- Human-Level Control Through Deep Reinforcement Learning
- Dueling Network Architectures for Deep Reinforcement Learning
- Prioritized Experience Replay
- Rainbow: Combining Improvements in Deep Reinforcement Learning
- Proximal Policy Optimization Algorithms
- Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
- Continuous Control with Deep Reinforcement Learning
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
- Addressing Function Approximation Error in Actor-Critic Methods
- Asynchronous Methods for Deep Reinforcement Learning
- Distributed Prioritized Experience Replay
- Continuous Deep Q-Learning with Model-based Acceleration