Join our Slack channel for deeper discussions.
updated Landscape of DRL
Landscape of DRL
This project is built for people who are learning about and researching the latest deep reinforcement learning methods.
Illustrations:
Recommendations and suggestions are welcome.
- Multiagent Reinforcement Learning by Marc Lanctot RLSS @ Lille 11 July 2019
- RLDM 2019 Notes by David Abel 11 July 2019
- A Survey of Reinforcement Learning Informed by Natural Language 10 Jun 2019 arxiv
- Challenges of Real-World Reinforcement Learning 29 Apr 2019 arxiv
- Ray Interference: a Source of Plateaus in Deep Reinforcement Learning 25 Apr 2019 arxiv
- Principles of Deep RL by David Silver
- University AI's general introduction to deep RL (in Chinese)
- OpenAI's spinningup
- The Promise of Hierarchical Reinforcement Learning 9 Mar 2019
- Deep Reinforcement Learning that Matters 30 Jan 2019 arxiv
- General non-linear Bellman equations 9 July 2019 arxiv
- Monte Carlo Gradient Estimation in Machine Learning 25 Jun 2019 arxiv
- Quantifying Generalization in Reinforcement Learning 20 Dec 2018 arxiv
- S-RL Toolbox: Environments, Datasets and Evaluation Metrics for State Representation Learning 25 Sept 2018
- dopamine
- StarCraft II
- tfrl
- chainerrl
- PARL
- Recurrent Value Functions 23 May 2019 arxiv
- Stochastic Lipschitz Q-Learning 24 Apr 2019 arxiv
- TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning 8 Mar 2018
- Distributed Prioritized Experience Replay 2 Mar 2018
- Rainbow: Combining Improvements in Deep Reinforcement Learning 6 Oct 2017
- Learning from Demonstrations for Real World Reinforcement Learning 12 Apr 2017
- Dueling Network Architecture
- Double DQN
- Prioritized Experience Replay
- Deep Q-Networks
- Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces arxiv
- Policy Gradient Search: Online Planning and Expert Iteration without Search Trees 7 Apr 2019 arxiv
- Supervised Policy Update for Deep Reinforcement Learning 24 Dec 2018 arxiv
- PPO-CMA: Proximal Policy Optimization with Covariance Matrix Adaptation 5 Oct 2018 arxiv
- Clipped Action Policy Gradient 22 June 2018
- Expected Policy Gradients for Reinforcement Learning 10 Jan 2018
- Proximal Policy Optimization Algorithms 20 July 2017
- Emergence of Locomotion Behaviours in Rich Environments 7 July 2017
- Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning 1 Jun 2017
- Equivalence Between Policy Gradients and Soft Q-Learning
- Trust Region Policy Optimization
- Reinforcement Learning with Deep Energy-Based Policies
- Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic
- Self-Supervised Exploration via Disagreement 10 Jun 2019 arxiv
- Approximate Exploration through State Abstraction 24 Jan 2019
- The Uncertainty Bellman Equation and Exploration 15 Sep 2017
- Noisy Networks for Exploration 30 Jun 2017 implementation
- Count-Based Exploration in Feature Space for Reinforcement Learning 25 Jun 2017
- Count-Based Exploration with Neural Density Models 14 Jun 2017
- UCB and InfoGain Exploration via Q-Ensembles 11 Jun 2017
- Minimax Regret Bounds for Reinforcement Learning 16 Mar 2017
- Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
- EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
- Generalized Off-Policy Actor-Critic 27 Mar 2019
- Soft Actor-Critic Algorithms and Applications 29 Jan 2019
- The Reactor: A Sample-Efficient Actor-Critic Architecture 15 Apr 2017
- Sample Efficient Actor-Critic with Experience Replay
- Reinforcement Learning with Unsupervised Auxiliary Tasks
- Continuous control with deep reinforcement learning
- When to use parametric models in reinforcement learning? 12 Jun 2019 arxiv
- Model Based Reinforcement Learning for Atari 5 Mar 2019
- Model-Based Stabilisation of Deep Reinforcement Learning 6 Sep 2018
- Learning model-based planning from scratch 19 July 2017
- Variational Option Discovery Algorithms 26 July 2018
- A Laplacian Framework for Option Discovery in Reinforcement Learning 16 Jun 2017
- Robust Imitation of Diverse Behaviors
- Learning human behaviors from motion capture by adversarial imitation
- Connecting Generative Adversarial Networks and Actor-Critic Methods
- Bridging the Gap Between Value and Policy Based Reinforcement Learning
- Policy gradient and Q-learning
- End-to-End Robotic Reinforcement Learning without Reward Engineering 16 Apr 2019 arxiv
- Reinforcement Learning with Corrupted Reward Channel 23 May 2017
- Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination 18 Jun 2019 arxiv
- A Regularized Opponent Model with Maximum Entropy Objective 17 May 2019 arxiv
- Deep Q-Learning for Nash Equilibria: Nash-DQN 23 Apr 2019 arxiv
- Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning 4 Nov 2018
- Intrinsic Social Motivation via Causal Influence in Multi-Agent RL 19 Oct 2018
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning 30 Mar 2018
- Modeling Others using Oneself in Multi-Agent Reinforcement Learning 26 Feb 2018
- The Mechanics of n-Player Differentiable Games 15 Feb 2018
- Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments 10 Oct 2017
- Learning with Opponent-Learning Awareness 13 Sep 2017
- Counterfactual Multi-Agent Policy Gradients
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments 7 Jun 2017
- Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games 29 Mar 2017
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures 9 Feb 2018
- Reverse Curriculum Generation for Reinforcement Learning
- Trial without Error: Towards Safe Reinforcement Learning via Human Intervention
- Learning to Design Games: Strategic Environments in Deep Reinforcement Learning 5 July 2017
- Kickstarting Deep Reinforcement Learning 10 Mar 2018
- Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning 7 Nov 2017
- Distral: Robust Multitask Reinforcement Learning 13 July 2017
- Observational Learning by Reinforcement Learning 20 Jun 2017
- Meta-learning of Sequential Strategies 8 May 2019 arxiv
- Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables 19 Mar 2019 arxiv
- ProMP: Proximal Meta-Policy Search 16 Oct 2018 arxiv
- Unsupervised Meta-Learning for Reinforcement Learning 12 Jun 2018
- GAN Q-learning 20 July 2018
- Implicit Quantile Networks for Distributional Reinforcement Learning 14 Jun 2018
- Nonlinear Distributional Gradient Temporal-Difference Learning 20 May 2018
- Distributed Distributional Deterministic Policy Gradients 23 Apr 2018
- An Analysis of Categorical Distributional Reinforcement Learning 22 Feb 2018
- Distributional Reinforcement Learning with Quantile Regression 27 Oct 2017
- A Distributional Perspective on Reinforcement Learning 21 July 2017
- Robust Reinforcement Learning for Continuous Control with Model Misspecification 18 Jun 2019 arxiv
- Verifiable Reinforcement Learning via Policy Extraction 22 May 2018 arxiv