So far I have worked on:
- Imitation Learning: from vanilla imitation learning to DAgger, where the agent learns from an expert's demonstrations.
- Policy Gradient: Vanilla Policy Gradient, Neural Network Baselines, Advantage Estimation and Generalized Advantage Estimation.
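
As a quick illustration of the last item, here is a minimal sketch of how Generalized Advantage Estimation can be computed for a single trajectory. It is not taken from this repository's code: the function name `gae_advantages`, the argument layout (a `values` list with one extra bootstrap entry), and the default `gamma` and `lam` are assumptions made for the example.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one trajectory (illustrative sketch).

    rewards: list of T rewards r_0 .. r_{T-1}
    values:  list of T + 1 value estimates V(s_0) .. V(s_T); the last
             entry is the bootstrap value (0.0 if the episode ended).
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the advantage of the next step.
    for t in reversed(range(len(rewards))):
        # One-step TD residual: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # GAE recursion: A_t = delta_t + gamma * lam * A_{t+1}
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```

With `lam=0` this reduces to the one-step TD residual, and with `lam=1` it becomes the discounted return minus the value baseline, which is why `lam` is usually described as trading bias against variance.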