diff --git a/reinforcement_learning/README.md b/reinforcement_learning/README.md
index beeb9f7..34b2fc2 100644
--- a/reinforcement_learning/README.md
+++ b/reinforcement_learning/README.md
@@ -2,7 +2,7 @@
 | Paper | Notes | Author| Start Date | End Date |
 |:-----:|:-----:|:-----:|:----------:|:--------:|
-| [Off-Policy Actor-Critic](https://arxiv.org/abs/1205.4839) (ICML '12) | [HackMD](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/BkcB-xwvI/edit) | [Sharath](https://sharathraparthy.github.io/) | 06/04/2020 | 06/05/2020 |
+| [Off-Policy Actor-Critic](https://arxiv.org/abs/1205.4839) (ICML '12) | [HackMD](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/BkcB-xwvI) | [Sharath](https://sharathraparthy.github.io/) | 06/04/2020 | 06/05/2020 |
 | [Combining Physical Simulators and Object-Based Networks for Control](https://arxiv.org/pdf/1904.06580.pdf) (ICRA '19) | [HackMD](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/Sy6GPG9MB) | [Sharath](https://sharathraparthy.github.io/)| 06/04/2020 | 07/04/2020 |
 | [Learning Agile and Dynamic Motor Skills for Legged Robots](https://arxiv.org/abs/1901.08652)| [HackMD](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/ByzYzEhVS) |[Sharath](https://sharathraparthy.github.io/) | 06/04/2020 | 07/04/2020 |
 | [PAC-Bounds-for-Multi-armed-Bandit](https://link.springer.com/chapter/10.1007/3-540-45435-7_18) | [HackMD](https://hackmd.io/saK7DdqCRnyBfN3HykLhlA) | [Raj Ghugare](https://github.com/RajGhugare19) | 05/04/2020 | 09/04/2020 |