My implementations of some common RL algorithms.

Algorithm | Source Code | Reference |
---|---|---|
Multi-armed bandit Greedy Policy | code | DeepMind x UCL RL Lecture Series - Exploration & Control |
Multi-armed bandit Epsilon Greedy Policy | code | DeepMind x UCL RL Lecture Series - Exploration & Control |

- DeepMind x UCL RL Lecture Series: YouTube.
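
For orientation, here is a minimal sketch of the two bandit policies listed in the table. It is not the code from this repository: the function name `epsilon_greedy_bandit`, the Gaussian unit-variance rewards, the sample-average value updates, and the random tie-breaking are all assumptions made for illustration. Setting `epsilon=0.0` gives the purely greedy policy.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon, steps, seed=0):
    """Run an epsilon-greedy agent on a Gaussian multi-armed bandit.

    Hypothetical helper for illustration only.
    true_means: mean reward of each arm (assumed unit-variance Gaussian).
    epsilon:    probability of picking a uniformly random arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    q = [0.0] * n_arms      # sample-average value estimate per arm
    counts = [0] * n_arms   # number of pulls per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick an arm with the highest estimate,
            # breaking ties at random.
            best = max(q)
            action = rng.choice([a for a in range(n_arms) if q[a] == best])

        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-average update: Q <- Q + (R - Q) / N
        q[action] += (reward - q[action]) / counts[action]
        total_reward += reward

    return q, counts, total_reward


if __name__ == "__main__":
    means = [0.0, 0.5, 2.0]
    for eps in (0.0, 0.1):
        q, counts, total = epsilon_greedy_bandit(means, eps, steps=5000)
        print(f"epsilon={eps}: pulls per arm={counts}, total reward={total:.1f}")
```

The greedy agent can lock onto a suboptimal arm if its early estimates are misleading, while a small `epsilon` keeps sampling every arm so the estimates continue to improve.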