From 53f07bef33efd08a07ef7d9b621fcc8e63f93baf Mon Sep 17 00:00:00 2001
From: amsks
Date: Thu, 9 Nov 2023 17:15:58 +0100
Subject: [PATCH] updated the link to Four Rooms environment

---
 rl_exercises/week_4/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rl_exercises/week_4/README.md b/rl_exercises/week_4/README.md
index 57c6cd7..cd2f91f 100644
--- a/rl_exercises/week_4/README.md
+++ b/rl_exercises/week_4/README.md
@@ -19,4 +19,4 @@ Use the [Hydra SMAC sweeper](https://github.com/automl/hydra-smac-sweeper.git) t
 ## Level 3
 ### Implementing TD($\lambda$)
-In the same format as the SARSA code, implement the TD($\lambda(n)$) algorithm on the [Gridcore environment](https://github.com/automl/TabularTempoRL/blob/master/grid_envs.py). Make $n$ a configurable parameter signifying the number of lookahead steps. Try to ablate the peformance for multiple values of $n$ and verify the theoretical claims in the lecture.
+In the same format as the SARSA code, implement the TD($\lambda(n)$) algorithm on etiher the [Gridcore environment](https://github.com/automl/TabularTempoRL/blob/master/grid_envs.py), or the [Four Rooms environment](https://github.com/Farama-Foundation/Minigrid/blob/master/minigrid/envs/fourrooms.py). Make $n$ a configurable parameter signifying the number of lookahead steps. Try to ablate the peformance for multiple values of $n$ and verify the theoretical claims in the lecture.
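
Reviewer note: to illustrate what "make $n$ a configurable parameter signifying the number of lookahead steps" could look like, here is a minimal sketch of tabular n-step TD value prediction. It is not the exercise solution and does not use the Gridcore or Four Rooms environments; the toy chain environment `chain_step`, the function `n_step_td`, and all hyperparameter defaults are hypothetical names and values chosen only for this example.

```python
from typing import List, Tuple


def chain_step(state: int, n_states: int = 5) -> Tuple[int, float, bool]:
    """Hypothetical toy environment: always move one state to the right.

    The last state is terminal; entering it yields reward 1.0.
    """
    next_state = state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def n_step_td(n: int, episodes: int = 200, alpha: float = 0.5,
              gamma: float = 0.9, n_states: int = 5) -> List[float]:
    """Estimate state values with n-step TD; n is the lookahead length."""
    V = [0.0] * n_states
    for _ in range(episodes):
        states = [0]
        rewards = [0.0]  # rewards[t] is the reward received on entering states[t]
        T = float("inf")  # episode length, unknown until termination
        t = 0
        while True:
            if t < T:
                s_next, r, done = chain_step(states[t], n_states)
                states.append(s_next)
                rewards.append(r)
                if done:
                    T = t + 1
            tau = t - n + 1  # time step whose value estimate is updated now
            if tau >= 0:
                end = min(tau + n, T)
                # Discounted sum of up to n observed rewards after time tau.
                G = sum(gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, end + 1))
                # Bootstrap from the current estimate if the episode continues.
                if tau + n < T:
                    G += gamma ** n * V[states[tau + n]]
                V[states[tau]] += alpha * (G - V[states[tau]])
            if tau == T - 1:
                break
            t += 1
    return V


if __name__ == "__main__":
    # Sweep over several lookahead lengths, as the exercise suggests.
    for n in (1, 2, 4):
        print(n, [round(v, 3) for v in n_step_td(n)])
```

Because the toy chain is deterministic, every choice of `n` should approach the same values here; differences between lookahead lengths only become visible on stochastic or larger environments such as the ones linked in the README.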