Question about the `compute_lambda_values` function #302

zichunxx · 2024-06-21T13:00:05Z

Hi! Thanks for your work on reimplementing DreamerV1 in a simple way.

I am trying to understand how DreamerV1 computes its value targets, but I am confused by the logic of the `compute_lambda_values` function:

sheeprl/sheeprl/algos/dreamer_v1/utils.py

Lines 66 to 77 in dee8c80

    
    last_values = torch.clone(last_values)
    last_lambda_values = 0
    lambda_targets = []
    for step in reversed(range(horizon - 1)):
        if step == horizon - 2:
            next_values = last_values
        else:
            next_values = values[step + 1] * (1 - lmbda)
        delta = rewards[step] + next_values * done_mask[step]
        last_lambda_values = delta + lmbda * done_mask[step] * last_lambda_values
        lambda_targets.append(last_lambda_values)
    return torch.stack(list(reversed(lambda_targets)), dim=0)

Does the above snippet implement Eq. 6 of the original paper? That is,

$$V_\lambda(s_\tau) = (1 - \lambda) \sum_{n=1}^{H-1} \lambda^{n-1} V_N^n(s_\tau) + \lambda^{H-1} V_N^H(s_\tau)$$

I could not find anything in common between them.

If so, what does `delta` mean? Is `delta` the TD target?
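One way to check the connection is to compare the backward recursion with the explicit weighted sum of n-step returns on a toy trajectory. The sketch below is plain Python rather than the sheeprl code: the function names, the scalar `discount` (standing in for `done_mask[step]`, which is an assumption about how that tensor is used), and the toy numbers are all hypothetical. Read this way, `delta` is the reward plus the discounted next value weighted by `1 - lmbda`, so it only equals a one-step TD target at the final step, where the full bootstrap value is used.

```python
def n_step_return(rewards, values, discount, t, n):
    """n discounted rewards starting at step t, then a bootstrap from values[t + n]."""
    partial = sum(discount**k * rewards[t + k] for k in range(n))
    return partial + discount**n * values[t + n]


def lambda_target_explicit(rewards, values, discount, lmbda, t):
    """Weighted sum of n-step returns, truncated at the end of the trajectory."""
    m = len(rewards) - t  # number of steps remaining from t
    weighted = sum(
        lmbda ** (n - 1) * n_step_return(rewards, values, discount, t, n)
        for n in range(1, m)
    )
    return (1 - lmbda) * weighted + lmbda ** (m - 1) * n_step_return(
        rewards, values, discount, t, m
    )


def lambda_targets_recursive(rewards, values, discount, lmbda):
    """Backward recursion with the same loop structure as the quoted snippet.

    rewards has T entries and values has T + 1; values[T] plays the role
    that last_values plays in the snippet (an assumption of this sketch).
    """
    T = len(rewards)
    targets = [0.0] * T
    carry = 0.0  # plays the role of last_lambda_values
    for t in reversed(range(T)):
        if t == T - 1:
            next_value = values[T]  # full bootstrap at the final step
        else:
            next_value = values[t + 1] * (1 - lmbda)
        delta = rewards[t] + discount * next_value
        carry = delta + lmbda * discount * carry
        targets[t] = carry
    return targets


# Toy data, chosen arbitrarily for illustration.
rewards = [1.0, 0.5, -0.2, 2.0]
values = [0.3, 0.8, 1.1, 0.4, 0.9]
recursive = lambda_targets_recursive(rewards, values, discount=0.99, lmbda=0.95)
explicit = [
    lambda_target_explicit(rewards, values, 0.99, 0.95, t)
    for t in range(len(rewards))
]
print(recursive)
print(explicit)
```

Both lists should agree up to floating-point error, which suggests the loop is a recursive way of evaluating the same weighted sum rather than a different target.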

I'm new to the Dreamer series. Please forgive me if my question looks dumb to you. Thanks.


zichunxx · 2024-06-21T14:27:08Z

Update:

I think I have found the answer in Eq. 4 of the DreamerV2 paper.
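For reference, the recursion in that equation can be written as follows. The notation is a transcription and may differ slightly from the paper's symbols, so treat it as a sketch:

$$V^\lambda_t \doteq \hat{r}_t + \hat{\gamma}_t \begin{cases} (1 - \lambda)\, v_\xi(\hat{z}_{t+1}) + \lambda V^\lambda_{t+1} & \text{if } t < H, \\ v_\xi(\hat{z}_H) & \text{if } t = H. \end{cases}$$

Under this reading, `rewards[step]` corresponds to $\hat{r}_t$, `done_mask[step]` to $\hat{\gamma}_t$, `values[step + 1]` to $v_\xi(\hat{z}_{t+1})$, `last_lambda_values` to $V^\lambda_{t+1}$, and `last_values` to the bootstrap value $v_\xi(\hat{z}_H)$. That mapping is inferred from the structure of the snippet, not confirmed by the maintainers. With it, `delta` collects the terms $\hat{r}_t + \hat{\gamma}_t (1 - \lambda)\, v_\xi(\hat{z}_{t+1})$, and the following line adds $\hat{\gamma}_t \lambda V^\lambda_{t+1}$.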
