Commit
fix several problems related to history summarization
Summary: This diff fixes several issues related to history summarization.

1) Contextual bandit algorithms previously did not update the parameters of the history summarization module. This diff makes it possible to update these parameters in the neural_bandit and neural_linear_bandit algorithms. The other contextual bandit algorithms are tabular or linear and are not currently based on nn.Module, so I assume they will not use an nn.Module-based history summarization module.

2) Actor-critic methods now have a separate optimizer for the history summarization module. Previously the history summarization module shared the actor's optimizer, but there seems to be no reason for that choice. In addition, the TD3 update rule for the history summarization parameters (self._actor_optimizer.step()) is problematic: it feeds a zero gradient into the momentum and other running statistics kept by optimizers such as Adam and RMSprop, when the zero gradient should have been ignored. I also considered a separate history summarization optimizer for value-based methods, but I don't think it is needed. There the history summarization module can share the optimizer of the value function.

3) set_history_summarization_module is now an abstract method of the policy learner. Every policy learner must implement it, so it cannot be overlooked the way it was for the contextual bandit algorithms.

4) benchmark.py did not handle StackingHistorySummarizationModule correctly. This diff fixes that.

5) I also considered whether the history summarization module should have a target network. I had assumed this would be straightforward, but after more thought, including a target network is actually tricky, and it is not clear to me that it is worth including.

Reviewed By: rodrigodesalvobraz
Differential Revision: D65760816
fbshipit-source-id: cc338d418015622f17a0ef504fe3e5d401aeda22
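The zero-gradient concern in item 2 can be illustrated with a minimal sketch. This is a hand-written scalar version of the standard Adam update, not Pearl's code and not PyTorch's implementation; adam_step, run, and the hyperparameter values are hypothetical names and choices made only for this illustration. It compares stepping the optimizer with a zero gradient against skipping the step entirely.

```python
import math


def adam_step(state, grad, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one scalar Adam update in place (illustrative sketch only)."""
    state["t"] += 1
    # The running moments are updated even when grad == 0.0.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    state["param"] -= lr * m_hat / (math.sqrt(v_hat) + eps)


def run(skip_zero_grad_step):
    """Take one real step, then either step with a zero gradient or skip."""
    state = {"param": 1.0, "m": 0.0, "v": 0.0, "t": 0}
    adam_step(state, grad=1.0)  # a genuine update with a nonzero gradient
    if not skip_zero_grad_step:
        adam_step(state, grad=0.0)  # what an unconditional step() amounts to
    return state


stepped = run(skip_zero_grad_step=False)
skipped = run(skip_zero_grad_step=True)

# The zero-gradient step decays the first moment and still moves the parameter.
print("param after zero-gradient step:", stepped["param"])
print("param when the step is skipped:", skipped["param"])
print("first moment (stepped vs skipped):", stepped["m"], skipped["m"])
```

In this sketch the parameter keeps moving on leftover momentum and the moment estimates decay, even though no gradient information arrived. Giving the history summarization module its own optimizer, stepped only when its gradients are actually computed, avoids that effect.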
1 parent b94c5bd, commit 8458dbc
Showing 17 changed files with 99 additions and 28 deletions.