
Commit

project management
lucidrains committed Nov 30, 2023
1 parent 26fc8d0 commit 5b0cc20
Showing 1 changed file with 7 additions and 3 deletions.
README.md: 10 changes (7 additions & 3 deletions)
@@ -21,12 +21,16 @@ I will be keeping around the logic for Q-learning on single action just for fina
 - [x] improvise decoder head variant, instead of concatenating previous actions at the frames + learned tokens stage. in other words, use classic encoder - decoder
 - [ ] allow for cross attention to fine frame / learned tokens
 
 
-- [ ] build out a simple dataset creator class, taking in the environment as an iterator / generator
-- [ ] redo maxvit with axial rotary embeddings + sigmoid gating for attending to nothing. enable flash attention for maxvit with this change
 
-- [ ] consult some RL experts and figure out if there are any new headways into resolving <a href="https://www.cs.toronto.edu/~cebly/Papers/CONQUR_ICML_2020_camera_ready.pdf">delusional bias</a>
 
+- [ ] for exploration, allow for finely randomizing a subset of actions, and not all actions at once
+- [ ] figure out if one can train with randomized orders of actions - order could be sent as a conditioning that is concatted or summed before attention layers
+- [ ] build out a simple dataset creator class, taking in the environment as an iterator / generator
+- [ ] offer an improvised variant where the first action token suggests the action ordering. all actions aren't made equal, and some may need to attend to past actions more than others
+- [ ] see if the main idea in this paper is applicable to language models <a href="https://github.com/lucidrains/llama-qrlhf">here</a>
+- [ ] consult some RL experts and figure out if there are any new headways into resolving <a href="https://www.cs.toronto.edu/~cebly/Papers/CONQUR_ICML_2020_camera_ready.pdf">delusional bias</a>
+- [ ] redo maxvit with axial rotary embeddings + sigmoid gating for attending to nothing. enable flash attention for maxvit with this change
 
 ## Citations
 
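One of the newly added items calls for "a simple dataset creator class, taking in the environment as an iterator / generator". A minimal sketch of one way to do that, assuming a Gymnasium-style environment (`reset()` returning `(obs, info)` and `step()` returning a 5-tuple) and a plain callable policy; the class and function names below are hypothetical and not the repository's actual API:

```python
# hypothetical sketch, not the repository's actual API
class EpisodeIterator:
    """Wraps a Gymnasium-style environment so one episode can be consumed as a generator of transitions."""

    def __init__(self, env, policy, max_steps = 1000):
        self.env = env
        self.policy = policy        # callable: observation -> action
        self.max_steps = max_steps

    def __iter__(self):
        obs, _ = self.env.reset()
        for _ in range(self.max_steps):
            action = self.policy(obs)
            next_obs, reward, terminated, truncated, _ = self.env.step(action)
            done = terminated or truncated
            yield obs, action, reward, next_obs, done   # one (s, a, r, s', done) transition
            if done:
                break
            obs = next_obs


def create_dataset(env, policy, num_episodes):
    """Collects transitions from several episodes into an in-memory list."""
    transitions = []
    for _ in range(num_episodes):
        transitions.extend(EpisodeIterator(env, policy))
    return transitions
```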
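The "randomized orders of actions" item proposes sending the sampled order as a conditioning signal that is concatted or summed before the attention layers. A rough PyTorch illustration of the summing variant, with hypothetical tensor shapes (not code from this commit):

```python
# hypothetical sketch of order-as-conditioning, not the repository's implementation
import torch
import torch.nn as nn

class ActionOrderConditioning(nn.Module):
    """Adds an embedding of each action's position in a sampled ordering to its action token."""

    def __init__(self, num_actions, dim):
        super().__init__()
        self.order_emb = nn.Embedding(num_actions, dim)   # one embedding per position in the ordering

    def forward(self, action_tokens, order):
        # action_tokens: (batch, num_actions, dim), in canonical action order
        # order: (batch, num_actions), order[b, i] = position of action i in the sampled ordering
        return action_tokens + self.order_emb(order)      # summed conditioning before the attention layers

# usage with random per-sample orderings
batch, num_actions, dim = 2, 6, 512
tokens = torch.randn(batch, num_actions, dim)
order = torch.stack([torch.randperm(num_actions) for _ in range(batch)])
conditioned = ActionOrderConditioning(num_actions, dim)(tokens, order)
```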
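The maxvit item mentions "sigmoid gating for attending to nothing". One common reading is to gate the attention output per token with a learned sigmoid, so a query whose gate saturates near zero effectively attends to nothing; because the gate sits outside the softmax attention itself, the inner attention can still use a fused flash-attention kernel. The sketch below is only that interpretation, not the repository's actual change:

```python
# hypothetical sketch of sigmoid-gated attention, not the repository's implementation
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedAttention(nn.Module):
    """Standard multi-head attention whose output is gated per token by a learned sigmoid."""

    def __init__(self, dim, heads = 8):
        super().__init__()
        self.heads = heads
        self.to_qkv = nn.Linear(dim, dim * 3, bias = False)
        self.to_gate = nn.Linear(dim, dim)   # per-token gate logits
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, d = x.shape
        h = self.heads
        q, k, v = self.to_qkv(x).chunk(3, dim = -1)
        q, k, v = (t.reshape(b, n, h, d // h).transpose(1, 2) for t in (q, k, v))

        # PyTorch dispatches to an efficient (flash) kernel when available
        out = F.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, n, d)

        gate = torch.sigmoid(self.to_gate(x))   # gate near 0 lets a token "attend to nothing"
        return self.to_out(out * gate)
```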
