logs20: Make a list of how we implement RL for seq2seq
Higepon Taro Minowa edited this page May 24, 2018 · 7 revisions
Log Type | Detail |
---|---|
1: What specific output am I working on right now? | A list of concrete steps of how we implement RL based seq2seq |
2: Thinking out loud - hypotheses about the current problem - what to work on next - how can I verify | 1. Draw diagrams to explain the RL pattern. 2. Write concrete steps as a list. |
3: A record of currently ongoing runs along with a short reminder of what question each run is supposed to answer | N/A |
4: Results of runs and conclusion | 1. (done) Save and commit. 2. (done) Find a way to implement the sample method, considering how we backprop. 3. (done) Implement sample using empty note. 4. (done) Port the implementation. 5. Think about how we can keep the reward=1 case. 6. Make a list of how we test it. |
5: Next steps | |
6: mega.nz | N/A |
- Train Op needs trainable_variable in the training graph.
- Train Op requires logits (from sampling) to do backprop.
- So we need to port sample to the model, not the infer model.
- Connect everything and check that it does not fail (with reward == 1).
- Modify
- test small
- Initial training with normal seq2seq
- test medium
- Clean up all old code?
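
The reward == 1 check in the list above can be sketched in plain Python. This is only an illustration under stated assumptions: the page does not say which loss is used, so a REINFORCE-style loss (`-reward * log p(sampled sequence)`) is assumed here, and pure Python stands in for the actual training graph. All function names (`softmax`, `sample_sequence`, `sequence_log_prob`, `rl_loss`, `cross_entropy`) are hypothetical and not taken from the project.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one decoder step's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(step_logits, rng):
    """Sample one token id per decoder step from that step's distribution."""
    tokens = []
    for logits in step_logits:
        probs = softmax(logits)
        tokens.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return tokens


def sequence_log_prob(step_logits, tokens):
    """Sum of log p(token_t) over the whole sequence."""
    return sum(
        math.log(softmax(logits)[tok])
        for logits, tok in zip(step_logits, tokens)
    )


def rl_loss(step_logits, tokens, reward):
    """Assumed REINFORCE-style loss: -reward * log p(sampled sequence)."""
    return -reward * sequence_log_prob(step_logits, tokens)


def cross_entropy(step_logits, targets):
    """Ordinary seq2seq loss with the given tokens treated as targets."""
    return -sequence_log_prob(step_logits, targets)


# Two decoder steps over a toy vocabulary of three tokens.
step_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
sampled = sample_sequence(step_logits, random.Random(0))

# With reward == 1 the RL loss should match cross-entropy on the sampled tokens.
diff = rl_loss(step_logits, sampled, reward=1.0) - cross_entropy(step_logits, sampled)
assert abs(diff) < 1e-12
```

Under this assumption, wiring everything together with reward fixed at 1 should behave like normal seq2seq training on the sampled tokens, which makes it a cheap way to confirm that the loss depends on the logits produced by sampling before real rewards are introduced.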



