
Hidden state propagation between batches #1

Open
stante opened this issue May 26, 2019 · 0 comments

stante commented May 26, 2019

The way the training data is separated into batches and the way the hidden state is propagated between batches do not appear to be correct.

The whole input text is first split into sequences of sequence length. Batch generation then takes batch size consecutive sequences at a time and returns them for training, while the hidden state is propagated from one batch to the next. This, however, means that each hidden state does not see the input sequentially: after its first sequence, it skips ahead over batch size sequences.
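
To make the skipping concrete, here is a minimal sketch of the batching described above, assuming PyTorch and a flat 1-D tensor of token ids (`naive_batches` is a hypothetical name, not the repository's actual function):

```python
import torch

def naive_batches(data, seq_len, batch_size):
    # data: 1-D tensor of token ids.
    n_seqs = data.size(0) // seq_len
    seqs = data[:n_seqs * seq_len].view(n_seqs, seq_len)
    # Consecutive sequences land in the SAME batch, so between batches
    # each row jumps ahead by batch_size sequences.
    for start in range(0, n_seqs - batch_size + 1, batch_size):
        yield seqs[start:start + batch_size]

tokens = torch.arange(12)
for batch in naive_batches(tokens, seq_len=3, batch_size=2):
    print(batch)
# tensor([[0, 1, 2],
#         [3, 4, 5]])
# tensor([[ 6,  7,  8],
#         [ 9, 10, 11]])
# Row 0's hidden state is carried from [0, 1, 2] straight into [6, 7, 8],
# skipping [3, 4, 5] entirely.
```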

One idea for solving this would be to split the whole input into batch size contiguous parts and then step through those parts in lockstep during batch generation, so that each batch element continues exactly where it left off in the previous batch. It would be interesting to compare the performance of the existing approach with the one suggested in this issue.
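
A rough sketch of what that could look like, again assuming PyTorch (`batchify` and `contiguous_batches` are hypothetical names):

```python
import torch

def batchify(data, batch_size):
    # Drop the tail that doesn't divide evenly, then lay out the stream
    # so that row i holds the i-th contiguous part of the text.
    n = data.size(0) // batch_size
    return data[:n * batch_size].view(batch_size, n)

def contiguous_batches(data, seq_len, batch_size):
    parts = batchify(data, batch_size)  # shape (batch_size, n)
    # Step through all parts in lockstep: row i of batch t+1 continues
    # exactly where row i of batch t left off.
    for start in range(0, parts.size(1) - seq_len + 1, seq_len):
        yield parts[:, start:start + seq_len]

tokens = torch.arange(12)
for batch in contiguous_batches(tokens, seq_len=3, batch_size=2):
    print(batch)
# tensor([[0, 1, 2],
#         [6, 7, 8]])
# tensor([[ 3,  4,  5],
#         [ 9, 10, 11]])
# Each row now reads its part of the text strictly in order, so the
# propagated hidden state always matches the preceding context (in
# practice the state would still be detached between batches to
# truncate backpropagation).
```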
