
Question: minibatch data is not contiguous? #19

Open
akitakeuchi opened this issue Nov 9, 2015 · 2 comments

Comments

@akitakeuchi

Hi,

Thank you for the great contribution. The program works fine with the tinyshakespeare dataset and other datasets; however, part of the "train.py" code looks quite strange to me. Lines 87-91:

for i in xrange(jump * n_epochs):
    x_batch = np.array([train_data[(jump * j + i) % whole_len]
                        for j in xrange(batchsize)])
    y_batch = np.array([train_data[(jump * j + i + 1) % whole_len]
                        for j in xrange(batchsize)])

While "train_data" is the source character sequence, x_batch seems to consist of characters taken from separate positions, that is, positions that are "jump" characters apart. To train an RNN, the internal state must be carried over to the next input, but this minibatch data seems to violate that input continuity. I would appreciate it if you could explain why the code works. Thanks.

@benob

benob commented Nov 13, 2015

As far as I understand, a minibatch should in general contain independent examples (so that its gradient is a good estimate of the global gradient). In an RNN the examples are not independent, but if we draw the rows of the minibatch from characters that are far apart, we get a good approximation. The minibatch acts like a rake whose teeth are separated by the jump value and which moves forward one character per step: each row keeps reading its own contiguous stretch of text, so the hidden state carried for that row stays consistent.
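A minimal sketch of this "rake" indexing on a toy sequence (the names train_data, batchsize, jump and whole_len mirror the snippet above; the toy data and loop count are assumptions for illustration):

```python
import numpy as np

train_data = np.arange(20)       # toy character-ID sequence
batchsize = 4
whole_len = len(train_data)
jump = whole_len // batchsize    # teeth of the rake are `jump` positions apart

for i in range(3):               # a few steps; the real loop runs jump * n_epochs
    x_batch = np.array([train_data[(jump * j + i) % whole_len]
                        for j in range(batchsize)])
    y_batch = np.array([train_data[(jump * j + i + 1) % whole_len]
                        for j in range(batchsize)])
    # Row j of x_batch advances through the stream starting at jump * j:
    # at i=0 the batch is [0, 5, 10, 15], at i=1 it is [1, 6, 11, 16], ...
    # Within one row the characters are consecutive across steps, so the
    # per-row RNN state is valid, while rows within a single batch come
    # from distant positions and are therefore roughly independent.
```

At step i, row j always holds character `jump * j + i`, so incrementing i slides every tooth of the rake one character forward at once.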

@akitakeuchi
Author

Thank you for the comment. I see the point now.
