Question: minibatch data is not contiguous? #19

akitakeuchi · 2015-11-09T08:34:50Z

Hi,

Thank you for the great contribution. The program works fine with the tinyshakespeare dataset and with other datasets, but part of the "train.py" code looks strange to me, specifically lines 87-91:

for i in xrange(jump * n_epochs):
    x_batch = np.array([train_data[(jump * j + i) % whole_len]
                        for j in xrange(batchsize)])
    y_batch = np.array([train_data[(jump * j + i + 1) % whole_len]
                        for j in xrange(batchsize)])

Here "train_data" is the source character sequence, yet x_batch seems to consist of characters taken from separate positions, spaced "jump" apart. To train an RNN, the internal state must be carried over from one input to the next, but this minibatch seems to violate that continuity. I would appreciate it if you could explain why the code works. Thanks.

benob · 2015-11-13T14:29:48Z

As far as I understand, a minibatch should in general contain independent examples, so that its gradient is a good estimate of the global gradient. In an RNN, examples are not independent, but taking the minibatch from characters that are far apart gives a good approximation. The minibatch then acts like a rake whose teeth are separated by the jump value and which is moved from one character to the next.
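
To make the rake picture concrete, here is a minimal sketch of the same indexing on a toy sequence. It is an illustration under assumptions, not the repository's code: the excerpt does not show how `jump` is defined, so `jump = whole_len // batchsize` is assumed here, `batch_at` is a hypothetical helper, the data is a toy array whose values equal their positions, and Python 3 `range` stands in for `xrange`. If the recurrent state is kept per batch slot, as is usual in batched RNN training, then each slot j reads consecutive characters from one iteration to the next, so continuity holds within a slot even though a single minibatch spans distant positions.

```python
import numpy as np

# Toy stand-in for the character sequence: each value equals its position.
train_data = np.arange(12)
whole_len = len(train_data)
batchsize = 3
jump = whole_len // batchsize  # ASSUMPTION: definition not shown in the excerpt


def batch_at(i):
    """Hypothetical helper: build (x_batch, y_batch) for iteration i
    with the same indexing as the loop quoted in the question."""
    x = np.array([train_data[(jump * j + i) % whole_len]
                  for j in range(batchsize)])
    y = np.array([train_data[(jump * j + i + 1) % whole_len]
                  for j in range(batchsize)])
    return x, y


# Stack the x batches of consecutive iterations, then transpose so that
# each row shows what a single batch slot sees over time.
xs = np.stack([batch_at(i)[0] for i in range(jump)])
print(xs.T)
```

Each printed row is the input stream of one batch slot over successive iterations. With these toy values every row is a contiguous run of positions, which is the continuity the question was asking about.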

akitakeuchi · 2015-11-15T14:58:35Z

Thank you for the comment. I got the point.
