Hidden factors paper demo #13

Open · wants to merge 1 commit into master

Conversation

@benanne (Member) commented Aug 22, 2015

This is an updated version of the demo of the paper "Discovering hidden factors of variation in deep networks" by @briancheung et al. (http://arxiv.org/abs/1412.6583), which no longer relies on any external code. The previous version relied on an old version of the MNIST example.

The other notebook I did (Highway Networks) is coming soon as well :)

@f0k (Member) commented Aug 23, 2015

Looks good in general! Some comments (the diff is too long for GitHub to display, so I'll comment here):

y_train=T.cast(theano.shared(y_train), 'int32')

Why don't you cast the data before turning it into a shared variable? I used np.uint8 in Lasagne's MNIST example so they're GPU compatible.
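For reference, a minimal sketch of the two variants under discussion (y_train below is a stand-in array, not the notebook's actual data):

import numpy as np
import theano
import theano.tensor as T

y_train = np.random.randint(0, 10, size=50000).astype('float64')  # stand-in for the MNIST labels

# what the notebook does: store the labels as floats in the shared variable
# and cast them to int32 symbolically, so the cast happens at runtime
y_shared = theano.shared(y_train)
y_int = T.cast(y_shared, 'int32')

# the suggested alternative: cast once, before creating the shared variable
y_shared_uint8 = theano.shared(y_train.astype(np.uint8))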

l_encoder1 = nn.layers.DenseLayer(l_in, num_units=num_hidden_units)

I wouldn't officially encourage leaving out the nonlinearity=... specification. It's less clear.
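To make that concrete, a sketch of the explicit version (the nn alias and the MNIST input shape are assumed from the snippet above, num_hidden_units is a placeholder):

import lasagne as nn
from lasagne.nonlinearities import rectify

num_hidden_units = 256  # placeholder value; the notebook defines its own

l_in = nn.layers.InputLayer(shape=(None, 28 * 28))  # MNIST-sized input, batch dimension left symbolic
# the same layer as before, with the (default) rectify nonlinearity spelled out
l_encoder1 = nn.layers.DenseLayer(l_in, num_units=num_hidden_units, nonlinearity=rectify)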

batch_size = 100
build_model(..., batch_size=batch_size)

You set the batch size to 100, but in the end you compile compute_z to process the full training set at once. That's playing with fire! You should pass batch_size=None to build_model, in case someone takes your Recipe and tries to add dropout or something.
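Roughly what that would look like (the signature of build_model below is a guess, only meant to illustrate the point):

import lasagne

def build_model(input_dim=28 * 28, num_hidden_units=256, batch_size=None):
    # batch_size=None leaves the batch dimension symbolic, so functions
    # compiled on this graph accept any number of examples at once
    l_in = lasagne.layers.InputLayer(shape=(batch_size, input_dim))
    l_hidden = lasagne.layers.DenseLayer(l_in, num_units=num_hidden_units,
                                         nonlinearity=lasagne.nonlinearities.rectify)
    # ... the notebook goes on to build the full encoder/decoder here ...
    return l_hidden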

iter_train = ...

Those pesky names again :) But it's your choice, I don't mind too much.

We'll make use of this to 'clamp' the observed and latent representation variables.

Cool, I like that!

Before In[20], you could add some explanation on what you're doing. Something like "We'll now generate two images of nine rows of digits: One has z1 varied from -.3 to .3 with z2 fixed to zero, the other has z1 fixed to zero and z2 varied."
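Purely as an illustration of what such a cell might compute (all names below are made up):

import numpy as np

sweep = np.linspace(-0.3, 0.3, 9).astype('float32')  # nine evenly spaced values
zeros = np.zeros(9, dtype='float32')

# first image: z1 varied from -0.3 to 0.3, z2 clamped to zero
z_vary_z1 = np.column_stack([sweep, zeros])  # shape (9, 2)
# second image: z1 clamped to zero, z2 varied
z_vary_z2 = np.column_stack([zeros, sweep])  # shape (9, 2)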

And just to make it a little prettier, you could alter the numbers so In[11] is followed by In[12] ff. instead of In[20] (not crucial).

Very nice work!

@benanne (Member, Author) commented Aug 30, 2015

Why don't you cast the data before turning it into a shared variable? I used np.uint8 in Lasagne's MNIST example so they're GPU compatible.

This carries over from the Theano deep learning tutorials, where they also have the labels as float and use T.cast to convert them to integers at runtime: http://deeplearning.net/tutorial/logreg.html#logreg

This is because supposedly you can only store floats on the GPU (the labels would be stored in host memory otherwise). Are you saying this is not the case? (no longer the case maybe?)

I wouldn't officially encourage leaving out the nonlinearity=... specification. It's less clear.

I don't think stuff in Recipes counts as 'official' in that sense though :) But fair enough, I can add them explicitly. In that case we should probably consider updating all Recipes content to match; I'm pretty sure there are already a few other examples that rely on rectify as the default.

You set the batch size to 100, but in the end you compile compute_z to process the full training set at once. That's playing with fire! You should pass batch_size=None to build_model, in case someone takes your Recipe and tries to add dropout or something.

It's all dense layers, so I figured it wouldn't be a problem. But you're right in that it doesn't really set a good example. I'll change it.

Those pesky names again :) But it's your choice, I don't mind too much.

What's wrong with these names? I think they are there because I was mimicking the old MNIST example code. What should I call them instead?

Before In[20], you could add some explanation on what you're doing. Something like "We'll now generate two images of nine rows of digits: One has z1 varied from -.3 to .3 with z2 fixed to zero, the other has z1 fixed to zero and z2 varied."

Good call.

And just to make it a little prettier, you could alter the numbers so In[11] is followed by In[12] ff. instead of In[20] (not crucial).

Will take care of that when everything else has been addressed.

@f0k (Member) commented Sep 1, 2015

This is because supposedly you can only store floats on the GPU (the labels would be stored in host memory otherwise). Are you saying this is not the case? (no longer the case maybe?)

I thought so. But you're right:

In [3]: type(theano.shared(np.zeros(1)))
Out[3]: theano.tensor.sharedvar.TensorSharedVariable

In [4]: type(theano.shared(np.zeros(1, dtype='float32')))
Out[4]: theano.sandbox.cuda.var.CudaNdarraySharedVariable

In [5]: type(theano.shared(np.zeros(1, dtype='uint8')))
Out[5]: theano.tensor.sharedvar.TensorSharedVariable

In [6]: type(theano.tensor.cast(theano.shared(np.zeros(1, dtype='float32')), 'int32'))
Out[6]: theano.tensor.var.TensorVariable

In the MNIST example it was different: there I just needed something I could hand to a compiled function, and uint8 was good for that. So it should stay the way it is! (Maybe add a short comment if you feel like it.)

In that case we should probably consider updating all Recipes content to match; I'm pretty sure there are already a few other examples that rely on rectify as the default.

Hmm, you're right, the spatial transformer network has it explicitly specified for the classification network, but not for the localization network. The caffe/CIFAR10 example doesn't specify it. Well... maybe just leave it as it is.

What's wrong with these names?

Maybe it's just me, but they remind me of Python's iter, so I think they'd return an iterable or be generator functions, i.e., something you'd put in the head of a for loop, not in the body. Lasagne/Lasagne#215 (comment) :)

I think they are there because I was mimicking the old MNIST example code.

Exactly! It doesn't die! :)

What should I call them instead?

For me, train_step() would fit better, or just train_fn() as in the new MNIST example (indicating that it's a compiled function, although I guess that's not terribly obvious either). But as I said, I don't feel strongly about those names; choose whatever suits you best!
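Just to illustrate the naming argument with a generic (not the notebook's) training setup: the object in question really is a compiled Theano function, which is what the _fn suffix hints at.

import theano
import theano.tensor as T
import lasagne

X = T.matrix('X')
y = T.ivector('y')

l_in = lasagne.layers.InputLayer(shape=(None, 28 * 28), input_var=X)
l_out = lasagne.layers.DenseLayer(l_in, num_units=10, nonlinearity=lasagne.nonlinearities.softmax)

prediction = lasagne.layers.get_output(l_out)
loss = lasagne.objectives.categorical_crossentropy(prediction, y).mean()
params = lasagne.layers.get_all_params(l_out, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params, learning_rate=0.01)

# "train_fn" because it is literally a function: one call processes one mini-batch
train_fn = theano.function([X, y], loss, updates=updates)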

@benanne (Member, Author) commented Sep 26, 2015

I will try to finish this soon; I'm just a bit occupied with other things at the moment. I haven't abandoned this PR!

@briancheung

Let me know if there's anything I can help with to get this PR merged :)

@benanne (Member, Author) commented Oct 31, 2015

Just f0k's comments need to be addressed. I should have some time to sort this out soon, but feel free to do it if you want this to be merged sooner :)
