
Memory usage is too high and training never starts when using TensorFlow 1.4 #21

Open
sunume opened this issue Oct 31, 2017 · 4 comments


sunume commented Oct 31, 2017

When training on CIFAR, memory usage keeps increasing and training never starts. I have 2×16 GB of RAM, which should be enough.

1× GTX 1080
cuDNN 6
TensorFlow 1.4
Python 3.5


taufikxu commented Dec 7, 2017

I'm running into the same problem.


pesser commented Dec 10, 2017

See tensorflow/tensorflow#12598


kolesman commented Dec 11, 2017

As @pesser pointed out, the problem is caused by the broken data-dependent initialization mechanism.

A while ago I implemented an alternative, more intuitive way of doing data-dependent initialization, and I've now merged my mechanism into the current PixelCNN++ code; please see https://github.com/kolesman/pixel-cnn.

I haven't checked the code extensively, but it seems to work. Let me know whether it also works for you, and then I'll create a pull request.
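For context, the data-dependent initialization in PixelCNN++ follows the weight-normalization scheme of Salimans & Kingma: push one minibatch through each layer and pick the scale g and bias b so that the layer's outputs start with zero mean and unit variance. A minimal NumPy sketch of that idea (the function name and shapes here are illustrative, not taken from either repository):

```python
import numpy as np

def data_dependent_init(x, v, init_scale=1.0, eps=1e-8):
    """Sketch of weight-norm data-dependent init: given an init
    minibatch x and an unnormalized weight direction v, choose g and b
    so the layer output g * (x @ v_norm) + b is standardized."""
    # normalize the weight direction column-wise
    v_norm = v / np.sqrt(np.sum(v**2, axis=0, keepdims=True) + eps)
    t = x @ v_norm                   # pre-activations for the init batch
    mu = t.mean(axis=0)
    sigma = t.std(axis=0)
    g = init_scale / (sigma + eps)   # scale so outputs have unit std
    b = -mu * g                      # shift so outputs have zero mean
    return g, b, g * t + b           # params and the initialized output

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 10))        # one "init" minibatch
v = rng.normal(size=(10, 4))         # unnormalized weight direction
g, b, y = data_dependent_init(x, v)
print(np.allclose(y.mean(axis=0), 0, atol=1e-6),
      np.allclose(y.std(axis=0), 1, atol=1e-4))  # → True True
```

The appeal of doing this explicitly (rather than hiding it inside a template mechanism, which is what tensorflow/tensorflow#12598 broke) is that the init pass is just an ordinary forward pass with a different assignment at the end.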

@SammyGelman

Just how memory-intensive is PixelCNN++?

I've been fine training smaller models, but now that I've hit a wall I'd like to know exactly where and how the memory is being allocated.

I'm currently training on 512×512 images with batch size = 5 and num_filters = 32.

I received a number of different errors:

OP_REQUIRES failed at cwise_ops_common.h:120 : Resource exhausted: OOM when allocating tensor with shape[5,64,256,256]

OP_REQUIRES failed at random_op.cc:77 : Resource exhausted: OOM when allocating tensor with shape[5,512,512,64]

etc...

I don't fully understand the shapes of these tensors. I can see the batch size in there, and when I change the number of filters the 64 changes as well.

Any help would be much appreciated!
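For what it's worth, those OOM shapes can be multiplied out directly: TensorFlow reports the shape of the tensor it failed to allocate, and for float32 each element is 4 bytes. The 64 is plausibly 2 × num_filters, which would explain why it moves when you change num_filters. A back-of-envelope sketch (the helper name is made up for illustration):

```python
def tensor_megabytes(shape, bytes_per_element=4):
    """Size of one float32 tensor with the given shape, in MiB.
    (Illustrative helper; just multiplies out the shape from the
    OOM message.)"""
    n = 1
    for d in shape:
        n *= d
    return n * bytes_per_element / 1024**2

# Shapes taken from the OOM messages above.
print(tensor_megabytes([5, 64, 256, 256]))   # → 80.0 MiB
print(tensor_megabytes([5, 512, 512, 64]))   # → 320.0 MiB
```

A single 320 MiB activation isn't fatal on its own, but an 8 GB card only fits a couple of dozen tensors that size, and training keeps many activations alive for the backward pass, so 512×512 inputs exhaust GPU memory quickly even at batch size 5.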
