pycuda mem_get_ipc_handle() error on Windows 10 #32

goffredogiordano · 2017-04-07T13:16:16Z

I would like to run the theano_alexnet training from this useful GitHub project.
My computer is a native Windows 10 64-bit machine with an Intel Core i7. I use WinPython-64bit-3.4.4.4QT5 (from WinPython 3.4.4.3), Visual Studio 2015 Community Edition Update 3, CUDA 8.0.44 (64-bit), cuDNN v5.1 (August 10, 2016) for CUDA 8.0, Git source control (based on the MinGW compiler), and OpenBLAS 0.2.14. The main Python libraries are Theano 0.9.0beta1, SciPy 0.19.0, Keras 1.2.2, Lasagne 0.2.dev1, NumPy 1.11.1, hickle 2.0.4, h5py 2.6.0, pycuda, pylearn2, and zeromq. I received help from the Theano group on Google.
I have successfully pre-processed a subset of the ImageNet data using the script generate_data.sh, which generated all of the expected folders and files. The subset is compressed into 195 .hkl (hickle) files for validation (each about 50 MB) in the folder Validation_Alexnet_b256_b_256.0, and into the files 0000_0.hkl, 0000_1.hkl, ..., 0194_0.hkl, 0194_1.hkl (each about 25 MB) in the folder Validation_Alexnet_b256_b_128.0. The training folder contains no files. When I try to run train.py, I get these errors:

C:\deep_learning\alexnet>python train.py
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

Using gpu device 0: GeForce GT 740M (CNMeM is enabled with initial size: 80.0% of memory, cuDNN 5105)
WARNING (theano.sandbox.cuda): The cuda backend is deprecated and will be removed in the next release (v0.10). Please switch to the gpuarray backend. You can get more information about how to switch at this URL:
https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end%28gpuarray%29

... building the model

conv (cudnn) layer with shape_in: (3, 227, 227, 256)
Process Process-1:
Traceback (most recent call last):
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\multiprocessing\process.py", line 254, in _bootstrap
    self.run()
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "C:\deep_learning\alexnet\train.py", line 52, in train_net
    model = AlexNet(config)
  File "C:\deep_learning\alexnet\alex_net.py", line 62, in __init__
    lib_conv=lib_conv,
  File "./lib\layers.py", line 168, in __init__
    dnn.dnn_conv(img=input_shuffled[:, :self.channel / 2,
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\tensor\var.py", line 540, in __getitem__
    return theano.tensor.subtensor.advanced_subtensor(self, *args)
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\gof\op.py", line 604, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\tensor\subtensor.py", line 2140, in make_node
    index = tuple(map(as_index_variable, index))
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\tensor\subtensor.py", line 2081, in as_index_variable
    return make_slice(idx)
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\gof\op.py", line 604, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\tensor\type_other.py", line 39, in make_node
    list(map(as_int_none_variable, inp)),
  File "C:\deep_learning\WinPython-64bit-3.4.4.4Qt5\python-3.4.4.amd64\lib\site-packages\theano\tensor\type_other.py", line 20, in as_int_none_variable
    raise TypeError('index must be integers')
TypeError: index must be integers

PyCUDA ERROR: The context stack was not empty upon module cleanup.

A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.

Could someone help me understand what is wrong?
Thanks in advance for your expert help and your time.
Greetings,
Goffredo

hma02 · 2017-04-07T15:50:26Z

@goffredogiordano

The main error here is a TypeError rather than a PyCUDA error. The PyCUDA error appears when the code exits without proper context cleanup. I am not sure why the title of this issue mentions "mem_get_ipc_handle", though.

The TypeError seems to be more of a Windows/Theano-related issue, as the code does not show this error on Linux.

I will ask @nouiz and @abergeron about this and set up an issue there.

abergeron · 2017-04-07T17:47:19Z

It's not possible to use CUDA IPC handles on Windows; this only works on Linux. It's a limitation that comes from CUDA, so we can't do anything about it.

nouiz · 2017-04-07T22:34:27Z

I found some code changes to the example that make it work on Windows, but I forgot where they are. Look in the other issues in this repo or on the Theano mailing list for posts about this project. The changes would probably slow down computation on Windows compared to Linux, but would make it work on Windows.

goffredogiordano · 2017-04-08T09:10:25Z

Thank you, everyone, for helping me. @nouiz, I would like to check whether someone has resolved this problem, but if it is related to IPC handles on Windows, as @abergeron suggested, I think there is currently no solution.
Thanks.

If anyone finds a solution, please let me know.

nouiz · 2017-04-08T14:07:42Z

Look in the other issues in this repo or on the Theano mailing list. It worked on Windows with a small patch.

goffredogiordano · 2017-04-08T17:33:32Z

I have resolved my issues, thanks also to the Theano Google group.
In layers.py I changed line 56 to center_margin = int((image_shape[2] - cropsize) / 2) because I am using Python 3.x. Then I changed lines 104 to 107 to

self.filter_shape[0] = self.filter_shape[0] // 2
self.filter_shape[3] = self.filter_shape[3] // 2
self.image_shape[0] = self.image_shape[0] // 2
self.image_shape[3] = self.image_shape[3] // 2

and line 125 to input[:self.channel // 2, :, :, :])
line 133 to input[self.channel // 2:, :, :, :])
line 168 to dnn.dnn_conv(img=input_shuffled[:, :int(self.channel / 2),
and line 179 to dnn.dnn_conv(img=input_shuffled[:, self.channel // 2:,
Then, because I had some problems with a TypeError related to the dtype constructor (I referred to http://deeplearning.net/software/theano/library/tensor/basic.html), I resolved the remaining errors by changing line 26 of alex_net.py to y = T.ivector('y')
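
The layers.py changes above all replace `/` with `//` or wrap the division in `int(...)`. A likely explanation, assuming the original code was written for Python 2, is that `/` between two integers returns a float in Python 3, and a float is not accepted as a slice index. The sketch below reproduces the effect with a plain Python list, so it needs neither Theano nor a GPU; the channel count and variable names are hypothetical and are not taken from the repository.

```python
# Plain-Python illustration of the division change; no Theano or GPU needed.
# `channels` and `feature_maps` are hypothetical stand-ins for the layer data.
channels = 96
feature_maps = list(range(channels))

half_true = channels / 2     # true division: a float in Python 3
half_floor = channels // 2   # floor division: stays an int

# Slicing with a float bound raises TypeError in Python 3.
try:
    feature_maps[:half_true]
    float_slice_error = None
except TypeError as exc:
    float_slice_error = exc

# Integer bounds split the list into two equal halves.
first_half = feature_maps[:half_floor]
second_half = feature_maps[half_floor:]

print(type(half_true).__name__, type(half_floor).__name__)
print("float slice failed:", float_slice_error is not None)
print(len(first_half), len(second_half))
```

For non-negative integers, `int(x / 2)` and `x // 2` give the same result; `//` avoids the intermediate float entirely.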

hma02 mentioned this issue Apr 7, 2017

TypeError on Windows 10, Python 3, CudaNdarray Backend Theano/Theano#5822

Closed

hma02 mentioned this issue Apr 14, 2017

error on Windows 10 #33

Open
