
using odlcuda #23

Open
mehrhardt opened this issue Jul 18, 2017 · 12 comments

Comments


mehrhardt commented Jul 18, 2017

This is kind of a continuation of issue odlgroup/odl#1074.

A few observations:

  1. I can import odlcuda. The import also works without `_install_location = __file__`, but in the following I left it in.

  2. The order matters. I first tried

```python
import odlcuda
import odl
```

which causes odl not to know 'cuda', but

```python
import odl
import odlcuda
```

works!

  3. This is not really related to CUDA, but still weird: I tried to test timings similar to my application,

```python
domain_cpu = odl.uniform_discr([0], [1], [3e+8], impl='numpy')
```

and failed:

```
Traceback (most recent call last):
  File "<ipython-input>", line 4, in <module>
    domain_cpu = odl.uniform_discr([0], [1], [3e+8], impl='numpy')
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/lp_discr.py", line 1311, in uniform_discr
    **kwargs)
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/lp_discr.py", line 1222, in uniform_discr_fromintv
    **kwargs)
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/lp_discr.py", line 1136, in uniform_discr_fromspace
    nodes_on_bdry)
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/partition.py", line 940, in uniform_partition_fromintv
    grid = uniform_grid_fromintv(intv_prod, shape, nodes_on_bdry=nodes_on_bdry)
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/grid.py", line 1092, in uniform_grid_fromintv
    shape = normalized_scalar_param_list(shape, intv_prod.ndim, safe_int_conv)
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/util/normalize.py", line 149, in normalized_scalar_param_list
    out_list.append(param_conv(p))
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/util/normalize.py", line 396, in safe_int_conv
    raise ValueError('cannot safely convert {} to integer'.format(number))
ValueError: cannot safely convert 300000000.0 to integer
```

  4. Now the CUDA issue: instead of 1D, I went 3D:

```python
domain_gpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 300, 400], impl='cuda')
x_gpu = domain_gpu.one()
```

Error:

```
Traceback (most recent call last):
  File "<ipython-input>", line 4, in <module>
    x_gpu = domain_gpu.one()
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/discretization.py", line 473, in one
    return self.element_type(self, self.dspace.one())
  File "/home/me404/.local/lib/python2.7/site-packages/odlcuda-0.5.0-py2.7.egg/odlcuda/cu_ntuples.py", line 912, in one
    return self.element_type(self, self._vector_impl(self.size, 1))
RuntimeError: function_attributes(): after cudaFuncGetAttributes: invalid device function
```

Any idea what is wrong here?

mehrhardt changed the title from "importing odlcuda" to "using odlcuda" on Jul 18, 2017
adler-j (Member) commented Jul 18, 2017

> The order matters. I first tried ...

Very interesting observation. That said, as a user you should never need to import odlcuda explicitly.

> domain_cpu = odl.uniform_discr([0], [1], [3e+8], impl='numpy')

This is not supposed to work, and the error message is quite explicit about why: 3e+8 is a floating-point number, and we require the shape to be an integer. We are also cautious about implicit casting, since it opens the door to errors. We have had problems with people using shape = 1.5 or similar, which was silently cast to shape = 1, causing confusion.

The solution is simply to cast to an integer yourself, `domain_cpu = odl.uniform_discr([0], [1], [int(3e+8)], impl='numpy')`, or to use integer powers, `domain_cpu = odl.uniform_discr([0], [1], [3 * 10 ** 8], impl='numpy')`.
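The cautious behaviour described here can be illustrated outside ODL. Below is a minimal sketch in the spirit of odl's `safe_int_conv` (the real implementation lives in `odl/util/normalize.py` and may differ in detail): the cast is allowed only when it does not change the value.

```python
def safe_int_conv(number):
    """Convert to int, but refuse any cast that would change the value.

    A simplified analogue of ODL's shape-parameter check, not the
    actual library code.
    """
    value = int(number)
    if value != number:
        raise ValueError(
            'cannot safely convert {} to integer'.format(number))
    return value

print(safe_int_conv(3e+8))  # 3e+8 holds exactly 300000000, so the cast is safe
try:
    safe_int_conv(1.5)      # would silently truncate to 1, so it is rejected
except ValueError as exc:
    print(exc)
```

This is why `int(3e+8)` is fine in the call above: the float holds an exact integer value, so nothing is lost in the cast.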

> ... cudaFuncGetAttributes ...

This is actually covered in the README: the recommended solution is to change CUDA_COMPUTE to a compute capability supported by your GPU. I need to know which GPU you have to fix that, or you can look it up yourself.

mehrhardt (Author) commented

If I run ccmake, then I only see ODL_CUDA_COMPUTE. Is that what you mean?
With this changed to 35 (omg, is my GPU so old???), the above example works well! Many thanks!

mehrhardt (Author) commented Jul 18, 2017

Here is now my example with timings. I hope this motivates you and shows that you are doing an incredible job!

```python
domain_cpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 250, 350], impl='numpy')
domain_gpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 250, 350], impl='cuda')

x_cpu = np.e * domain_cpu.one()
x_gpu = np.e * domain_gpu.one()

%time y = x_cpu.ufuncs.log()
%time y = x_gpu.ufuncs.log()
```

```
CPU times: user 7.28 s, sys: 340 ms, total: 7.62 s
Wall time: 7.63 s
CPU times: user 1.36 ms, sys: 338 µs, total: 1.7 ms
Wall time: 1.64 ms
```

It looks like I can finally run my code on the real PET data :D
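As an aside, `%time` is an IPython magic; in a plain script the same kind of comparison can be made with the standard library. A numpy-only sketch (smaller array than in the issue so it runs on any machine; no odl or CUDA assumed):

```python
import time
import numpy as np

# A tenth of the [4000, 250, 350] grid from the issue, to keep memory modest
x = np.e * np.ones((400, 250, 350), dtype='float32')

t0 = time.perf_counter()
y = np.log(x)  # log(e) == 1 elementwise
elapsed = time.perf_counter() - t0
print('np.log over {} elements took {:.4f} s'.format(x.size, elapsed))
```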

mehrhardt (Author) commented

To then use CUDA for everything I do, I need to rewrite some functionals that I wrote in ODL. This is causing me some trouble.
My strategy was to replace all numpy functionality with ufuncs, i.e. np.log(x) becomes x.ufuncs.log(). Is this a good strategy? But when it comes to indexing, I am struggling a little:

```python
import odl
import numpy as np

domain_gpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 250, 350], dtype='float32', impl='cuda')
x_gpu = np.e * domain_gpu.one()
i_gpu = x_gpu.ufuncs.greater(0)
x_gpu[i_gpu]
```

results in

```
Traceback (most recent call last):
  File "<ipython-input-2-fd24fbf6952c>", line 4, in <module>
    x_gpu[i_gpu]
  File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/discretization.py", line 314, in __getitem__
    return self.ntuple.__getitem__(indices)
  File "/home/me404/.local/lib/python2.7/site-packages/odlcuda-0.5.0-py2.7.egg/odlcuda/cu_ntuples.py", line 419, in __getitem__
    return self.data.__getitem__(indices)
ArgumentError: Python argument types in
    CudaVectorFloat32.__getitem__(CudaVectorFloat32, DiscreteLpElement)
did not match C++ signature:
    __getitem__(CudaVectorImpl<float> {lvalue}, long)
```

I kind of see what I am doing wrong, but not how to resolve this. Any ideas?

adler-j (Member) commented Jul 18, 2017

Basically the problem here is that I have not implemented comparison between vectors and longs in odlcuda. The workaround for now is to compare to the zero vector, but if this is a performance hog for you, I could get it fixed.

mehrhardt (Author) commented

I thought the problem here is the indexing, since I am indexing with an ODL element and not with a "long". If I apply your proposed fix, nothing changes.

mehrhardt (Author) commented

Also in the numpy case, I am not sure what kind of indexing is necessary and what is not. Does

```python
i = np.int32(data.ufuncs.greater(0).asarray().ravel())
log_data = data[i].ufuncs.log()
```

make any sense to you?
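For the numpy backend, the intent of a snippet like this (here `data` is a hypothetical plain numpy array, not an ODL element) would usually be expressed with a boolean mask rather than an int32 index array:

```python
import numpy as np

data = np.array([-1.0, 0.0, 0.5, np.e])

mask = data > 0                # boolean mask of strictly positive entries
log_data = np.log(data[mask])  # log of just those entries

print(log_data)
```

Note that `data[mask]` returns a flat array whose size depends on the data, which is one reason a fixed-shape discretized space cannot support this kind of indexing directly.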

adler-j (Member) commented Jul 19, 2017

Now I see what you are aiming at. That would be complicated, mostly because we do not support advanced indexing. I would probably try some smooth approximation of the log function if I were you. You could also do something like:

```python
pos = data.ufuncs.greater(epsilon * data.space.one())
log_data = (data * pos).ufuncs.log()
```

Another option would be for you to manually add whatever function you need as raw CUDA; modifying odlcuda should not be too complicated.

With that said, the primary solution here is frankly to wait for Holger to finalize the tensor branch, which will give us a really good backend for this kind of thing.
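In plain-numpy terms (a sketch of the idea, not the odlcuda API), the masking trick above zeroes out the small entries before the log, which still yields -inf at those positions; clamping the argument away from zero is the smoother alternative alluded to:

```python
import numpy as np

epsilon = 1e-6
data = np.array([0.0, 0.5, 2.0])

# Masking trick: zero out entries <= epsilon, then take the log
pos = (data > epsilon).astype(data.dtype)
with np.errstate(divide='ignore'):
    log_masked = np.log(data * pos)          # -inf where data was zeroed

# Clamped variant: bound the argument below by epsilon instead
log_clamped = np.log(np.maximum(data, epsilon))

print(log_masked)
print(log_clamped)
```

The clamped variant keeps everything finite, at the cost of a floor value of log(epsilon) for the small entries.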

mehrhardt (Author) commented

@adler-j, am I correct in assuming that @kohr-h's tensor branch isn't in yet and that the above "problem" still exists?

I tried to look into odlcuda a little but could not get my head around it. Is it possible to just get a pointer to the data on the GPU device, so that one could use any GPU code without needing to understand the structures in odlcuda?

mehrhardt (Author) commented

I think I found the answer to my question in cu_ntuples.py. Now I am starting to understand how odl is actually implemented :).

wangshuaikun commented Dec 5, 2020

I would like to ask something (please excuse my English). After installing according to the official documentation (https://odlgroup.github.io/odl/getting_started/installing_extensions.html), the CPU version works well, but CUDA does not. Please help me analyze it; the installation process was as follows:

```shell
git clone https://github.com/odlgroup/odlcuda.git
cd odlcuda
conda install conda-build
git checkout conda-build
conda build ./conda CUDA_ROOT=/usr/lss/cudatoolkit-10.1.243-h74a9793_0 CUDA_COMPUTE=60
conda install --use-local odlcuda
python -c "import odl; odl.rn(3, impl='cuda').element()"
```

djx99 commented Oct 7, 2022

@wangshuaikun did you manage to install it?
