Another attempt at supporting non-contiguous arrays #171

Closed
wants to merge 25 commits
Commits (25)
7d09e44  Fix slicing with negative stride. (Feb 15, 2018)
27bcf76  Commit only DeferredSourceModule support without changing calling beh… (Feb 19, 2018)
be4014c  Smarter _new_like_me that handles discontiguous input. Have copy() u… (Feb 19, 2018)
afc3251  Make sure key is hashable. (Feb 21, 2018)
9cb80f7  Allow existing kernel calls to use non-contiguous arrays by sending a… (Feb 21, 2018)
b38c75f  Fix variable names. (Feb 21, 2018)
1f6486b  Non-contiguous is OK now. (Feb 21, 2018)
edcc44a  Forgot to remove non-contiguity check. (Feb 21, 2018)
e23c943  Allow setting scalars. (Feb 21, 2018)
9cfaf97  fix: update signature of gpuarray.reshape to match the GPUArray method (grlee77, Dec 6, 2017)
080ec59  Add get_texref() to ElementwiseKernel (bailsman, Feb 19, 2018)
a7cb982  Update bpl-subset, possibly including pypy support (inducer, Feb 27, 2018)
fb10ffd  Make characterize.platform_bits work with Pypy (patch by Emanuel Riet… (inducer, Feb 27, 2018)
71ec966  Fix pytest script-based test invocation (inducer, Feb 27, 2018)
542cffa  Fix DeferredFunction.__call__, and change modulelazy to deferredmod. (Feb 22, 2018)
8d278fb  Make sure 'texrefs' keyword arg is re-evaluated every time. (Feb 28, 2018)
5a4a2fa  Store module in cache too. (Feb 28, 2018)
05a5400  Send grid and block to _delayed_get_function (Feb 28, 2018)
11108fe  Fix comment. (Feb 28, 2018)
1a6228c  Add debug option to ElementwiseSourceModule. (Feb 28, 2018)
b743698  Fix index calculation (found using _debug!) (Feb 28, 2018)
7e57a72  Add shape to the key (so it needs to remain a tuple). (Feb 28, 2018)
c5de070  Remove unnecessary format key. (Feb 28, 2018)
61bd908  Fix kernel name. (Feb 28, 2018)
33a0dd8  Fix _array_like_helper to work with non-contiguous arrays. (Feb 28, 2018)
2 changes: 1 addition & 1 deletion bpl-subset
Submodule bpl-subset updated 103 files
6 changes: 5 additions & 1 deletion pycuda/characterize.py
@@ -6,7 +6,11 @@
 
 
 def platform_bits():
-    return tuple.__itemsize__ * 8
+    import sys
+    if sys.maxsize > 2**32:
+        return 64
+    else:
+        return 32
 
 
 def has_stack():
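The replacement body can be checked without a GPU: it derives the interpreter's word size from sys.maxsize, which is available on PyPy as well as CPython, whereas (per the commit message) the old tuple.__itemsize__ trick did not work on PyPy. A standalone equivalent of the patched logic:

```python
import sys

# Same test as the patched platform_bits(): a 64-bit Python has
# sys.maxsize == 2**63 - 1, a 32-bit build has 2**31 - 1.
bits = 64 if sys.maxsize > 2**32 else 32
print(bits)  # 64 on a 64-bit interpreter, 32 on a 32-bit one
```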
10 changes: 5 additions & 5 deletions pycuda/cumath.py
@@ -42,7 +42,7 @@ def f(array, stream_or_out=None, **kwargs):
 
         func = elementwise.get_unary_func_kernel(func_name, array.dtype)
         func.prepared_async_call(array._grid, array._block, stream,
-                array.gpudata, out.gpudata, array.mem_size)
+                array, out, array.mem_size)
 
         return out
     return f
@@ -77,7 +77,7 @@ def fmod(arg, mod, stream=None):
 
     func = elementwise.get_fmod_kernel()
    func.prepared_async_call(arg._grid, arg._block, stream,
-            arg.gpudata, mod.gpudata, result.gpudata, arg.mem_size)
+            arg, mod, result, arg.mem_size)
 
    return result
 
@@ -94,7 +94,7 @@ def frexp(arg, stream=None):
 
     func = elementwise.get_frexp_kernel()
     func.prepared_async_call(arg._grid, arg._block, stream,
-            arg.gpudata, sig.gpudata, expt.gpudata, arg.mem_size)
+            arg, sig, expt, arg.mem_size)
 
     return sig, expt
 
@@ -111,7 +111,7 @@ def ldexp(significand, exponent, stream=None):
 
     func = elementwise.get_ldexp_kernel()
     func.prepared_async_call(significand._grid, significand._block, stream,
-            significand.gpudata, exponent.gpudata, result.gpudata,
+            significand, exponent, result,
             significand.mem_size)
 
     return result
@@ -129,7 +129,7 @@ def modf(arg, stream=None):
 
     func = elementwise.get_modf_kernel()
     func.prepared_async_call(arg._grid, arg._block, stream,
-            arg.gpudata, intpart.gpudata, fracpart.gpudata,
+            arg, intpart, fracpart,
             arg.mem_size)
 
     return fracpart, intpart
2 changes: 1 addition & 1 deletion pycuda/curandom.py
@@ -240,7 +240,7 @@ def rand(shape, dtype=np.float32, stream=None):
         raise NotImplementedError;
 
     func.prepared_async_call(result._grid, result._block, stream,
-            result.gpudata, np.random.randint(2**31-1), result.size)
+            result, np.random.randint(2**31-1), result.size)
 
     return result
 
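The only entry point touched here is pycuda.curandom.rand, which allocates its own (contiguous) output array; the diff merely switches to passing that array object, matching the new calling convention used throughout the branch. A short usage reminder, unchanged by this PR:

```python
import numpy as np
import pycuda.autoinit  # noqa: F401
import pycuda.curandom as curandom

# rand() allocates and fills a new GPUArray with uniform pseudorandom values.
r = curandom.rand((4, 4), dtype=np.float32)
print(r.get())
```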