Trying to make it work on CPU #6

tkf · 2021-02-23T02:51:40Z

No description provided.

vchuravy · 2021-02-23T03:12:48Z

src/transduce.jl

    ill = @index(Local, Linear)
-    igl = @index(Group, Linear)
+    igl = @uniform @index(Group, Linear)


Suggested change

-    igl = @uniform @index(Group, Linear)
+    igl = @index(Group, Linear)

I thought I might need to put it in @uniform since I'm using it after @synchronize?

FoldsKernelAbstractions.jl/src/transduce.jl, lines 173 to 175 in 18c7b8f:

    if t == 1
        @inbounds dest[igl] = shared[1]
    end

I can re-fetch the index there, though.

@index expressions are always valid (and are automatically inserted into the loop body).

Without @uniform, I'm now getting UndefVarError: igl not defined: https://github.com/JuliaFolds/FoldsKernelAbstractions.jl/pull/6/checks?check_run_id=1958277481

Actually, I just remembered that I put it in @uniform because other @uniform variables depend on it:

FoldsKernelAbstractions.jl/src/transduce.jl, lines 151 to 152 in 064aba9:

    offsetb = (igl - 1) * groupsize()[1]
    bound = max(0, nbasecases - offsetb)
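One way to picture the UndefVarError above: as far as I understand the CPU backend, each region between @synchronize barriers runs as its own loop over the work-items, so an ordinary local assigned before a barrier no longer exists after it. The sketch below is plain Julia, not KernelAbstractions code; `emulate_group`, `saved_offset`, and the loop structure are hypothetical and simplified for illustration.

```julia
# Hypothetical emulation of a single workgroup on the CPU (plain Julia).
# `igl` is the group index and `gsize` the workgroup size.
function emulate_group(src, igl, gsize)
    shared = zeros(eltype(src), gsize)   # stands in for @localmem
    saved_offset = zeros(Int, gsize)     # stands in for per-item (@private) storage

    # Region before the barrier: one loop over all work-items.
    for ill in 1:gsize
        offsetb = (igl - 1) * gsize      # ordinary local, scoped to this loop only
        saved_offset[ill] = offsetb      # kept alive by storing it per work-item
        shared[ill] = src[offsetb + ill]
    end

    # ---- @synchronize would sit here ----

    # Region after the barrier: a fresh loop. `offsetb` is not defined here,
    # so it has to be recomputed or read back from per-item storage.
    result = (zero(eltype(src)), 0)
    for ill in 1:gsize
        if ill == 1
            result = (sum(shared), saved_offset[ill])
        end
    end
    return result
end
```

Calling `emulate_group(collect(1:8), 2, 4)` sums the second group of four elements and returns the offset that was saved before the barrier.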

vchuravy · 2021-02-23T03:14:06Z

src/transduce.jl

-    t = ill
-    s = 1
-    c = nextpow(2, groupsize()[1]) >> 1
+    @private m = ill - 1


I suspect this form of @private is rather slow until KernelCompiler.jl lands. Using the older form is likely faster.
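For reference, the older form declares per-work-item storage as a small array and indexes into it. This is a minimal sketch, assuming a recent KernelAbstractions.jl (0.9 or later) for the launch code; `private_demo!`, the array length, and the workgroup size are made up for illustration. In the versions current at the time of this thread, launching a kernel returned an event to `wait` on instead.

```julia
using KernelAbstractions

# Hypothetical kernel, not the code from this PR.
@kernel function private_demo!(out)
    i = @index(Global, Linear)
    # "Classic" form: per-work-item storage declared as a one-element array.
    m = @private Int (1,)
    m[1] = i - 1
    # The newer form discussed above would read: @private m = i - 1
    @synchronize
    out[i] = m[1]   # the private value is read back after the barrier
end

backend = CPU()
out = zeros(Int, 8)
# Workgroup size 4 is an arbitrary choice for this sketch.
private_demo!(backend, 4)(out; ndrange = length(out))
# Assumes the 0.9+ API; older versions use `wait(event)` on the launch result.
KernelAbstractions.synchronize(backend)
```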

piever · 2021-03-02T08:29:27Z

Not sure if this is helpful, but I've also played with block-level reduction with KernelAbstractions. I could get it to work on GPU (code similar to the one here), but not on CPU.

I also tried replacing s and c with @private Int (1,), but using c[1] != 0 as the condition to break out of the while loop throws an error (KernelAbstractions still seems to treat s[1] as a vector rather than a scalar for some reason). I would love to know if a solution can be found here!
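To make the role of the halving stride `c` concrete, here is a plain-Julia sketch of a shared-memory tree reduction for one workgroup, with the per-item work written as an explicit loop between barriers. It is a generic textbook scheme, not necessarily the algorithm in this PR; `group_reduce` is a hypothetical name, and `op` is assumed to be associative and commutative.

```julia
# Hypothetical emulation of a block-level tree reduction (plain Julia).
function group_reduce(op, xs)
    n = length(xs)
    shared = copy(xs)            # stands in for workgroup-shared memory
    c = nextpow(2, n) >> 1       # initial stride: half the padded group size
    while c != 0
        # One region between barriers: every active work-item t combines
        # its slot with the slot c positions to the right.
        for t in 1:c
            if t + c <= n
                shared[t] = op(shared[t], shared[t + c])
            end
        end
        # ---- a barrier would sit here ----
        c >>= 1                  # same value for all work-items
    end
    return shared[1]
end
```

For example, `group_reduce(+, collect(1:5))` sums the five elements in three rounds with strides 4, 2, and 1. The stride takes the same value for every work-item, which is why it can live outside the per-item loop here; in a real kernel it has to stay valid across each barrier, which is the problem the @uniform and @private attempts above are trying to solve.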

tkf · 2021-03-03T02:07:20Z

Yeah, I wonder if we need to tweak something in KernelAbstractions.jl, or maybe just wait for KernelCompiler.jl.

cc @vchuravy

tkf added 2 commits February 22, 2021 21:45

Enable CPU tests

bf3fd9e

Trying to make it work on CPU

18c7b8f

vchuravy reviewed Feb 23, 2021


tkf added 3 commits February 22, 2021 23:17

Merge branch 'master' into cpu

eef40e8

Use "classic" at-private syntax

9a04b07

Avoid using at-uniform

68d40dd
