char* doesn't work #716

Open
hakanrw opened this issue Aug 18, 2024 · 8 comments

Comments

@hakanrw

hakanrw commented Aug 18, 2024

while testing some OpenCL examples, i found out that char* is not working on my system (Raspberry Pi 5).

code from hello_world.cl

/**
 * This kernel function only fills a buffer with the sentence 'Hello World!'.
 **/

__kernel void helloWorld(__global char* data){
    data[0] = 'H';
    data[1] = 'e';
    data[2] = 'l';
    data[3] = 'l';
    data[4] = 'o';
    data[5] = ' ';
    data[6] = 'W';
    data[7] = 'o';
    data[8] = 'r';
    data[9] = 'l';
    data[10] = 'd';
    data[11] = '!';
    data[12] = '\n';
    data[13] = 0;
}

it generates the following error

error: 28: Structure id 10 decorated as Block for variable in StorageBuffer storage class must follow relaxed storage buffer layout rules: member 0 contains an array with stride 1 not satisfying alignment to 4
  %_struct_10 = OpTypeStruct %_runtimearr_uint

Build Status: -2
Build Log:
clvk-RoBPoh/source.cl:22:1: warning: null character ignored
   22 | <U+0000>
      | ^
clvk-RoBPoh/source.cl:22:2: warning: no newline at end of file
   22 | <U+0000>
      |         ^

changing __global char* type to __global int* resolves the issue.

example source

@kpet
Owner

kpet commented Aug 19, 2024

Thanks for the report. This is an issue in clspv. I suggest you create an issue in the clspv project directly: https://github.com/google/clspv/issues.

We can keep this one to track the need to update clspv when the fix lands.

@Rekt3421
Contributor

Rekt3421 commented Sep 4, 2024

Definitely an issue with clspv.

leaving the kernel options here

-cl-single-precision-constant -cl-kernel-arg-info -rounding-mode-rte=16,32,64 -int8=0 -std430-ubo-layout -decorate-nonuniform -hack-convert-to-float -arch=spir --use-native-builtins=ceil,copysign,fabs,fdim,floor,fmax,fmin,half_cos,half_exp,half_exp10,half_exp2,half_rsqrt,half_sin,half_sqrt,half_tan,isequal,isfinite,isgreater,isgreaterequal,isinf,isless,islessequal,islessgreater,isnan,isnormal,isnotequal,isordered,isunordered,mad,rint,round,rsqrt,signbit,sqrt,trunc, -spv-version=1.6 -max-pushconstant-size=256 -max-ubo-size=65536 -global-offset -long-vector -module-constants-in-storage-buffer -cl-arm-non-uniform-work-group-size -enable-printf -printf-buffer-size=1048576

@rjodinchr
Contributor

We only need to update clspv to fix that issue now.

@rjodinchr
Contributor

Should be fixed now

@hakanrw
Author

hakanrw commented Nov 17, 2024

thank you for your interest in my bug report. i appreciate the ongoing work being done.

i compiled the latest main branch, however the issue persists with the exact same error on my machine.

clspv version output:

LLVM (http://llvm.org/):
  LLVM version 19.0.0git                                            Optimized build.

@rjodinchr
Contributor

I've just checked, and this is working on my side.

Are you sure you have updated the submodules and the clspv dependencies?

@hakanrw
Author

hakanrw commented Nov 18, 2024

my apologies, you are completely right! i had forgotten to update the submodules - i have taken care of it now.

the opencl kernel compiled without issues! this time, though, i'm getting a strange runtime behaviour where the code i provided modifies the string as if it was an integer array (instead of incrementing the memory address by one for each index, it increments by four).

to illustrate, here’s what happens when i iterate over the string:

$ ./hello_world | xxd

00000000: 4800 0000 6500 0000 6c00 0000 6c00 0000  H...e...l...l...
00000010: 6f00 0000 2000 0000 5700 0000 6f00 0000  o... ...W...o...
00000020: 7200 0000 6c00 0000 6400 0000 2100 0000  r...l...d...!...
00000030: 0a00 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0a                                       .

notice how only every fourth byte is written, rather than a contiguous string.
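The dump is consistent with each character being stored as a 32-bit word rather than as a single byte. A minimal host-side C sketch of the two layouts follows. It is not clvk or clspv code: the helper names `store_as_bytes` and `store_as_words` are hypothetical, and the word store is spelled out in little-endian order to match the dump.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the intended behaviour, one byte per index.
 * buf must hold at least n bytes. */
static void store_as_bytes(unsigned char *buf, const char *msg, size_t n) {
    for (size_t i = 0; i < n; i++) {
        buf[i] = (unsigned char)msg[i];
    }
}

/* Hypothetical helper: the observed behaviour, one 32-bit word per index,
 * written in little-endian byte order. buf must hold at least 4 * n bytes. */
static void store_as_words(unsigned char *buf, const char *msg, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint32_t word = (unsigned char)msg[i];
        buf[4 * i + 0] = (unsigned char)(word & 0xffu);
        buf[4 * i + 1] = (unsigned char)((word >> 8) & 0xffu);
        buf[4 * i + 2] = (unsigned char)((word >> 16) & 0xffu);
        buf[4 * i + 3] = (unsigned char)((word >> 24) & 0xffu);
    }
}
```

Calling `store_as_words` on a zeroed buffer and piping the result through xxd should give the same `H...e...l...` pattern, while `store_as_bytes` gives the contiguous string.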

i suspect this might be an aarch64-specific issue. can you give my modified example a try on your system? i attached my own output into output.txt for comparison.

thanks and regards

@rjodinchr
Contributor

rjodinchr commented Nov 19, 2024

This is definitely a bug in clspv.

It is translating the following:

  %0 = call ptr addrspace(1) @_Z14clspv.resource.0(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, { [0 x i8] } zeroinitializer)
  %1 = getelementptr { [0 x i8] }, ptr addrspace(1) %0, i32 0, i32 0, i32 0
  store i8 72, ptr addrspace(1) %1, align 1
  %2 = getelementptr { [0 x i8] }, ptr addrspace(1) %0, i32 0, i32 0, i32 1
  store i8 101, ptr addrspace(1) %2, align 1

into:

         %19 = OpAccessChain %_ptr_StorageBuffer_uint %12 %uint_0 %uint_0
               OpStore %19 %uint_72 Aligned 1
         %22 = OpAccessChain %_ptr_StorageBuffer_uint %12 %uint_0 %uint_1
               OpStore %22 %uint_101 Aligned 1
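In the LLVM IR the index selects an element of an i8 array, so index 1 is one byte past index 0. In the SPIR-V the same index selects an element of a uint array, so it lands four bytes later. As an illustration only, and not necessarily how clspv lowers byte stores, a byte store onto 32-bit storage can be emulated by splitting the byte index into a word index and a bit shift. The name `store_byte_in_words` is hypothetical, and little-endian packing is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store one byte at byte index i into storage that is
 * only addressable as 32-bit words. Assumes little-endian packing, i.e.
 * byte 0 occupies the lowest 8 bits of word 0. */
static void store_byte_in_words(uint32_t *words, size_t i, uint8_t value) {
    size_t word_index = i / 4;                 /* which 32-bit word */
    unsigned shift = (unsigned)(i % 4) * 8u;   /* bit offset inside that word */
    uint32_t mask = (uint32_t)0xffu << shift;

    /* Clear the target byte, then merge in the new value. */
    words[word_index] = (words[word_index] & ~mask) | ((uint32_t)value << shift);
}
```

Storing 'H', 'e', 'l', 'l' at byte indices 0 to 3 this way fills a single word with 0x6c6c6548 instead of spreading the characters over four words. The read-modify-write is not atomic, so concurrent writers to the same word would need extra care.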
