Typo in the addition kernel in muladd.cu #100

ferrarioa5 · 2024-09-07T10:05:13Z

There is a typo in the add_kernel routine in the CUDA file muladd.cu. I assumed that this kernel should compute the sum of two tensors, but it actually computes their element-wise product:

__global__ void add_kernel(int numel, const float* a, const float* b, float* result) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < numel) result[idx] = a[idx] * b[idx];
}

The bug can be reproduced by running the following Python script:

import extension_cpp as ext
import torch

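device = 'cuda'
n = 100
a = torch.rand(n).to(device)
b = torch.rand(n).to(device)
add2 = torch.zeros(n).to(device)  # output buffer written by the extension

add1 = a + b  # reference result computed by PyTorch
ext.ops.myadd_out(a, b, add2)
print(torch.equal(add1, add2))  # True is expected; the bug makes the two differ on CUDA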

The CPU implementation gives the correct result (with device="cpu" in the code above).
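
A minimal sketch of the fix, assuming the kernel is only meant to mirror the CPU addition path: the kernel signature is the one quoted above, and the only change is `+` in place of `*`. The `main` function, the input values, and the launch configuration (256 threads per block) are assumptions added here so the example can be compiled and checked on its own with nvcc; they are not taken from the repository.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same signature as the kernel quoted in this issue; only the operator changes.
__global__ void add_kernel(int numel, const float* a, const float* b, float* result) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < numel) result[idx] = a[idx] + b[idx];  // '+' instead of '*'
}

int main() {
  const int n = 100;
  const size_t bytes = n * sizeof(float);
  float ha[n], hb[n], hr[n];
  // Inputs chosen so that every sum is exactly representable in float.
  for (int i = 0; i < n; ++i) { ha[i] = 0.5f * i; hb[i] = 2.0f + i; }

  float *da = nullptr, *db = nullptr, *dr = nullptr;
  if (cudaMalloc(&da, bytes) != cudaSuccess ||
      cudaMalloc(&db, bytes) != cudaSuccess ||
      cudaMalloc(&dr, bytes) != cudaSuccess) {
    std::printf("cudaMalloc failed\n");
    return 1;
  }
  cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

  // Hypothetical launch configuration: one thread per element, rounded up.
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  add_kernel<<<blocks, threads>>>(n, da, db, dr);

  // The blocking copy back to the host also waits for the kernel to finish.
  if (cudaMemcpy(hr, dr, bytes, cudaMemcpyDeviceToHost) != cudaSuccess) {
    std::printf("kernel launch or copy failed\n");
    return 1;
  }
  cudaFree(da); cudaFree(db); cudaFree(dr);

  // Compare against the sum computed on the host.
  for (int i = 0; i < n; ++i) {
    if (hr[i] != ha[i] + hb[i]) {
      std::printf("mismatch at index %d\n", i);
      return 1;
    }
  }
  std::printf("all %d elements match\n", n);
  return 0;
}
```

With `*` restored in the kernel body, the comparison loop reports a mismatch, which is the same symptom the Python script above shows through `torch.equal`.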

cyk2018 · 2024-10-01T17:59:54Z

Yes, I also think this is an error.
