Typo in the addition kernel in muladd.cu #100

Open
ferrarioa5 opened this issue Sep 7, 2024 · 1 comment

Comments

@ferrarioa5

There is a typo in the add_kernel routine in the CUDA file muladd.cu. I assumed that this kernel should compute the element-wise sum of two tensors, but it actually computes their element-wise product:

__global__ void add_kernel(int numel, const float* a, const float* b, float* result) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < numel) result[idx] = a[idx] * b[idx];
}
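A possible fix, assuming the kernel is indeed meant to compute an element-wise sum, is to replace `*` with `+`. The standalone sketch below is not taken from the repository: only the add_kernel signature comes from muladd.cu, while the main function, the test values, and the launch configuration are illustrative assumptions used to check the corrected kernel in isolation.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Presumed intended kernel: the only change from the reported code is '*' -> '+'.
__global__ void add_kernel(int numel, const float* a, const float* b, float* result) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx < numel) result[idx] = a[idx] + b[idx];
}

int main() {
  // Illustrative host data; values are exactly representable in float,
  // so the host and device sums can be compared for exact equality.
  const int n = 100;
  std::vector<float> ha(n), hb(n), hr(n, 0.0f);
  for (int i = 0; i < n; ++i) {
    ha[i] = 0.5f * i;
    hb[i] = 1.0f + i;
  }

  const size_t bytes = n * sizeof(float);
  float *da = nullptr, *db = nullptr, *dr = nullptr;
  cudaMalloc(&da, bytes);
  cudaMalloc(&db, bytes);
  cudaMalloc(&dr, bytes);
  cudaMemcpy(da, ha.data(), bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(db, hb.data(), bytes, cudaMemcpyHostToDevice);

  // Hypothetical launch configuration: enough 256-thread blocks to cover n.
  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  add_kernel<<<blocks, threads>>>(n, da, db, dr);

  cudaError_t err = cudaGetLastError();
  if (err != cudaSuccess) {
    std::printf("kernel launch failed: %s\n", cudaGetErrorString(err));
    return 1;
  }

  // This copy also synchronizes with the kernel on the default stream.
  cudaMemcpy(hr.data(), dr, bytes, cudaMemcpyDeviceToHost);

  // Compare the device result against a sum computed on the host.
  bool ok = true;
  for (int i = 0; i < n; ++i) {
    if (hr[i] != ha[i] + hb[i]) ok = false;
  }
  std::printf(ok ? "match\n" : "mismatch\n");

  cudaFree(da);
  cudaFree(db);
  cudaFree(dr);
  return ok ? 0 : 1;
}
```

Compiling this with nvcc and running it on a CUDA-capable machine exercises only the kernel itself, independently of the PyTorch extension wrapper.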

This bug can be reproduced by running the following Python script:

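import extension_cpp as ext
import torch

device = 'cuda'
n = 100
a = torch.rand(n).to(device)
b = torch.rand(n).to(device)
add2 = torch.zeros(n).to(device)  # output buffer for the extension op

add1 = a + b                      # reference result from PyTorch
ext.ops.myadd_out(a, b, add2)     # extension result, written into add2
print(torch.equal(add1, add2))    # True only if both results match exactly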

The CPU implementation gives the correct result (with device="cpu" in the code above).

@cyk2018

cyk2018 commented Oct 1, 2024

Yes, I also think this is an error.
