Zero-out Kernel #865

Open
kab163 opened this issue Jan 18, 2024 · 2 comments
Comments

@kab163
Contributor

kab163 commented Jan 18, 2024

Is your feature request related to a problem? Please describe.

We need a fast way to zero out a large array of GPU (device) memory.

Describe the solution you'd like

We want to allocate a large array and zero it out. Umpire could provide a built-in function for this, something like malloc_zero_out_kernel(nbytes);, so that we do not have to write our own kernel to zero the memory.

Describe alternatives you've considered

Using the resource manager to do a memset takes too long. Allocating an array and then launching a kernel to zero out memory could work, but that adds more code.

Additional context

See teams conversation here.

@kab163
Contributor Author

kab163 commented Jan 19, 2024

Another idea: rather than supporting only zero, allow a range of memory to be set to any value (or at least -1 and NaN, maybe others).
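
To illustrate why arbitrary fill values probably need a typed fill rather than a plain memset, here is a minimal host-only C++ sketch. The helper names fill_bytes and fill_value are hypothetical and are not Umpire APIs. It assumes two's-complement integers and IEEE 754 floats: a byte-wise fill with 0x00 gives zero and 0xFF gives -1 for signed integers and a NaN for floats, but no single byte pattern gives, say, -1.0f.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not an Umpire API): byte-wise fill with memset
// semantics. Every byte of the range becomes `byte`.
template <typename T>
void fill_bytes(T* data, std::size_t count, unsigned char byte)
{
  if (count == 0) {
    return;  // nothing to do; also avoids calling memset on a null pointer
  }
  assert(data != nullptr);
  std::memset(data, byte, count * sizeof(T));
}

// Hypothetical helper (not an Umpire API): typed fill. Every element
// becomes `value`, which is what a templated device kernel would do.
template <typename T>
void fill_value(T* data, std::size_t count, T value)
{
  assert(data != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    data[i] = value;
  }
}
```

With these, fill_bytes(ptr, n, 0xFF) on an int32_t array yields -1 in every element, while getting -1.0f into a float array requires fill_value(ptr, n, -1.0f). A device version would presumably follow the same split.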

@kab163
Contributor Author

kab163 commented Oct 10, 2024

We should write a benchmark that, in a loop, compares device memory allocation followed by a CUDA memset to zero against doing the same with a "zero-out kernel" in Umpire.
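
A rough sketch of what the timing loop could look like, in host-only C++. Everything here is a stand-in: mean_seconds, alloc_and_memset, and alloc_and_loop_zero are hypothetical names, and malloc/memset take the place of the device allocation and CUDA memset. A real device benchmark would swap in the Umpire allocation and the kernel launch, and since GPU work is typically asynchronous with respect to the host, it would need to synchronize the device before reading the stop time.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical harness: run `body` `iterations` times and return the
// mean wall-clock seconds per iteration.
template <typename Body>
double mean_seconds(std::size_t iterations, Body&& body)
{
  assert(iterations > 0);
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    body();
  }
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iterations);
}

// Host stand-in for "allocate, then memset to zero".
bool alloc_and_memset(std::size_t nbytes)
{
  assert(nbytes > 0);
  auto* ptr = static_cast<unsigned char*>(std::malloc(nbytes));
  if (ptr == nullptr) {
    return false;
  }
  std::memset(ptr, 0, nbytes);
  const bool zeroed = (ptr[0] == 0) && (ptr[nbytes - 1] == 0);
  std::free(ptr);
  return zeroed;
}

// Host stand-in for "allocate, then zero with a hand-written loop",
// playing the role of the zero-out kernel.
bool alloc_and_loop_zero(std::size_t nbytes)
{
  assert(nbytes > 0);
  auto* ptr = static_cast<unsigned char*>(std::malloc(nbytes));
  if (ptr == nullptr) {
    return false;
  }
  for (std::size_t i = 0; i < nbytes; ++i) {
    ptr[i] = 0;
  }
  const bool zeroed = (ptr[0] == 0) && (ptr[nbytes - 1] == 0);
  std::free(ptr);
  return zeroed;
}
```

Comparing mean_seconds(100, [] { alloc_and_memset(1 << 26); }) with the same call on alloc_and_loop_zero gives the two numbers to report. The host timings themselves say nothing about GPU performance; only the loop structure carries over.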
