Consider an official CUB-like library for Slang #4366

natevm · 2024-06-12T22:51:44Z

natevm
Jun 12, 2024
Collaborator

In the world of high-performance GPU computing, we need high-performance reference implementations of sorts, histograms, prefix sums, and so on. One of the reasons CUDA is so successful is that NVIDIA provides "CUB", which has thoroughly tested implementations of all of these compute operations.

We often need these operations for realtime ray tracing and hyperscale graphics. Prefix sums and radix sorts are critical for custom tree construction, clustering elements, and high-performance mesh processing (e.g. merging common vertices to compute normals, finding silhouettes, generating new vertices, and complementary physics modeling).

With quite a bit of work, I've been able to reproduce most of the common CUB operations in Slang by porting @b0nes164's implementations of the OneSweep sorting algorithm and the decoupled look-back scan.

The OneSweep implementation is here:
https://github.com/b0nes164/GPUSorting.git

And then the Scan implementation is here:
https://github.com/b0nes164/GPUPrefixSums

Note that extending scan to support CUB's "partition" and "select" operations requires only minimal changes to the very end of the scan operation. I've done this myself before with very little additional code.
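To illustrate why select and partition fall out of scan so cheaply, here is a minimal CPU reference sketch in Python. It is not the Slang port or CUB's implementation, and the names `exclusive_scan`, `select`, and `partition` are hypothetical. The idea it shows: an exclusive prefix sum over 0/1 flags already gives every kept item its output slot, so the only extra work is a scatter at the end. Keeping rejected items in their original order is a choice made for this sketch and may differ from what CUB does.

```python
from itertools import accumulate


def exclusive_scan(values):
    """Exclusive prefix sum: out[i] is the sum of values[0..i-1]."""
    if not values:
        return []
    return [0] + list(accumulate(values))[:-1]


def select(items, predicate):
    """Stream compaction: keep items passing `predicate`, preserving order."""
    flags = [1 if predicate(x) else 0 for x in items]
    offsets = exclusive_scan(flags)
    total = (offsets[-1] + flags[-1]) if items else 0
    out = [None] * total
    # The "very end of the scan": each kept item scatters to its scanned offset.
    for i, x in enumerate(items):
        if flags[i]:
            out[offsets[i]] = x
    return out


def partition(items, predicate):
    """Kept items first, rejected items after; returns (output, num_kept)."""
    flags = [1 if predicate(x) else 0 for x in items]
    offsets = exclusive_scan(flags)
    num_kept = (offsets[-1] + flags[-1]) if items else 0
    out = [None] * len(items)
    for i, x in enumerate(items):
        if flags[i]:
            out[offsets[i]] = x
        else:
            # i - offsets[i] counts the rejected items that precede index i.
            out[num_kept + (i - offsets[i])] = x
    return out, num_kept
```

On a GPU the scatter loop would run one item per thread, with the offsets coming straight out of the scan, so no second pass over the data is needed.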

Still, many users coming from CUDA to Slang hit the same roadblock: Slang has nearly the same compute intrinsics as CUDA, and many benefits over CUDA too, but it fundamentally lacks a library like CUB.

Heck, even AMD has a sorter implementation in their FidelityFX SDK: https://github.com/GPUOpen-LibrariesAndSDKs/FidelityFX-SDK/tree/main I don't think this implementation is nearly as fast as the one by b0nes, and it's aimed more specifically at AMD hardware, but my point is that there's a legitimate need for these things.

So, this seems like something that, albeit with some initial effort, could be done in a more official capacity by NVIDIA, and it would add a great deal of value to the Slang ecosystem. I also feel that Slang's advanced language features could really be put to the test and proven out by a library like this. So I think this is something we should seriously consider doing.
