Funatiq/bb_segsort

bb_segsort (segmented sort): Fast Segmented Sort on GPUs

This repository provides a fast segmented sort for NVIDIA GPUs. The library contains many parallel kernels, each tuned for a different range of segment lengths. In particular, the kernels for sorting short and medium segments are automatically generated so that they use GPU registers efficiently. More details about the kernels and the code generation can be found in the original paper.

Original Work

Improvements in this fork

  • Added a key-only version
  • Asynchronous execution using a single CUDA stream inside bb_segsort_run
  • No temporary memory allocation inside bb_segsort_run
  • Reduced memory overhead
  • Two-dimensional kernel grid to avoid index calculations
  • Boundary checks avoided by using one-past-the-end offsets
  • No dependency on Thrust

Interface differences

  • This version expects two offset arrays: one with the begin offsets and one with the (one-past-the-)end offsets of the segments.
  • If the segments are densely packed (the end of each segment is the begin of the next), you can use a single array and pass offsets and offsets+1. Be sure to include the last one-past-the-end offset, so the array holds one more entry than there are segments.
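
The offset convention above can be illustrated with a small host-side reference. This is only a sketch of the begin / one-past-the-end layout, not the library's GPU implementation: `segsort_reference` and `sort_packed_example` are hypothetical helpers written for this README, and they use `std::sort` on the CPU rather than any bb_segsort call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not part of bb_segsort): sorts each segment
// [seg_begin[i], seg_end[i]) of keys independently.
// seg_end[i] is the one-past-the-end offset of segment i.
void segsort_reference(std::vector<int>& keys,
                       const int* seg_begin,
                       const int* seg_end,
                       std::size_t num_segs)
{
    for (std::size_t i = 0; i < num_segs; ++i) {
        assert(0 <= seg_begin[i] && seg_begin[i] <= seg_end[i]);
        assert(static_cast<std::size_t>(seg_end[i]) <= keys.size());
        std::sort(keys.begin() + seg_begin[i], keys.begin() + seg_end[i]);
    }
}

// Densely packed case: one array with num_segs + 1 entries serves as both
// offset arrays, passed as offsets (begin) and offsets + 1 (end).
std::vector<int> sort_packed_example()
{
    std::vector<int> keys = {5, 3, 9, 2, 1, 8, 7, 6, 4};
    // 3 segments need 4 offsets; the final 9 is the last one-past-the-end offset.
    const std::vector<int> offsets = {0, 3, 5, 9};
    const std::size_t num_segs = offsets.size() - 1;
    segsort_reference(keys, offsets.data(), offsets.data() + 1, num_segs);
    return keys;
}
```

With separate begin and end arrays the segments do not have to be contiguous: elements that fall between two segments are simply left untouched by this reference.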

Usage

To use the segmented sort (bb_segsort), include bb_segsort.cuh (key-value) or bb_segsort_keys.cuh (key-only). Use bb_segsort(...) if you don't care about memory allocation or asynchronous execution, or use bb_segsort_run(...) and provide your own memory allocation and stream.

Note that bb_segsort uses an unstable sorting network as its building block, so elements with equal keys are not guaranteed to keep their original relative order.
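
Because the sort is unstable, a test that compares a key-value result element by element against a stable reference can fail even when the result is correct. The following host-side check is one way to validate a result without assuming stability. It is a sketch only: `is_valid_segsort`, `KeyVal`, and `demo` are hypothetical names, and the check inspects only elements inside the given segments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using KeyVal = std::pair<int, int>;  // (key, value)

// Hypothetical checker (not part of bb_segsort). Returns true if, in every
// segment, the keys of `out` are non-decreasing and the (key, value) pairs of
// `out` are a permutation of the pairs of `in`.
bool is_valid_segsort(const std::vector<KeyVal>& in,
                      const std::vector<KeyVal>& out,
                      const std::vector<int>& seg_begin,
                      const std::vector<int>& seg_end)
{
    if (in.size() != out.size() || seg_begin.size() != seg_end.size()) return false;
    for (std::size_t s = 0; s < seg_begin.size(); ++s) {
        const int b = seg_begin[s];
        const int e = seg_end[s];  // one-past-the-end offset
        if (b < 0 || e < b || static_cast<std::size_t>(e) > in.size()) return false;

        // Keys must be non-decreasing; values are ignored for the ordering.
        const bool keys_sorted = std::is_sorted(
            out.begin() + b, out.begin() + e,
            [](const KeyVal& x, const KeyVal& y) { return x.first < y.first; });
        if (!keys_sorted) return false;

        // Each value must still be attached to its original key.
        std::vector<KeyVal> expected(in.begin() + b, in.begin() + e);
        std::vector<KeyVal> actual(out.begin() + b, out.begin() + e);
        std::sort(expected.begin(), expected.end());
        std::sort(actual.begin(), actual.end());
        if (expected != actual) return false;
    }
    return true;
}

// Two outputs that differ only in the order of equal keys; both are accepted.
void demo()
{
    const std::vector<KeyVal> in = {{2, 10}, {1, 20}, {2, 30}, {1, 40}};
    const std::vector<int> seg_begin = {0};
    const std::vector<int> seg_end = {4};
    const std::vector<KeyVal> stable = {{1, 20}, {1, 40}, {2, 10}, {2, 30}};
    const std::vector<KeyVal> unstable = {{1, 40}, {1, 20}, {2, 30}, {2, 10}};
    assert(is_valid_segsort(in, stable, seg_begin, seg_end));
    assert(is_valid_segsort(in, unstable, seg_begin, seg_end));
}
```

If a deterministic order among equal keys is required, the order has to be encoded in the keys themselves, since the sort gives no guarantee for ties.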

Example

main.cu contains an example of how to use bb_segsort. Adapt the Makefile to fit your system. In particular, you may need to change ARCH to match the compute capability of your GPU. For example, if you are using a P100 GPU (compute capability 6.0), you should set ARCH to 60.

Compile using make:

$ make

After compilation, run the executable:

$ ./main.out

License

Please refer to the included LICENSE file.
