bb_segsort (segmented sort): Fast Segmented Sort on GPUs

This repository provides a fast segmented sort on NVIDIA GPUs. The library contains many parallel kernels for different types of segments. In particular, the kernels for solving short/medium segments are automatically generated to efficiently utilize registers in GPUs. More details about the kernels and code generation can be found in the original paper.

Original Work

Original GitHub repository
Contact Email: [email protected]

Improvements in this fork

Added key only version
Asynchronous execution using a single CUDA stream inside bb_segsort_run
No temporary memory allocation inside bb_segsort_run
Reduced memory overhead
Two dimensional kernel grid to avoid index calculations
Avoiding boundaries check by using one-past-the-end offset
No dependency on Thrust

Interface differences

This version expects two offset arrays, one for begin and one for (one-past-the-)end offsets of the segments
You can use a single array and pass offsets and offsets+1 if the segments are densly packed (end of a segment is begin of next segment). Be sure to include the last one-past-the-end offset.

Usage

To use the segmented sort (bb_segsort), you need to include the bb_segsort.cuh (key-value) or bb_segsort_keys.cuh (key only). Use bb_segsort(...) if you don't care about memory allocation or asynchronous execution, or use bb_segsort_run(...) and provide your own memory allocation and stream.

Note, bb_segsort utilizes an unstable sorting network as the building block; thus, equivalent elements are not guaranteed to keep the original relative order.

Example

main.cu contains an example of how to use (bb_segsort). Adapt the Makefile to fit your system. Especially, you may need to change the ARCH according to your GPU platform. For example, if you are using a P100 GPU, you should update ARCH to 61.

Compile using make:

$ make

After compilation, run the executable:

$ ./main.out

License

Please refer to the included LICENSE file.

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
src		src
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
main.cu		main.cu

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bb_segsort (segmented sort): Fast Segmented Sort on GPUs

Original Work

Improvements in this fork

Interface differences

Usage

Example

License

About

Releases

Packages

Languages

License

Funatiq/bb_segsort

Folders and files

Latest commit

History

Repository files navigation

bb_segsort (segmented sort): Fast Segmented Sort on GPUs

Original Work

Improvements in this fork

Interface differences

Usage

Example

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages