Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process parquet bools with microkernels #17157

Open
wants to merge 65 commits into
base: branch-24.12
Choose a base branch
from

Conversation

pmattione-nvidia
Copy link
Contributor

@pmattione-nvidia pmattione-nvidia commented Oct 23, 2024

This adds support for the bool type to reading parquet microkernels. Both plain (bit-packed) and RLE-encoded bool decode is supported, using separate code paths. This PR also massively reduces boilerplate code, as most of the template info needed is already encoded in the kernel mask. Also the superfluous level_t template parameter on rle_run has been removed. And bools have been added to the parquet benchmarks.

Performance: register count drops from 62 -> 56, both plain and RLE-encoded bool decoding are now 46% faster (uncompressed). Reading sample customer data shows no change. NDS tests show no change.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

pmattione-nvidia and others added 30 commits August 12, 2024 16:31
@pmattione-nvidia pmattione-nvidia added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 23, 2024
@pmattione-nvidia pmattione-nvidia self-assigned this Oct 23, 2024
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Oct 23, 2024
@pmattione-nvidia pmattione-nvidia marked this pull request as ready for review October 29, 2024 22:17
@pmattione-nvidia pmattione-nvidia requested a review from a team as a code owner October 29, 2024 22:17
Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Only some questions/nits

cpp/benchmarks/io/nvbench_helpers.hpp Show resolved Hide resolved
cpp/src/io/parquet/reader_impl.cpp Show resolved Hide resolved
cpp/src/io/parquet/decode_fixed.cu Show resolved Hide resolved
cpp/src/io/parquet/decode_fixed.cu Outdated Show resolved Hide resolved
Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

cpp/src/io/parquet/decode_fixed.cu Show resolved Hide resolved
Copy link
Contributor

@nvdbaranec nvdbaranec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I like how the meat of the implementation is so small. The generic kernel continues to dissolve :)

cpp/src/io/parquet/parquet_gpu.hpp Show resolved Hide resolved
cpp/src/io/parquet/decode_fixed.cu Show resolved Hide resolved
cpp/src/io/parquet/decode_fixed.cu Outdated Show resolved Hide resolved
Copy link
Contributor

@vuule vuule left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great stuff!

int s_idx = 0;

auto decode_data = [&](decode_kernel_mask decoder_mask) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Performance Performance related issue
Projects
Status: In Progress
Status: Burndown
Development

Successfully merging this pull request may close these issues.

4 participants