Process parquet bools with microkernels #17157

pmattione-nvidia · 2024-10-23T19:32:27Z

This adds support for the bool type to the parquet reader microkernels. Both plain (bit-packed) and RLE-encoded bool decoding are supported, using separate code paths. This PR also massively reduces boilerplate code, as most of the template info needed is already encoded in the kernel mask. In addition, the superfluous level_t template parameter on rle_run has been removed, and bools have been added to the parquet benchmarks.
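For readers unfamiliar with the two encodings, here is a minimal host-side sketch of what each decode path has to do. It is not the GPU code from this PR: the function names `decode_plain_bools` and `decode_rle_bools` are hypothetical, the input is assumed to be an already-decompressed page payload, and any length prefix in front of the RLE data is assumed to have been stripped. Plain bools are one bit per value, least-significant bit first; RLE bools use the Parquet RLE/bit-packed hybrid with a bit width of 1.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a libcudf API): PLAIN-encoded bools are bit-packed,
// one bit per value, least-significant bit first within each byte.
std::vector<bool> decode_plain_bools(std::vector<std::uint8_t> const& data,
                                     std::size_t num_values)
{
  assert(data.size() * 8 >= num_values);  // the buffer must hold enough bits
  std::vector<bool> out(num_values);
  for (std::size_t i = 0; i < num_values; ++i) {
    out[i] = ((data[i / 8] >> (i % 8)) & 1u) != 0;
  }
  return out;
}

// Hypothetical helper (not a libcudf API): RLE/bit-packed hybrid, bit width 1.
// Each run starts with a ULEB128 header:
//   low bit 0 -> RLE run: (header >> 1) copies of the value in the next byte
//   low bit 1 -> bit-packed run: (header >> 1) groups of 8 values, 1 byte each
// Assumption: any length prefix before the runs has already been removed.
std::vector<bool> decode_rle_bools(std::vector<std::uint8_t> const& data,
                                   std::size_t num_values)
{
  std::vector<bool> out;
  out.reserve(num_values);
  std::size_t pos = 0;
  while (out.size() < num_values && pos < data.size()) {
    // Read the variable-length run header.
    std::uint32_t header = 0;
    int shift            = 0;
    while (pos < data.size() && shift < 32) {
      std::uint8_t const byte = data[pos++];
      header |= static_cast<std::uint32_t>(byte & 0x7f) << shift;
      if ((byte & 0x80) == 0) { break; }
      shift += 7;
    }
    if ((header & 1u) != 0) {
      // Bit-packed run: unpack each byte LSB first, dropping trailing padding.
      std::size_t const num_groups = header >> 1;
      for (std::size_t g = 0; g < num_groups && pos < data.size(); ++g) {
        std::uint8_t const bits = data[pos++];
        for (int b = 0; b < 8 && out.size() < num_values; ++b) {
          out.push_back(((bits >> b) & 1u) != 0);
        }
      }
    } else {
      // RLE run: a single stored value repeated run_length times.
      std::size_t const run_length = header >> 1;
      if (pos >= data.size()) { break; }
      bool const value = (data[pos++] & 1u) != 0;
      for (std::size_t r = 0; r < run_length && out.size() < num_values; ++r) {
        out.push_back(value);
      }
    }
  }
  return out;
}
```

The two functions share almost nothing, which is consistent with the description's note that the encodings use separate code paths.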

Performance: register count drops from 62 to 56, and both plain and RLE-encoded bool decoding are now 46% faster (uncompressed). Reading sample customer data shows no change. NDS tests show no change.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

PointKernel

Looks great. Only some questions/nits

cpp/benchmarks/io/nvbench_helpers.hpp

cpp/src/io/parquet/reader_impl.cpp

cpp/src/io/parquet/decode_fixed.cu

PointKernel

LGTM

cpp/src/io/parquet/decode_fixed.cu

nvdbaranec

Looks great. I like how the meat of the implementation is so small. The generic kernel continues to dissolve :)

cpp/src/io/parquet/parquet_gpu.hpp

cpp/src/io/parquet/decode_fixed.cu

vuule

great stuff!

vuule · 2024-11-06T23:42:11Z

cpp/src/io/parquet/reader_impl.cpp

  int s_idx = 0;
+
+  auto decode_data = [&](decode_kernel_mask decoder_mask) {
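As a rough illustration of how a single mask argument can replace a pile of template arguments, here is a sketch of a switch-based dispatch in which every compile-time property is derived from the mask value itself. Everything in it is hypothetical: `example_mask`, its enumerators, `is_nested`, `is_bool`, `launch_decode` and `dispatch_decode` are invented for the example and are not the actual `decode_kernel_mask` values or launch code in libcudf. The "kernel launch" just returns a description string so the example can run on the host.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical mask bits, for illustration only.
enum class example_mask : std::uint32_t {
  FIXED_WIDTH        = 1u << 0,
  FIXED_WIDTH_NESTED = 1u << 1,
  BOOLEAN            = 1u << 2,
  BOOLEAN_NESTED     = 1u << 3,
};

// Compile-time traits derived from the mask, so callers never pass them.
template <example_mask Mask>
constexpr bool is_nested()
{
  return Mask == example_mask::FIXED_WIDTH_NESTED || Mask == example_mask::BOOLEAN_NESTED;
}

template <example_mask Mask>
constexpr bool is_bool()
{
  return Mask == example_mask::BOOLEAN || Mask == example_mask::BOOLEAN_NESTED;
}

// Stand-in for a templated kernel launch: reports which variant was selected.
template <example_mask Mask>
std::string launch_decode()
{
  std::string description = is_bool<Mask>() ? "bool" : "fixed-width";
  if (is_nested<Mask>()) { description += "/nested"; }
  return description;
}

// Runtime-to-compile-time bridge: one switch case per kernel variant.
std::string dispatch_decode(example_mask mask)
{
  auto const bits = static_cast<std::uint32_t>(mask);
  assert(bits != 0 && (bits & (bits - 1)) == 0);  // exactly one kernel type per call
  switch (mask) {
    case example_mask::FIXED_WIDTH: return launch_decode<example_mask::FIXED_WIDTH>();
    case example_mask::FIXED_WIDTH_NESTED:
      return launch_decode<example_mask::FIXED_WIDTH_NESTED>();
    case example_mask::BOOLEAN: return launch_decode<example_mask::BOOLEAN>();
    case example_mask::BOOLEAN_NESTED: return launch_decode<example_mask::BOOLEAN_NESTED>();
  }
  return "unsupported";
}
```

With this shape, adding a new kernel variant means adding one enumerator and one case, rather than threading extra template parameters through every call site.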


pmattione-nvidia and others added 30 commits August 12, 2024 16:31

work in progress

b5ec22e

Further work in list code

2ca9618

Tests working

4b5f91a

Revert page_decode changes

ead17b8

Merge branch 'branch-24.10' into parquet_list_kernel

cc32409

Add debugging

0dccec5

Tests working

e239e79

Merge branch 'branch-24.10' into parquet_list_kernel

8f25453

compile fixes

24c9ab1

No need to decode def levels if not nullable

342c2f4

Manual block scan

50bbc94

Optimize parquet reader block scans, simplify and consolidate non-nullable column code

5390661

tweak syncing

3ef7b0d

small tweaks

7882879

Merge branch 'branch-24.10' into parquet_list_kernel

8852839

Add skipping to rle_stream, use for lists (chunked reads)

e285fbf

tweak scan interface for linked lists

254f3e9

Merge branch 'branch-24.12' into mukernels_fixedwidth_optimize

18d989c

style fixes

8ea1e0e

Merge branch 'mukernels_fixedwidth_optimize' of https://github.com/pmattione-nvidia/cudf into mukernels_fixedwidth_optimize

326b386

Update cpp/src/io/parquet/decode_fixed.cu

41cb982

Co-authored-by: nvdbaranec <[email protected]>

Update cpp/src/io/parquet/decode_fixed.cu

6e70554

Co-authored-by: nvdbaranec <[email protected]>

Update cpp/src/io/parquet/decode_fixed.cu

9ad4415

Co-authored-by: nvdbaranec <[email protected]>

Unroll block-count loop

3a1fc95

Merge branch 'mukernels_fixedwidth_optimize' of https://github.com/pmattione-nvidia/cudf into mukernels_fixedwidth_optimize

0babf46

more style fixes

5ab9829

Merge branch 'branch-24.12' into mukernels_fixedwidth_optimize

310d50c

Disable manual block scan for non-lists

4471022

Update cpp/src/io/parquet/decode_fixed.cu

c0ed2cb

Co-authored-by: Vukasin Milovanovic <[email protected]>

Merge branch 'mukernels_fixedwidth_optimize' of https://github.com/pmattione-nvidia/cudf into mukernels_fixedwidth_optimize

c2139ef

pmattione-nvidia added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Oct 23, 2024

pmattione-nvidia self-assigned this Oct 23, 2024

github-actions bot added the libcudf label ("Affects libcudf (C++/CUDA) code.") Oct 23, 2024

pmattione-nvidia and others added 10 commits October 23, 2024 15:44

Merge branch 'branch-24.12' into parquet_list_kernel

a82ae40

style fixes

86b6074

Merge branch 'branch-24.12' into mukernels_bools

99b9f0c

Merge remote-tracking branch 'origin/parquet_list_kernel' into mukernels_bools

db15506

bool list working

d914303

style fixes

4576f89

more style fixes

e6b98c5

remove extra encoding

c9154ef

Merge branch 'branch-24.12' into mukernels_bools

86ade66

fix merge issues

c039805

pmattione-nvidia marked this pull request as ready for review October 29, 2024 22:17

pmattione-nvidia requested a review from a team as a code owner October 29, 2024 22:17

pmattione-nvidia requested review from vyasr, PointKernel, nvdbaranec and vuule October 29, 2024 22:17

pmattione-nvidia added 2 commits October 30, 2024 12:58

Reduce kernel boilerplate with switch

388cdbe

Nuke more boilerplate code

b877ba3

PointKernel reviewed Oct 30, 2024

cpp/benchmarks/io/nvbench_helpers.hpp

cpp/src/io/parquet/reader_impl.cpp

cpp/src/io/parquet/decode_fixed.cu

cpp/src/io/parquet/decode_fixed.cu (outdated)

pmattione-nvidia and others added 2 commits October 31, 2024 11:06

Update cpp/src/io/parquet/decode_fixed.cu

8984cce

Co-authored-by: Yunsong Wang <[email protected]>

fix style

a0a5060

PointKernel approved these changes Nov 1, 2024

cpp/src/io/parquet/decode_fixed.cu

nvdbaranec approved these changes Nov 6, 2024

cpp/src/io/parquet/parquet_gpu.hpp

cpp/src/io/parquet/decode_fixed.cu

cpp/src/io/parquet/decode_fixed.cu (outdated)

pmattione-nvidia and others added 2 commits November 6, 2024 13:01

Update cpp/src/io/parquet/decode_fixed.cu

9e46ddd

Co-authored-by: nvdbaranec <[email protected]>

Merge branch 'branch-24.12' into mukernels_bools

5840ceb

vuule approved these changes Nov 6, 2024

cpp/src/io/parquet/reader_impl.cpp

int s_idx = 0;

auto decode_data = [&](decode_kernel_mask decoder_mask) {

vuule Nov 6, 2024

❤️
