GH-38042: [C++][Benchmark] Add non-stream Codec Compression/Decompression #38067
+1
BENCHMARK_TEMPLATE(ReferenceCompression, Compression::LZ4_FRAME);
BENCHMARK_TEMPLATE(ReferenceStreamingDecompression, Compression::LZ4_FRAME);
BENCHMARK_TEMPLATE(ReferenceDecompression, Compression::LZ4_FRAME);
Is LZ4_FRAME OK? It seems that Parquet doesn't use LZ4_FRAME.
We can even benchmark both LZ4 variants.
It seems that Parquet doesn't use LZ4_FRAME
Aha, I remember that parquet-mr implemented LZ4 first, and Arrow implemented a different version (LZ4_FRAME). LZ4 stores an extra length here.
Maybe apache/parquet-format#168 helps
And I don't think they differ much...
I haven't added LZ4 for now, but feel free to add it if necessary.
+1
So we could have added LZ4 and Snappy here. @mapleFU Would you like to do that as a followup PR?
Let me rush it :-) (Just curious, is it related to #38389?)
It's just reasonable to benchmark all available codecs, not a subset of them.
…ompression (apache#38067)
### Rationale for this change
Currently, we will enable compression benchmark with ARROW_WITH_BENCHMARKS_REFERENCE
Note that it only has benchmark for compressor ( make by Codec::MakeCompressor() ) and decompressor ( make by Codec::MakeDecompressor ). However, Parquet uses Codec to encode and decode. So, I'd like to add benchmarks that use Codec directly.
### What changes are included in this PR?
Add benchmark for direct compression and decompression
### Are these changes tested?
no need
### Are there any user-facing changes?
no
* Closes: apache#38042
Authored-by: mwish <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit 3be5e60. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 1 possible false positive for unstable benchmarks that are known to sometimes produce them.
Rationale for this change
Currently, the compression benchmarks are enabled with ARROW_WITH_BENCHMARKS_REFERENCE.
Note that they only cover the compressor (made by Codec::MakeCompressor()) and the decompressor (made by Codec::MakeDecompressor()). However, Parquet uses Codec itself to encode and decode. So, I'd like to add benchmarks that use Codec directly.
What changes are included in this PR?
Add benchmarks for direct (non-stream) compression and decompression.
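For readers unfamiliar with the distinction, here is a minimal, self-contained sketch of what a one-shot (non-stream) compression benchmark loop might look like. It is not the code added in this PR and does not use Arrow or Google Benchmark: ToyRleCodec, OneShotResult and BenchmarkOneShot are hypothetical names invented for illustration, and the run-length encoding is only a stand-in for a real codec. The shape it tries to show is: size the output buffer from a worst-case bound, compress the whole input in a single call per iteration, then decompress once to check the round trip.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a one-shot codec. This is NOT Arrow's Codec class;
// it is a byte-level run-length encoder that emits (count, value) pairs.
struct ToyRleCodec {
  // Worst case: every input byte becomes its own (count, value) pair.
  static int64_t MaxCompressedLen(int64_t input_len) { return 2 * input_len; }

  // Compresses the whole input in one call; returns the number of bytes written.
  static int64_t Compress(const uint8_t* in, int64_t in_len, uint8_t* out) {
    int64_t o = 0;
    for (int64_t i = 0; i < in_len;) {
      int64_t run = 1;
      while (i + run < in_len && in[i + run] == in[i] && run < 255) ++run;
      out[o++] = static_cast<uint8_t>(run);
      out[o++] = in[i];
      i += run;
    }
    return o;
  }

  // Decompresses the whole input in one call; returns the number of bytes written.
  static int64_t Decompress(const uint8_t* in, int64_t in_len, uint8_t* out) {
    int64_t o = 0;
    for (int64_t i = 0; i + 1 < in_len; i += 2) {
      for (int k = 0; k < in[i]; ++k) out[o++] = in[i + 1];
    }
    return o;
  }
};

// Hypothetical result record for the sketch.
struct OneShotResult {
  int64_t compressed_len;
  double compress_bytes_per_sec;
  bool round_trip_ok;
};

// Times `iterations` one-shot Compress calls, then verifies a single round trip.
OneShotResult BenchmarkOneShot(const std::vector<uint8_t>& data, int iterations) {
  const int64_t n = static_cast<int64_t>(data.size());
  std::vector<uint8_t> compressed(ToyRleCodec::MaxCompressedLen(n));
  int64_t compressed_len = 0;

  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    compressed_len = ToyRleCodec::Compress(data.data(), n, compressed.data());
  }
  const std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;

  std::vector<uint8_t> restored(data.size());
  const int64_t restored_len =
      ToyRleCodec::Decompress(compressed.data(), compressed_len, restored.data());

  const double seconds = elapsed.count();
  const double rate =
      seconds > 0.0 ? static_cast<double>(n) * iterations / seconds : 0.0;
  return {compressed_len, rate, restored_len == n && restored == data};
}
```

Calling BenchmarkOneShot(data, iterations) on a buffer of your choice returns the compressed size, a rough throughput figure, and whether the round trip succeeded. A streaming benchmark would instead feed the input to a compressor object in chunks and flush at the end, which is the path the existing benchmarks already exercise.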
Are these changes tested?
no need
Are there any user-facing changes?
no
Codec Compression/Decompression cases #38042