Better long and float column compression #4080
A good time-series compression algorithm is also described here: http://www.vldb.org/pvldb/vol8/p1816-teller.pdf (the Gorilla paper)
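For reference, here is a minimal sketch of the core XOR idea from that paper: consecutive double values are XORed bitwise, producing mostly-zero words that the full algorithm then encodes compactly using leading/trailing-zero counts (not shown here). This is just an illustration of the technique, not a full Gorilla implementation.

```java
public final class GorillaXor {
    public static long[] xorDeltas(double[] values) {
        long[] out = new long[values.length];
        long prev = 0;
        for (int i = 0; i < values.length; i++) {
            long bits = Double.doubleToLongBits(values[i]);
            out[i] = bits ^ prev;   // first value is stored verbatim (prev == 0)
            prev = bits;
        }
        return out;
    }
}
```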
I ran some benchmarks on a Druid column format based on JavaFastPFOR today and got the following sizes and timings. The dataset was ints with randomness uniformly distributed over some number of bits, and the other bits left as zero, emulating string columns with varying cardinalities (a sketch of generating such data is at the end of this comment). I did three scans, at 10%, 95%, and 100% selectivity. Some notes:
Follow-ups needed:
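A hypothetical helper matching the dataset description above, for illustration only (this is not the actual benchmark harness): ints whose low `bits` bits are uniformly random and whose high bits are zero, emulating dictionary codes for string columns of varying cardinality.

```java
import java.util.Random;

public final class BenchData {
    public static int[] makeColumn(int rows, int bits, long seed) {
        Random rng = new Random(seed);
        int mask = bits >= 32 ? -1 : (1 << bits) - 1;  // keep only the low `bits` bits
        int[] column = new int[rows];
        for (int i = 0; i < rows; i++) {
            column[i] = rng.nextInt() & mask;
        }
        return column;
    }
}
```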
/cc @lemire (as FYI; not that I am demanding anything of you 😄)
Btw, we already implemented better long (int64) compression about a year ago, although it is still off by default. It uses fixed-length bit packing, so it supports relatively fast random access. Maybe it's time to turn it on by default and disable LZ4 by default. Some sizes and timings are in https://imply.io/post/2016/12/07/compressing-longs.html. (There, we tested more distributions and found that for uniform distributions, bit packing of deltas compresses smaller than LZ4, but for power-law distributions LZ4 is smaller. In all cases, bit packing is fastest, since the old implementation of int64 columns didn't even do byte packing.)
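For illustration, a minimal sketch of delta-from-base encoding with fixed-width bit packing, which is what preserves random access. This is not the actual implementation linked above; the block layout and helper names are hypothetical. The caller would pick `base` as the block minimum and `bitWidth` as `64 - Long.numberOfLeadingZeros(max - base)`.

```java
public final class DeltaBitPack {
    /** Packs values as deltas from `base`, each occupying exactly bitWidth bits. */
    public static long[] pack(long[] values, long base, int bitWidth) {
        long[] packed = new long[(values.length * bitWidth + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long delta = values[i] - base;        // assumed to fit in bitWidth bits
            int bitPos = i * bitWidth;
            int word = bitPos >>> 6;
            int shift = bitPos & 63;
            packed[word] |= delta << shift;
            if (shift + bitWidth > 64) {          // value straddles a word boundary
                packed[word + 1] |= delta >>> (64 - shift);
            }
        }
        return packed;
    }

    /** Random access: read the i-th value without unpacking the whole block. */
    public static long get(long[] packed, long base, int bitWidth, int i) {
        int bitPos = i * bitWidth;
        int word = bitPos >>> 6;
        int shift = bitPos & 63;
        long bits = packed[word] >>> shift;
        if (shift + bitWidth > 64) {
            bits |= packed[word + 1] << (64 - shift);
        }
        long mask = bitWidth == 64 ? -1L : (1L << bitWidth) - 1;
        return base + (bits & mask);
    }
}
```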
I'm available and interested in helping out. Note: This could end up generating an interesting paper if you guys care about documenting the work. I'd love to help make this happen. email: [email protected]
@lemire Thanks for the offer. At this point I have one main question: is your JavaFastPFOR library still the best freely-licensed, small-ish-integer compression library for Java that you are aware of? (I'm looking to compress the offset section of string columns; in Druid we expect these to be positive ints with the max value being the cardinality of the column -- which is never more than a few million and is often less.)

Another note: currently Druid can access values in int columns randomly, since we pack into a fixed number of bytes per value (whatever is needed based on the max value for the overall column). Random access isn't supported by JavaFastPFOR, and I suspect that contributes to its relative slowness on the 10% selectivity test. I would like to run some tests with fixed-length bit packing (which could support random access, or at least much smaller blocks) to see how it fares at various selectivities vs. FastPFOR.
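For context, a minimal sketch of the fixed-bytes-per-value random access described above. The byte order and exact layout here are assumptions for illustration, not Druid's actual on-disk format.

```java
import java.nio.ByteBuffer;

public final class ByteWidthColumn {
    /** Reads the i-th value; each value occupies numBytes bytes starting at i * numBytes. */
    public static int get(ByteBuffer column, int numBytes, int i) {
        int offset = i * numBytes;
        int value = 0;
        for (int b = 0; b < numBytes; b++) {
            value |= (column.get(offset + b) & 0xFF) << (8 * b);  // little-endian assumption
        }
        return value;
    }
}
```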
I edited the original comment to include BinaryPacking too (the paper says it's fixed-width bit packing). It looks similar to FastPFOR at the current block sizes I'm testing.
I'm not aware of anything else that is as good... ;-) But I did not join the conversation to promote JavaFastPFOR specifically. We wrote JavaFastPFOR so that people could have access to a collection of tricks that work well. But what the Parquet folks did was to grab whatever code they needed from JavaFastPFOR and adapt it... and I am totally fine with that!
Ah. Ah. It is true that we have not implemented random access in JavaFastPFOR, but we did in Upscaledb (also open source, but not in Java): Daniel Lemire, Christoph Rupp, "Upscaledb: Efficient Integer-Key Compression in a Key-Value Store using SIMD Instructions", Information Systems, Volume 66, June 2017, Pages 13-23. https://lemire.me/en/publication/arxiv1611.05428/

You'll say: "Wait, Daniel! We don't care about SIMD instructions." And I'll say: look again. In that paper, the best performance was with SIMD-based binary packing, but we got very good speed with straight binary packing, which ought to be quite fast (even in Java). We have C++ code there that does things like fast select... Doing the equivalent in Java is not hard. By "not hard", I mean that if there is a potential market for it, I'll volunteer to contribute code. And be happy to do so. (And that means producing open source stuff, naturally.)
I favor binary packing, given a choice, because it is really fun to engineer: you can easily write special-purpose functions that act directly on the compressed data without uncompressing it. See the paper above.
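A toy example of the kind of special-purpose function meant here, assuming the same hypothetical packed layout as the earlier sketch: a filter that runs directly over the bit-packed words without materializing a decoded array.

```java
public final class PackedScan {
    public static int countEquals(long[] packed, int bitWidth, int count, long target) {
        long mask = bitWidth == 64 ? -1L : (1L << bitWidth) - 1;
        int matches = 0;
        for (int i = 0; i < count; i++) {
            int bitPos = i * bitWidth;
            int word = bitPos >>> 6;
            int shift = bitPos & 63;
            long bits = packed[word] >>> shift;
            if (shift + bitWidth > 64) {          // value straddles a word boundary
                bits |= packed[word + 1] << (64 - shift);
            }
            if ((bits & mask) == target) {
                matches++;
            }
        }
        return matches;
    }
}
```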
In case you don't know about Upscaledb: https://upscaledb.com cc @cruppstahl |
@gianm I am also wondering whether we are taking advantage of the fact that the array of ints is sorted, with known min and max; I guess that should help.
@b-slim Binary packing definitely takes advantage of the fact that the mins and maxes are known. I didn't pick encodings that take advantage of sortedness, however, because in general we can only expect at most one dimension's values to be sorted (the first one, and only if segmentGranularity = queryGranularity). In some cases this does happen and would be a useful optimization, but I wasn't worrying about it at this point.
This issue has been marked as stale due to 280 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.
Again... no interest in this? The potential is huge.
This issue is no longer marked as stale. |
@lemire I am still very interested in pushing forward on this, and I agree with you on the implications. I've been stalling on making any progress on this issue or with my experiments until #6794 is merged, mainly because vectorization of the rest of the query processing engine changes the dynamics a bit at the compression layer, and I want to make sure I am also targeting that optimally, since I think it's probably the future of the query engine. But as soon as it goes in, I definitely plan to pick this back up!
Ok. Feel free to get in touch with me when you get to it, if you want. |
@lemire @clintropolis - I'm interested :) and #6794 is now merged, too. Is there still an appetite to look into this?
@CmdrKeen I can't lead this, but I sure could help with pointers and reviews. |
Hi @CmdrKeen, #6016 captures much of my experiments and findings so far. For integer compression of dictionary-encoded string columns, FastPFOR is a very solid improvement in many cases, but it was not always better across the board for some value distributions. However, the strategy taken in #6016, which uses it as part of an approach that attempts to find the "best" encoding for a given set of values at ingestion time, I feel has been at least proven viable in that PR.

The main stalling point I ran into was when I wired up the C++ implementation of FastPFOR to Druid through JNI and found that the C++ and Java implementations are not actually compatible (and don't claim to be; I was just experimenting to see if I could make it go faster). The performance increase of the C++ version makes it worth using, but it is also necessary to have a fallback to pure Java for cases where the compiled native version is not available for a platform (a sketch of this fallback pattern is at the end of this comment). I also think the 'native' version of my branch in that PR is a bit cleaner of an implementation, since it deals in bytebuffer/memory locations rather than on-heap arrays like the branch associated with #6016 itself (see clintropolis/druid@shapeshift...clintropolis:shapeshift-native for the JNI branch differences).

I've been intending to pick this back up for... over a year now, but I just haven't been able to prioritize it enough to bring it to the finish line. I did "dust off" the branches and start testing stuff a few months ago, but I haven't pushed my changes to fix conflicts and add vectorization support after #6794, or really had the chance to make any material progress on finishing it.

I think the next step is creating a version of FastPFOR in pure Java that is compatible with the C++ version, and that preferably reads to/writes from little-endian buffers. In fast-pack/FastPFor#42, @lemire has suggested that making a compatible version is non-trivial but might not take more than a few days or a week or so. My problem has been finding a continuous chunk of time to devote to fully understanding the C++ implementation and then, of course, doing the compatible Java implementation. Much of the benchmarking will need to be repeated against the latest version of Druid, with a heavier focus on vectorized reads, though I was already benchmarking against #6794 while it was in progress and my branch was competitive at the time, so many of the measurements might still be close to true.

The other part that has given me some pause more recently is thinking about how potentially more of the low-level column reads of Druid could be pushed into native code, beyond just decompressing chunks of values, and what that might look like and how it might change the implementation of what has been done in the prototype thus far (if at all). I expect the strategy and physical column layout itself wouldn't really change, just that a lot more of the code implemented in Java right now would be pushed to native code, so this might not be worth worrying too much about.

Lastly, for some further thinking with regard to native processing, I think there are probably some unresolved questions about whether the new column format suggested in my PR (or any larger segment format changes) would be worth further modifying so that data in the segment is well aligned for vectorized CPU instructions. This might not be an issue worth considering at all; I mostly want to make sure, because segment format changes are among the riskiest changes to make. Once released, they must remain supported for backwards compatibility for very long cycles, so it feels very important to get any new stuff right.

Anyway, thanks for bringing this issue back to my attention and starting the discussion. Thinking about it in order to respond to your question has given me some motivation to... well, at minimum at least keep thinking about it a bit more, but maybe we can actually get this moving again 😅.
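A hypothetical sketch of the native-with-pure-Java-fallback pattern mentioned above; the codec classes and library name are illustrative, not the actual prototype code.

```java
interface IntCodec {
    String name();
    // compress/decompress methods omitted in this sketch
}

final class NativeFastPforCodec implements IntCodec {
    public String name() { return "fastpfor-native"; }
}

final class JavaFastPforCodec implements IntCodec {
    public String name() { return "fastpfor-java"; }
}

final class CodecLoader {
    /** Prefer the JNI-backed codec when the platform library loads; otherwise use pure Java. */
    static IntCodec load() {
        try {
            System.loadLibrary("fastpfor_jni");   // illustrative library name
            return new NativeFastPforCodec();
        } catch (UnsatisfiedLinkError e) {
            return new JavaFastPforCodec();       // pure-Java fallback
        }
    }
}
```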
For what it's worth, I appreciated the wall of text @clintropolis :) It helps provide insight! No pressure, just always excited by efficiency gains! 👍
This issue has been marked as stale due to 280 days of inactivity. |
This issue has been closed due to lack of activity. If you think that this issue is still relevant, please simply write any comment.
The idea was brought up here: #4027 (comment)
https://github.com/lemire/JavaFastPFOR