Improve the performance of "DictionaryValue" row encoding #4712
Labels
arrow
Changes to the arrow crate
arrow-flight
Changes to the arrow-flight crate
enhancement
Any new improvement worthy of an entry in the changelog
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We are considering using the "don't use dictionary interning" mode in DataFusion for high-cardinality columns: apache/datafusion#7200 (comment)
@tustvold mentioned this mode could be made faster
Describe the solution you'd like
Review and optimize `Codec::DictionaryValues`:
https://github.com/apache/arrow-rs/blob/b810e8f207bbc70294b01acba4be32153c18a6ab/arrow-row/src/lib.rs#L437C14-L437C14
Perhaps this could be made faster (I am not sure): arrow-rs/arrow-row/src/lib.rs, line 1417 in b810e8f
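For readers unfamiliar with this mode, here is a rough, std-only sketch of the general idea of encoding a dictionary column by its values instead of by interned keys: each distinct dictionary value is encoded once, and every row then copies the pre-encoded bytes for its key. This is not the arrow-row implementation. The names `encode_value` and `encode_dictionary_values` are hypothetical, and the byte layout (a marker byte, the string bytes, a 0 terminator) is assumed purely for illustration.

```rust
/// Hypothetical order-preserving encoding of one string value:
/// a 1 marker byte, the raw bytes, then a 0 terminator.
/// Assumes values contain no 0 bytes (illustrative only).
fn encode_value(value: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(value.len() + 2);
    out.push(1);
    out.extend_from_slice(value.as_bytes());
    out.push(0);
    out
}

/// Encode a dictionary column given as `keys` indexing into `values`.
/// A `None` key is a null and encodes as a single 0 byte, so nulls sort first.
fn encode_dictionary_values(keys: &[Option<usize>], values: &[&str]) -> Vec<Vec<u8>> {
    // Encode each distinct dictionary value exactly once.
    let encoded: Vec<Vec<u8>> = values.iter().map(|v| encode_value(v)).collect();
    // Each row is then a copy of the pre-encoded bytes for its key.
    keys.iter()
        .map(|k| match k {
            Some(i) => encoded[*i].clone(),
            None => vec![0],
        })
        .collect()
}
```

In this sketch the per-row cost is a lookup plus a byte copy, so any speedup would likely come from making that copy cheaper (for example, writing into one preallocated buffer rather than cloning a `Vec` per row). Whether the same applies to the real code path is an open question.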
Describe alternatives you've considered
There may not be any way to make this faster, but I wanted to file the ticket as a follow-on to the meeting.
Additional context