Skip to content

Commit

Permalink
GH-34569: [C++] Diffing of Run-End Encoded arrays (#35003)
Browse files Browse the repository at this point in the history
### What changes are included in this PR?

Ability to diff run-end encoded arrays without expanding them first. This is useful for debugging and understanding of unit test failures.

Many successive random accesses to REE arrays by logical indices incur many successive binary searches on the run-ends array to find the physical index of the values array, so this PR implements a few interesting optimizations -- the most complicated one is encapsulated in a new utility available via `ree_util.h` as there are many potential uses for it other than diffing.

    +-------------+---------------------------+----------------------------------------------------+
    |             |                                  Optimizations                                 |
    +-------------+---------------------------+-----------------------+----------------------------+
    | # bsearches | Null checks on Comparator | RunLengthOfEqualsFrom | Cached FindPhysicalIndex() |
    +-------------+---------------------------+----------------------+-----------------------------+
    |    3798     |                           |                      |                             |
    +-------------+---------------------------+----------------------+-----------------------------+
    |    1914     |           X               |                      |                             |
    +-------------+---------------------------+----------------------+-----------------------------+
    |     180     |           X               |        X             |                             |
    +-------------+---------------------------+----------------------+-----------------------------+
    |     153     |           X               |                      |           X                 |
    +-------------+---------------------------+----------------------+-----------------------------+
    |      75     |           X               |        X             |           X                 |
    +-------------+---------------------------+----------------------+-----------------------------+
    
### Are these changes tested?

Yes. By existing and newly added REE-specific tests.

### Are there any user-facing changes?

No.

* Closes: #34569

Authored-by: Felipe Oliveira Carvalho <[email protected]>
Signed-off-by: Benjamin Kietzman <[email protected]>
  • Loading branch information
felipecrv authored Oct 26, 2023
1 parent f622139 commit fcbcb7d
Show file tree
Hide file tree
Showing 4 changed files with 491 additions and 59 deletions.
Loading

0 comments on commit fcbcb7d

Please sign in to comment.