GH-34535: [C++] Move ChunkResolver to the public API #44357

Merged
merged 11 commits into from
Oct 21, 2024

Conversation

anjakefala
Collaborator

@anjakefala anjakefala commented Oct 9, 2024

Rationale for this change

Adopting #40226.

Creating and returning a shared_ptr incurs some overhead, which matters for performance-sensitive applications.

If callers could use ChunkResolver to look up the chunk index and the index within that chunk, they could then access the data directly instead.
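
To illustrate the idea of resolving a logical index into a chunk index plus an index within that chunk, here is a minimal, self-contained sketch. It is not Arrow's implementation: `SimpleChunkResolver`, `ChunkLocation`, and `ValueAt` are hypothetical names made up for this example, and chunks are modeled as plain `std::vector<int64_t>` rather than Arrow arrays. The real class lives in cpp/src/arrow/chunk_resolver.h and may differ in both interface and behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a (chunk index, index within chunk) pair.
struct ChunkLocation {
  int64_t chunk_index;
  int64_t index_in_chunk;
};

// Hypothetical resolver: maps a logical index over a sequence of chunks
// to the chunk that holds it, using cumulative offsets and binary search.
class SimpleChunkResolver {
 public:
  explicit SimpleChunkResolver(const std::vector<int64_t>& chunk_lengths) {
    offsets_.reserve(chunk_lengths.size() + 1);
    int64_t total = 0;
    offsets_.push_back(total);
    for (int64_t len : chunk_lengths) {
      assert(len >= 0);
      total += len;
      offsets_.push_back(total);
    }
  }

  int64_t logical_length() const { return offsets_.back(); }

  // Out-of-bounds indices resolve to {number of chunks, 0}
  // (a convention chosen for this sketch).
  ChunkLocation Resolve(int64_t index) const {
    const int64_t num_chunks = static_cast<int64_t>(offsets_.size()) - 1;
    if (index < 0 || index >= offsets_.back()) return {num_chunks, 0};
    // The first offset strictly greater than `index` ends the owning chunk,
    // so the owning chunk is the one just before it. Empty chunks are skipped.
    auto it = std::upper_bound(offsets_.begin(), offsets_.end(), index);
    const int64_t chunk = static_cast<int64_t>(it - offsets_.begin()) - 1;
    return {chunk, index - offsets_[chunk]};
  }

 private:
  std::vector<int64_t> offsets_;  // offsets_[i] = logical start of chunk i
};

// Direct access through a resolved location, with no per-element
// allocation or reference counting. Precondition: `i` is in bounds.
inline int64_t ValueAt(const std::vector<std::vector<int64_t>>& chunks,
                       const SimpleChunkResolver& resolver, int64_t i) {
  const ChunkLocation loc = resolver.Resolve(i);
  return chunks[loc.chunk_index][loc.index_in_chunk];
}
```

The point of the sketch is that once a location is known, the caller can read from the underlying buffers without materializing an intermediate object for each lookup.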

What changes are included in this PR?

  • Updates to documentation (thanks to @SChakravorti21 )
  • Moving ChunkResolver to public API, and updating all references to it in the code

Are these changes tested?

The existing tests already seem comprehensive: https://github.com/apache/arrow/blob/main/cpp/src/arrow/chunked_array_test.cc#L324 If an edge case is missing, I'd be happy to add it.

Are there any user-facing changes?

ChunkResolver and TypedChunkLocation are now in the public API.

@anjakefala anjakefala marked this pull request as draft October 9, 2024 21:22
@anjakefala
Collaborator Author

I'm aware of the build failures and actively resolving them!

@anjakefala
Collaborator Author

For the failing R builds, this conversation needs to be resolved: #43623

@anjakefala anjakefala force-pushed the chunkresolver branch 2 times, most recently from 375ef57 to 994ffc8 Compare October 10, 2024 18:01
@anjakefala anjakefala marked this pull request as ready for review October 11, 2024 00:22
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Oct 12, 2024
@felipecrv
Contributor

felipecrv commented Oct 12, 2024

Yes, I agree this API should be made public. I've added tests and benchmarks recently [1].

[1] #43954

@github-actions github-actions bot removed the awaiting changes Awaiting changes label Oct 15, 2024
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting change review Awaiting change review labels Oct 17, 2024
@anjakefala
Collaborator Author

anjakefala commented Oct 18, 2024

Please add a Co-authored-by: SChakravorti21 <[email protected]> in the squashed commit. I tried to add it, but I seem to have made a typo, and it didn't register.

(Edit: Turns out PRs are merged via a script, and not a manual squash.)

@anjakefala
Collaborator Author

I amended @SChakravorti21 as an author in a recent commit, so that the merge script will pick them up as a co-author.

…_job

Guard ChunkResolver use in altrep.cpp
@anjakefala anjakefala requested a review from amoeba October 18, 2024 16:51
Co-authored-by: Jacob Wujciak-Jens <[email protected]>
@anjakefala
Collaborator Author

anjakefala commented Oct 21, 2024

@assignUser The Java failure seems to be a network error! Is there a chance of merging this for the release?

@assignUser
Member

@anjakefala I'll re-run the job. This was not marked as a blocker and with rc0 already cut I will defer to @raulcd if we can add this to rc1 (if it will happen).

@assignUser assignUser requested a review from lidavidm October 21, 2024 22:11
@assignUser
Member

There seem to be some actual Java test errors now. I don't think they are related, but I am not ☕ enough to judge that.
https://github.com/apache/arrow/actions/runs/11445780195/job/31852155160?pr=44357#step:7:10025

@felipecrv
Contributor

Is anyone against me merging this one?

@felipecrv felipecrv merged commit 8208774 into apache:main Oct 21, 2024
41 of 42 checks passed
@felipecrv felipecrv removed the awaiting merge Awaiting merge label Oct 21, 2024
@felipecrv
Contributor

@anjakefala fix version was set to 19.0.0. You might want to change it on the issue to 18.0.0.

@anjakefala anjakefala deleted the chunkresolver branch October 22, 2024 04:36

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit 8208774.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 15 possible false positives for unstable benchmarks that are known to sometimes produce them.

amoeba added a commit that referenced this pull request Nov 11, 2024
* GitHub Issue: #34535

Lead-authored-by: Anja Kefala <[email protected]>
Co-authored-by: anjakefala <[email protected]>
Co-authored-by: Bryce Mecum <[email protected]>
Co-authored-by: SChakravorti21 <[email protected]>
Co-authored-by: SChakravorti21<[email protected]>
Co-authored-by: Jacob Wujciak-Jens <[email protected]>
Signed-off-by: Felipe Oliveira Carvalho <[email protected]>