WIP: Dummy PR to check maint-17.0.0 status #43113
…43093) ### Rationale for this change The newer version of LLVM on AlmaLinux 8 fails on the pyarrow.gandiva tests ### What changes are included in this PR? Temporarily remove Gandiva on Python checks for AlmaLinux 8. ### Are these changes tested? Via CI ### Are there any user-facing changes? No * GitHub Issue: #43059 Authored-by: Raúl Cumplido <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
or
MINOR: [${COMPONENT}] ${SUMMARY}
In the case of PARQUET issues on JIRA the title also supports:
PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}
This comment was marked as outdated.
…2199) ### Rationale for this change Ensuring that creating IPC payloads works correctly for non-CPU data by utilizing `CopyBufferSliceToCPU`. ### What changes are included in this PR? Adding calls to `CopyBufferSliceToCPU` to the IPC writer for base binary types and for list types, to avoid calls to `value_offset` in those cases. ### Are these changes tested? Yes. Tests are added to cuda_test.cc ### Are there any user-facing changes? No. * GitHub Issue: #42198
…43127) ### Rationale for this change We can't use http://mirrorlist.centos.org because CentOS 7 reached EOL. ### What changes are included in this PR? Use https://vault.centos.org/ instead. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: #43122 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
…e been deprecated (#43121) ### Rationale for this change Jobs are failing to find mirrorlist.centos.org ### What changes are included in this PR? Updating repos based on solution from: #43119 (comment) ### Are these changes tested? Via archery ### Are there any user-facing changes? No * GitHub Issue: #43119 Lead-authored-by: Raúl Cumplido <[email protected]> Co-authored-by: Sutou Kouhei <[email protected]> Co-authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
… segfault (#43071) ### Rationale for this change See #43070 ### What changes are included in this PR? Checks that the ciphertext length is at least enough to hold the length (if written), nonce and GCM tag for the GCM cipher type. Also enforces that the input ciphertext length parameter is provided (is > 0) and verifies that the ciphertext size read from the file isn't going to cause reads beyond the end of the ciphertext buffer. ### Are these changes tested? Yes I've added new unit tests for this. ### Are there any user-facing changes? No * GitHub Issue: #43070 Authored-by: Adam Reeve <[email protected]> Signed-off-by: mwish <[email protected]>
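The length checks described in this commit can be sketched roughly as follows. This is a minimal illustration in Python, not the Arrow C++ code: the function names are hypothetical, and the 4-byte little-endian length prefix, 12-byte nonce and 16-byte GCM tag sizes are assumptions of the sketch rather than values stated in the commit message.

```python
import struct

# Sizes assumed for this sketch (not taken from the commit message).
LENGTH_PREFIX_BYTES = 4   # little-endian length written before the nonce
NONCE_BYTES = 12
GCM_TAG_BYTES = 16


def framed_length(buf: bytes, ciphertext_len: int, length_prefixed: bool = True) -> int:
    """Return how many bytes of buf belong to the ciphertext frame.

    Raises ValueError instead of reading past the end of the buffer.
    """
    # The caller must provide the ciphertext length.
    if ciphertext_len <= 0:
        raise ValueError("ciphertext length must be provided (> 0)")
    if ciphertext_len > len(buf):
        raise ValueError("ciphertext length exceeds the buffer size")

    # The frame must at least hold the length (if written), nonce and tag.
    minimum = NONCE_BYTES + GCM_TAG_BYTES
    if length_prefixed:
        minimum += LENGTH_PREFIX_BYTES
    if ciphertext_len < minimum:
        raise ValueError("ciphertext too short for length, nonce and GCM tag")

    if not length_prefixed:
        return ciphertext_len

    # The size read from the data must not point beyond the buffer.
    (written_len,) = struct.unpack_from("<I", buf, 0)
    total = LENGTH_PREFIX_BYTES + written_len
    if total > ciphertext_len:
        raise ValueError("written ciphertext size overruns the buffer")
    return total


def is_well_formed(buf: bytes, ciphertext_len: int, length_prefixed: bool = True) -> bool:
    """Boolean wrapper around framed_length, convenient for quick checks."""
    try:
        framed_length(buf, ciphertext_len, length_prefixed)
    except ValueError:
        return False
    return True
```

A frame whose length prefix claims more bytes than the buffer holds is rejected up front, which is the kind of input that would otherwise lead to an out-of-bounds read.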
### Rationale for this change `google_cloud_cpp_mocks` depends on `GTest::gmock_main` but it's built without `BUILD_TESTING`. google-cloud-cpp finds GoogleTest only with `BUILD_TESTING`. ### What changes are included in this PR? The recent google-cloud-cpp doesn't build `google_cloud_cpp_mocks` without `BUILD_TESTING`. Note that we can't use 2.23.0 or later because they can't be built with MinGW-w64. See also: * mingw-w64/mingw-w64#49 * googleapis/google-cloud-cpp#14436 ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * GitHub Issue: #43134 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
… large memory test (#43128) ### Rationale for this change This test consumes more than 4GB memory and causes oom-kill when running with TSAN as reported in #43116 . ### What changes are included in this PR? Limit its running by marking it as large memory test. ### Are these changes tested? Change is test. ### Are there any user-facing changes? None. * GitHub Issue: #43116 Authored-by: Ruoxi Sun <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
pyarrow knows about ARROW_ENABLE_THREADING and doesn't use threads if they are not enabled in libarrow. Split from #37696 * GitHub Issue: #41910 Lead-authored-by: Joe Marshall <[email protected]> Co-authored-by: Joris Van den Bossche <[email protected]> Co-authored-by: Raúl Cumplido <[email protected]> Co-authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
… Stream 8 (#43159) ### Rationale for this change Because json-devel on them don't provide nlohmann/json_fwd.h that is required by google-cloud-cpp. The upstream issue: googleapis/google-cloud-cpp#14438 ### What changes are included in this PR? Use bundled nlohmann/json instead. ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: #43158 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change This also has a workaround for https://issues.apache.org/jira/browse/ORC-1732 . ### What changes are included in this PR? ORC 2.0.1 has a dependency detection problem. We can't override the detection with ExternalProject but can override the detection with FetchContent. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * GitHub Issue: #42149 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
Revision: 12be569 Submitted crossbow builds: ursacomputing/crossbow @ maint-17.0.0-nightly-tests-2
Revision: 12be569 Submitted crossbow builds: ursacomputing/crossbow @ maint-17.0.0-nightly-packaging-2
Revision: 12be569 Submitted crossbow builds: ursacomputing/crossbow @ maint-17.0.0-nightly-release-1
… should not include the release candidate number in the name of the tarball's top-level directory. (#43200) ### Rationale for this change `dev/release/util-create-release-tarball.sh` should not include the release candidate number in the name of the tarball's top-level directory. If the release candidate number is included, the binaries and the release verification tasks fail because the tarball entries have an unexpected folder hierarchy. See #43188 (comment). ### What changes are included in this PR? 1. Modified `dev/release/util-create-release-tarball.sh` to not include the release candidate number in the name of the source directory from which the release tarball is created. ### Are these changes tested? Manually verified this change fixes the bug: ```bash $ dev/release/utils-create-release-tarball.sh 17.0.0 1 $ tar zxvf apache-arrow-17.0.0.tar.gz ... $ ls apache-arrow-17.0.0/ apache-arrow-17.0.0.tar.gz ``` ### Are there any user-facing changes? No * GitHub Issue: #43199 Authored-by: Sarah Gilmore <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
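The directory layout requirement from this commit can be checked with a short script. The following is only a sketch in Python: the helper names are hypothetical, the tarballs are built in memory for illustration, and the `-rc1` suffix used for the failing case is an assumed naming pattern, not one taken from the release scripts.

```python
import io
import tarfile


def expected_root(version: str) -> str:
    """Top-level directory name without any release candidate number."""
    return f"apache-arrow-{version}"


def make_tarball(root: str, files: dict) -> bytes:
    """Build a small gzip-compressed tarball in memory under the given root."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for rel_path, data in files.items():
            info = tarfile.TarInfo(name=f"{root}/{rel_path}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def top_level_entries(tar_bytes: bytes) -> set:
    """Return the set of first path components found in the tarball."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as tar:
        return {name.split("/", 1)[0] for name in tar.getnames()}


def has_expected_layout(tar_bytes: bytes, version: str) -> bool:
    """True when everything sits under apache-arrow-<version>/ and nowhere else."""
    return top_level_entries(tar_bytes) == {expected_root(version)}
```

For example, a tarball rooted at `apache-arrow-17.0.0` passes `has_expected_layout(..., "17.0.0")`, while one rooted at a directory that carries a release candidate suffix does not.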
…42003)

### Rationale for this change

This PR is complementary to #41638. The prior PR reduces reallocations in `PooledBufferWriter`. However, the problematic formula it addressed is still used in other functions. In addition, `(*PooledBufferWriter).Reserve()` simply doubles the capacity of buffers regardless of its argument `nbytes`. This may result in excessive allocations in some cases.

### What changes are included in this PR?

- Applied the fixed formula to `(*BufferWriter).Reserve()`.
- Updated the new capacity passed to `(*memory.Buffer).Reserve()`.
- Now using `bitutil.NextPowerOf2(b.pos + nbytes)` to avoid reallocations when adding `nbytes`.
- Replaced `math.Max` with `utils.Max` in `(*bufferWriteSeeker).Reserve()` to avoid unnecessary type conversions.

### Are these changes tested?

Yes. The following commands pass.

```
$ export PARQUET_TEST_DATA=$PWD/cpp/submodules/parquet-testing/data
$ (cd go && go test ./...)
```

### Are there any user-facing changes?

No, but it may reduce the number of allocations and improve the throughput.

Before:

```
$ go test -test.run='^$' -test.bench='^BenchmarkWriteColumn$' -benchmem ./parquet/pqarrow/...
goos: linux
goarch: arm64
pkg: github.com/apache/arrow/go/v17/parquet/pqarrow
BenchmarkWriteColumn/int32_not_nullable-10      1190   1016705 ns/op  4125.39 MB/s   5443579 B/op   240 allocs/op
BenchmarkWriteColumn/int32_nullable-10            52  24780561 ns/op   169.26 MB/s  12048944 B/op   249 allocs/op
BenchmarkWriteColumn/int64_not_nullable-10       632   1717090 ns/op  4885.36 MB/s   5445954 B/op   265 allocs/op
BenchmarkWriteColumn/int64_nullable-10            51  22949770 ns/op   365.52 MB/s  12209860 B/op   262 allocs/op
BenchmarkWriteColumn/float32_not_nullable-10     519   2234718 ns/op  1876.88 MB/s   5452627 B/op  1263 allocs/op
BenchmarkWriteColumn/float32_nullable-10          56  23423793 ns/op   179.06 MB/s  12057540 B/op  1272 allocs/op
BenchmarkWriteColumn/float64_not_nullable-10     416   2761247 ns/op  3037.98 MB/s   5507068 B/op  1292 allocs/op
BenchmarkWriteColumn/float64_nullable-10          51  25767881 ns/op   325.55 MB/s  12059614 B/op  1285 allocs/op
PASS
ok  github.com/apache/arrow/go/v17/parquet/pqarrow  10.592s
```

After:

```
$ go test -test.run='^$' -test.bench='^BenchmarkWriteColumn$' -benchmem ./parquet/pqarrow/...
goos: linux
goarch: arm64
pkg: github.com/apache/arrow/go/v17/parquet/pqarrow
BenchmarkWriteColumn/int32_not_nullable-10      1196    959528 ns/op  4371.22 MB/s   5420349 B/op   238 allocs/op
BenchmarkWriteColumn/int32_nullable-10            51  23017598 ns/op   182.22 MB/s  14138480 B/op   248 allocs/op
BenchmarkWriteColumn/int64_not_nullable-10       690   1671710 ns/op  5017.98 MB/s   5419878 B/op   263 allocs/op
BenchmarkWriteColumn/int64_nullable-10            50  23196051 ns/op   361.64 MB/s  13728465 B/op   261 allocs/op
BenchmarkWriteColumn/float32_not_nullable-10     540   2185075 ns/op  1919.52 MB/s   5459392 B/op  1261 allocs/op
BenchmarkWriteColumn/float32_nullable-10          54  21796783 ns/op   192.43 MB/s  14150622 B/op  1271 allocs/op
BenchmarkWriteColumn/float64_not_nullable-10     418   2708292 ns/op  3097.38 MB/s   5455095 B/op  1290 allocs/op
BenchmarkWriteColumn/float64_nullable-10          51  22174952 ns/op   378.29 MB/s  14142791 B/op  1283 allocs/op
PASS
ok  github.com/apache/arrow/go/v17/parquet/pqarrow  10.210s
```

* GitHub Issue: #41541
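To illustrate why sizing the reservation from `b.pos + nbytes` can save reallocations compared with plain doubling, here is a small Python model. It is a sketch only: the function names are hypothetical, the repeated-doubling loop is an illustrative model rather than the Go writer's actual control flow, and `next_power_of_2` merely mimics the rounding that `bitutil.NextPowerOf2` is described as providing.

```python
def next_power_of_2(n: int) -> int:
    """Round n up to the nearest power of two (returns 1 for n <= 1)."""
    if n <= 1:
        return 1
    return 1 << (n - 1).bit_length()


def sized_capacity(pos: int, nbytes: int) -> int:
    """Capacity chosen in one step from the bytes actually needed."""
    return next_power_of_2(pos + nbytes)


def doublings_needed(capacity: int, required: int) -> int:
    """Count how many times a buffer must double to reach the required size."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    steps = 0
    while capacity < required:
        capacity *= 2
        steps += 1
    return steps


# A 1 KiB buffer with 1000 bytes written that must accept 10000 more bytes.
capacity, pos, nbytes = 1024, 1000, 10000
print(doublings_needed(capacity, pos + nbytes))  # growth steps when only doubling
print(sized_capacity(pos, nbytes))               # single reservation target
```

In this model, doubling alone takes several growth steps to fit the write, whereas the sized reservation reaches a sufficient capacity in one step.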
…3208) ### Rationale for this change Currently our java-jars and some wheels jobs are failing due to downloading a wrong version of Apache Thrift based on the 0.20.0 branch instead of the tag. That branch contains a new commit that makes the sha validation to fail. ### What changes are included in this PR? Apply the Thrift patch that was applied on vcpkg here: microsoft/vcpkg#39787 ### Are these changes tested? Via archery ### Are there any user-facing changes? No * GitHub Issue: #43204 Authored-by: Raúl Cumplido <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
This was released a long time ago.
DO NOT MERGE.
This PR tracks some crossbow jobs to validate the status of the maintenance branch before creating the first RC.