-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-43254: [C++] Always prefer mimalloc to jemalloc #40875
Conversation
@ursabot please benchmark |
Benchmark runs are scheduled for commit 107f99d. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
This comment was marked as outdated.
This comment was marked as outdated.
Ok, it's not obvious if this actually changed anything in the benchmarking setup, because the benchmark numbers don't seem to show any relevant change in performance (it could also be that our benchmarks are not that sensitive to memory allocation details). cc @austin3dickey FYI |
I'm not exactly sure either. Before my recent changes, we had set ARROW_MIMALLOC=ON and ARROW_JEMALLOC=OFF in the environment: arrow/dev/conbench_envs/benchmarks.env Line 30 in dbedcfc
But we weren't necessarily picking up the environment variables during the archery build. I think both might have been OFF during cmake because that's what archery did by default. Not sure what happens then. I recently made a series of PRs so we know we're picking up those environment variables for the build we're using for benchmarks:
That last PR has yet to be merged but I'm running the final benchmark tests on it now. Those should definitely be running with mimalloc, not jemalloc. |
Benchmark numbers of jemalloc vs. mimalloc should appear in #41205 (comment) |
@github-actions crossbow submit -g cpp |
This comment was marked as outdated.
This comment was marked as outdated.
The benchmarking experiment in #41205 shows many regressions when switching the default allocator from mimalloc to jemalloc in the benchmarking setup. This supports the idea of always using mimalloc as the default allocator, regardless of operating system. |
@github-actions crossbow submit -g cpp |
Revision: cca0edf Submitted crossbow builds: ursacomputing/crossbow @ actions-f6738f965b |
|
@github-actions crossbow submit -g python -g wheel -g linux |
Revision: cca0edf Submitted crossbow builds: ursacomputing/crossbow @ actions-573bff6e17 |
@kou These errors on Redhat-like builds do not seem relevant?
Edit: these errors might be caused by Edit again: this doesn't seem to have fixed the issue. |
Hmm. I haven't seen the error and it's not reproduced on local. |
@github-actions crossbow submit almalinux-*-amd64 |
This comment was marked as outdated.
This comment was marked as outdated.
@github-actions crossbow submit almalinux-*-amd64 |
This comment was marked as outdated.
This comment was marked as outdated.
I got it. Our RPM build script adds a changelog entry based on the latest commit date. The latest changelog entry on main is:
In the RPM build job prepend a new changelog entry with
They aren't sorted. So the error is happen. I've added a commit that we always use "now" in the RPM build job. |
@github-actions crossbow submit almalinux-* amazon-linux-* centos-* |
Revision: 9483626 Submitted crossbow builds: ursacomputing/crossbow @ actions-d4d35d1386 |
Thanks for the fix @kou ! Could you review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Oh, sorry. I forgot to add a review comment...!
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 36fe1da. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Rationale for this change
As discussed on the mailing-list, this PR switches the default memory pool to mimalloc for all platforms. This should have several desirable effects:
Are these changes tested?
Yes, by existing CI configurations.
Are there any user-facing changes?
Behavior should not change. Performance characteristics of some user workloads might improve or regress, but this is something we cannot predict in advance.