PGO + Bolt #224

Open
zamazan4ik opened this issue Dec 23, 2022 · 5 comments · May be fixed by #970
Labels
enhancement New feature or request help needed community help is welcome and appreciated

Comments

@zamazan4ik

Parseable currently does not support building with more advanced optimization techniques such as PGO (profile-guided optimization) and BOLT. These tools are seeing increasing adoption in the community as a way to optimize programs further, and there is a strong chance of gaining extra performance "for free" with them.

I suggest at least experimenting with an LTO ("fat" version) + PGO + BOLT pipeline (or any combination of them) and testing whether it improves the project's performance. If it does, it would be great to ship prebuilt binaries with these optimizations out of the box. It would also help users to be able to tune their own binaries for their own workloads through functionality integrated into the build scripts.

Since the project is quite small, I do not expect a significant improvement from PGO (at least not yet), but it is still a good option to try :)

There are also some caveats to consider:

  • Significantly increased build times
  • BOLT may still be unstable (or even broken) on some architectures

Links:

@nitisht
Member

nitisht commented Dec 23, 2022

Thank you for the issue, @zamazan4ik. We'll review the blog post; it looks interesting for sure, but as you said, at our current size it may not be the most useful.

@nitisht nitisht added enhancement New feature or request help needed community help is welcome and appreciated labels Dec 23, 2022
@nitisht
Member

nitisht commented Apr 20, 2023

Closing this for now because we don't have the cycles to explore it. We may get to it at a later stage.

@nitisht nitisht closed this as completed Apr 20, 2023
@zamazan4ik
Author

@nitisht I suggest you re-open the issue so that someone can start working on it in the future while it remains open.

@nitisht
Member

nitisht commented Aug 25, 2023

Thanks, @zamazan4ik. Reopening now.

@de-sh
Contributor

de-sh commented Oct 22, 2024

I did a preliminary analysis of the results with LTO, tracking only build times and binary size. Here are the results:

main
________________________________________________________
Executed in  291.94 secs    fish           external
   usr time   60.53 mins  474.00 micros   60.53 mins
   sys time    1.63 mins  724.00 micros    1.63 mins

size: 186822568

With Thin LTO
+ [profile.release]
+ lto = "thin"
+ codegen-units = 1
________________________________________________________
Executed in  522.91 secs    fish           external
   usr time   47.94 mins  704.00 micros   47.94 mins
   sys time    1.43 mins  219.00 micros    1.43 mins

size: 159024032
   
With Fat LTO
+ [profile.release]
+ lto = "fat"
+ codegen-units = 1
________________________________________________________
Executed in   18.26 mins    fish           external
   usr time   44.87 mins    1.13 millis   44.87 mins
   sys time    1.53 mins    4.44 millis    1.53 mins
   
size: 150544072

I believe thin LTO should be good enough for perf. Let's see if there's demand to improve things in the future and move accordingly.
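
To put the numbers above side by side, here is a small sketch that computes the binary size reduction and the build-time slowdown of each LTO mode relative to the main build. It assumes the `size` values are bytes and that the "Executed in" figure is the wall-clock build time; the names `BUILDS`, `size_reduction_pct`, `build_time_ratio`, and `summarize` are made up for this illustration and are not part of the project.

```python
# Figures copied from the measurements in this comment.
# Assumptions: "size" is in bytes, "secs" is the "Executed in" wall-clock time.
BUILDS = {
    "main": {"size": 186_822_568, "secs": 291.94},
    "thin": {"size": 159_024_032, "secs": 522.91},
    "fat": {"size": 150_544_072, "secs": 18.26 * 60},  # reported as 18.26 mins
}


def size_reduction_pct(baseline_size, candidate_size):
    """Percentage by which the candidate binary is smaller than the baseline."""
    return (1 - candidate_size / baseline_size) * 100


def build_time_ratio(baseline_secs, candidate_secs):
    """How many times longer the candidate build took than the baseline."""
    return candidate_secs / baseline_secs


def summarize(builds, baseline="main"):
    """Return (name, size reduction %, build-time ratio) for each non-baseline build."""
    base = builds[baseline]
    rows = []
    for name, data in builds.items():
        if name == baseline:
            continue
        rows.append(
            (
                name,
                size_reduction_pct(base["size"], data["size"]),
                build_time_ratio(base["secs"], data["secs"]),
            )
        )
    return rows


if __name__ == "__main__":
    for name, pct, ratio in summarize(BUILDS):
        print(f"{name}: {pct:.1f}% smaller binary, {ratio:.1f}x wall-clock build time")
```

With these inputs, thin LTO comes out at roughly a 14.9% smaller binary for about 1.8x the wall-clock build time, and fat LTO at roughly 19.4% smaller for about 3.8x. None of this measures runtime performance, which was not part of the analysis.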

As for PGO and LLVM BOLT optimizations, they might be too much, and anyone interested can still apply them on their own when compiling for themselves. Not sure if we should even guide them to do so!

@de-sh de-sh linked a pull request Oct 22, 2024 that will close this issue
3 tasks