Large binary sizes #115
Comments
I have investigated this using a mix of disassembly and binary analysis.

Methodology

Tools used:

I tested the rego-cpp implementation and my local fork of infix (mostly the same thing, but some code is added that may affect specific sizes of things). For how I used Ghidra, I imported each executable into CodeBrowser with default settings.

Rego analysis

Debug build

The debug build is about 10% code according to Ghidra, with most sections being a combination of typeinfo and debug metadata.

Release build

The largest function in rego-cpp is the

One symbol that doesn't look like a rule is called

Infix

The Infix debug build is much the same as Rego's, but on a smaller scale. The first and second-largest functions don't decompile, but the third-largest one does. By elimination, the only code that isn't in a lambda (lambdas are not transitively called here, just initialized) is

I have a theory that preventing inlining of that one constructor routine would cut binary size by 1-2 MB, at least on this version of my local compiler etc etc.

Thoughts

I have learned two things from doing this:
If you add
Minor update: I disproved my theory that the FastPattern construction in particular was responsible. I added the
I am wondering if it is some of my gratuitous use of T(Token... ts). There might be some careful trick to remove the template and use a dynamic size. You would want to force the inlining of the initial template bit, to turn it into a vector, but not the vector bit. I can try to prototype what I mean, if that would be helpful.
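A minimal sketch of the idea in this comment, not the actual Trieste code: a thin variadic wrapper is force-inlined so it only packs its arguments into a vector, and the real work lives in one non-template function that is never inlined. `Token`, `Pattern`, `make_pattern`, and the `SKETCH_*` macros are hypothetical names made up for illustration; only the function name `T` comes from the comment above.

```cpp
#include <string>
#include <utility>
#include <vector>

// Compiler-specific inlining controls (macro names are assumptions).
#if defined(_MSC_VER)
#  define SKETCH_FORCE_INLINE __forceinline
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Hypothetical stand-in for a token type.
struct Token
{
  std::string name;
};

// Hypothetical stand-in for the object built from a list of tokens.
struct Pattern
{
  std::vector<Token> tokens;
};

// The "vector bit": a single non-template function with a dynamic size.
// Marked noinline so its body is emitted once rather than at every call site.
SKETCH_NOINLINE Pattern make_pattern(std::vector<Token> tokens)
{
  Pattern p;
  p.tokens = std::move(tokens);
  return p;
}

// The "template bit": force-inlined so that no separate instantiation is
// emitted per argument count; it only turns the pack into a vector.
template<typename... Ts>
SKETCH_FORCE_INLINE Pattern T(const Ts&... ts)
{
  return make_pattern(std::vector<Token>{ts...});
}
```

Calling `T(a, b)` with two `Token` values builds a two-element vector at the call site and hands it to `make_pattern`. Whether this shrinks a real binary depends on how much vector construction code the compiler still emits per call site, so it would need measuring.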
You're welcome to if you have time, but for me a big nagging question is "well what is that repeated inlined code doing then, if it's not that constructor?". I'll set aside some time soon (give or take checking out some of the big conference thing for the next 3 days), because at this point I just want to know. Maybe it will be something surprising that affects decision making... or not at all, who knows.
Some off-channel discussion and experiments later, I've isolated what I think are the key factors making the binaries big. It seems to be the implicitly-defined lifecycle functions (copy constructors, move constructors, destructors, etc.) getting inlined. The tricky part is that these implicit functions usually do not appear in the source at all, so I went through some likely suspects, explicitly defaulted them, and marked them noinline (as well as some operators that just call into the stdlib to change a vector or map). Past this point, further noinline directives seem to have little effect (the last one I tried only cut 2 KB total). A lot of the remaining code size comes from libraries such as re2 and CLI11.
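For anyone picking this up later, here is a rough sketch of what "explicitly defaulted then marked noinline" can look like. This is not the actual patch: `SymbolTable`, its members, and the `SKETCH_NOINLINE` macro are hypothetical names. The lifecycle functions are declared in the class with a noinline attribute and defaulted out of line, so the compiler still generates their bodies but emits each one once and calls it.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Compiler-specific noinline marker (macro name is an assumption).
#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Hypothetical type whose implicit copy/move/destroy code is large
// because its member is a map of vectors.
class SymbolTable
{
public:
  SymbolTable() = default;

  // Lifecycle functions are declared here so they can carry the attribute.
  SKETCH_NOINLINE SymbolTable(const SymbolTable& other);
  SKETCH_NOINLINE SymbolTable(SymbolTable&& other) noexcept;
  SKETCH_NOINLINE SymbolTable& operator=(const SymbolTable& other);
  SKETCH_NOINLINE SymbolTable& operator=(SymbolTable&& other) noexcept;
  SKETCH_NOINLINE ~SymbolTable();

  // A small operation that just calls into the stdlib containers.
  SKETCH_NOINLINE void add(const std::string& name, int value);

  std::map<std::string, std::vector<int>> entries;
};

// Defaulted out of line: same behaviour as the implicit versions,
// but each body is emitted once instead of at every use.
SymbolTable::SymbolTable(const SymbolTable& other) = default;
SymbolTable::SymbolTable(SymbolTable&& other) noexcept = default;
SymbolTable& SymbolTable::operator=(const SymbolTable& other) = default;
SymbolTable& SymbolTable::operator=(SymbolTable&& other) noexcept = default;
SymbolTable::~SymbolTable() = default;

void SymbolTable::add(const std::string& name, int value)
{
  entries[name].push_back(value);
}
```

In a real project the out-of-line definitions would go in a single .cpp file rather than a header, otherwise they would be defined in every translation unit that includes it. Defaulting out of line also makes these functions user-provided, so the type is no longer trivially copyable or destructible even if its members are; for a type like this one, which holds containers, that changes nothing.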
I tried the patched Trieste on rego-cpp as well, and it saves 28% of the binary size, or barely under 2 MB. From a cursory look, the largest symbols in the shrunk binary are mostly lambdas and such with implementation code in them.
Binary size improvements related to #115
To update this more accurately, I think the lifecycle code (

Looking at the largest rule lambdas in a recent size report rego_release_nx_report.zip, I can still see a lot of repeating inlined code, possibly

It's not clear exactly how to deal with this, but if this issue gets passed on to someone else at some point, hopefully this means a bit less of it is all in my head.
Trieste executables are currently quite large, for example, on Ubuntu in Debug:
Even in Release, they are larger than may be warranted (in particular, for infix):