Optimizations
CFLAGS control the behavior of the compiler during compilation of C code. Everything set in CFLAGS will be carried over to CXXFLAGS.
Disable control-flow integrity protection. CFI is a set of features designed to abort the program upon detecting certain forms of undefined behavior that could allow attackers to subvert the program’s control flow. CFI causes a slight performance overhead and increases code size.
Create sections for data and allow the linker to mark sections as not needed. Data sections marked as unneeded can be removed with -Wl,--gc-sections, which can drastically reduce code size.
WARNING: Clang-specific flag! Don't use it with GCC!
Use direct access relocations instead of the GOT to reference external data symbols. This is similar to how static code behaves. Gentoo compiles shared PIC/PIE code by default, and this flag helps minimize GOT/PLT indirection as much as possible.
Break C standards in favor of ricing. It allows the compiler to use math shortcuts at the expense of accuracy, so code performs fewer calculations and runs faster. Disable it for packages that require accurate math, such as many of the packages in the sci-libs category.
Yes, -Ofast implies -O3 -ffast-math, but you get more flexibility by using -ffast-math since you can define it with any optimization level, such as -O2.
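As a minimal sketch of that flexibility, a make.conf-style fragment might pair -ffast-math with -O2, and a per-package override could switch it back off for math-sensitive packages. The base flags and the idea of a separate override step are assumptions for illustration, not taken from the original configuration.

```shell
# Global default (sketch): -O2 plus fast math, instead of -Ofast.
CFLAGS="-O2 -ffast-math"
CXXFLAGS="${CFLAGS}"

# Hypothetical per-package override, e.g. for a sci-libs package:
# a later -fno-fast-math takes precedence over the earlier -ffast-math.
CFLAGS="${CFLAGS} -fno-fast-math"
CXXFLAGS="${CXXFLAGS} -fno-fast-math"

echo "CFLAGS=${CFLAGS}"
```

Appending the negative form rather than editing the global string keeps the override local to the packages that need accurate math.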
WARNING: Clang-specific flag! Don't use it with GCC!
Emit more virtual tables to improve devirtualization.
Form fused floating-point operations. Clang's default is on for most code, fast for CUDA code, and fast-honor-pragmas for HIP code. This flag changes Clang's -ffp-contract behavior to fast, which is the default behavior in GCC. The option is turned on by -ffast-math, but setting it separately keeps it in effect for packages where -ffast-math causes failures and has to be dropped.
Create sections for functions and allow the linker to mark sections as not needed. Function sections marked as unneeded can be removed with -Wl,--gc-sections, which can drastically reduce code size.
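Section splitting only pays off when the linker is also told to discard what is unused, so the compiler and linker flags are usually set together. The sketch below assumes -ffunction-sections and -fdata-sections are the compiler flags being described. Only -Wl,--gc-sections is named in the text, and the -O2 base is a placeholder.

```shell
# Compiler side: place each function and data object in its own section.
# (-ffunction-sections / -fdata-sections are assumed flag names.)
CFLAGS="-O2 -ffunction-sections -fdata-sections"
CXXFLAGS="${CFLAGS}"

# Linker side: drop sections that nothing references.
LDFLAGS="-Wl,--gc-sections"

echo "CFLAGS=${CFLAGS}"
echo "LDFLAGS=${LDFLAGS}"
```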
Optimize code at link-time rather than compile time. The benefit is the compiler can see everything at once and then make the best optimizations for the whole program. Theoretically, this builds smaller, more optimized programs but that is not always the case. Some programs end up slower or incorrectly compiled with LTO.
There are two methods for LTO: full and thin. Thin has better memory efficiency and allows the LTO phase to run in parallel for faster compilation. The downside is that you sacrifice whole-program visibility, and flags like -fno-semantic-interposition are more prone to failure with -flto=thin. Full mode is slower since it isn't parallel. It also requires more memory but has better visibility, since code isn't cut into pieces for parallel threads and merged back together. For that reason it works better with -fno-semantic-interposition, and -fvirtual-function-elimination can only be used with full mode. If you can manage the extra memory and don't mind slightly slower compilation, choose full for the extra visibility.
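One way to keep the choice between the two modes in a single place is a helper variable, sketched below. LTO_MODE is a hypothetical name introduced for illustration and the -O2 base is a placeholder. The same -flto value is passed to the linker so the link step uses the matching mode.

```shell
# Hypothetical helper variable: "full" or "thin".
LTO_MODE="full"

CFLAGS="-O2 -flto=${LTO_MODE}"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-flto=${LTO_MODE}"

echo "CFLAGS=${CFLAGS}"
echo "LDFLAGS=${LDFLAGS}"
```

Switching LTO_MODE to thin changes all three variables at once, which makes it easier to test whether a failing package is sensitive to the LTO mode.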
In order to take full advantage of devirtualization, it's recommended to use link-time optimization.
Variables without initializers won't have common linkage. Common linkage implies a speed and size penalty, and is currently deprecated. It's harmless to use so it's defined here just in case.
Use GOT indirection instead of PLT to make external function calls. This leads to more efficient code by eliminating PLT stubs and exposing GOT loads to optimizations.
Sanitizers cause unwanted memory and CPU overhead. It's not possible to turn all sanitizers on at once, but they can be disabled all at once with this mighty flag.
For shared code, ELF allows interposing of symbols by the dynamic linker. This means that for symbols exported from the DSO, the compiler cannot perform inter-procedural propagation, inlining, and other optimizations. Telling the compiler to assume no interposition returns some of the performance stolen by PIC/PIE.
Stack smashing protection makes the compiler insert checks that detect stack buffer overflows at run time. The extra checks cause extra overhead so off with their heads.
WARNING: Clang-specific flag! Don't use it with GCC!
Remove dead virtual functions from vtables so that CGProfile metadata gets cleaned up correctly. It can only be used with full LTO because it needs to see every call to llvm.type.checked.load in the linkage unit, which ThinLTO doesn't support currently.
This requires -fwhole-program-vtables to function, which also requires -flto.
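The dependency chain described above can be sketched as a flag set plus a small sanity check. The check is a standalone shell snippet for illustration, not something make.conf itself would run. VFE_OK is a hypothetical variable name, and the -O2 base is a placeholder.

```shell
# Flag set (sketch): full LTO, whole-program vtables, then virtual function elimination.
CXXFLAGS="-O2 -flto=full -fwhole-program-vtables -fvirtual-function-elimination"

# Verify the prerequisites are present whenever the elimination flag is used.
VFE_OK="yes"
case " ${CXXFLAGS} " in
  *" -fvirtual-function-elimination "*)
    case " ${CXXFLAGS} " in
      *" -fwhole-program-vtables "*) ;;
      *) VFE_OK="no" ;;
    esac
    # Plain -flto selects full LTO in Clang; -flto=thin does not qualify.
    case " ${CXXFLAGS} " in
      *" -flto "*|*" -flto=full "*) ;;
      *) VFE_OK="no" ;;
    esac
    ;;
esac

echo "prerequisites met: ${VFE_OK}"
```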
WARNING: Clang-specific flag! Don't use it with GCC!
Enable whole-program vtable optimizations for classes with hidden LTO visibility.
This flag requires -flto.
FORTIFY_SOURCE provides lightweight compile-time and run-time protection for some memory and string functions. It's supposed to have little to no runtime overhead and can be enabled for all applications and libraries in an OS. -D_FORTIFY_SOURCE is the default option. If the extra overhead is undesirable, use this flag at the cost of some security.
WARNING: Clang-specific flag! Don't use it with GCC!
Optimize based on the strict rules for overwriting polymorphic objects in C++ and other object-oriented languages.
Make inlines hidden during compile time. When paired with -flto, hidden inlines become visible during link-time for better optimization.
Bind default-visibility defined symbols (or functions) locally for shared code. Use -Wl,-Bsymbolic-non-weak-functions when this causes issues.
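A fallback for a problem package could be sketched as a substitution on the global linker flags. This assumes the flag described here is -Wl,-Bsymbolic-functions, which the text does not name. The -fuse-ld=lld entry and the use of sed for the swap are likewise illustrative choices.

```shell
# Global default (sketch); -Wl,-Bsymbolic-functions is an assumed flag name.
LDFLAGS="-fuse-ld=lld -Wl,-Bsymbolic-functions"

# Hypothetical per-package fallback: swap in the narrower non-weak variant.
LDFLAGS="$(printf '%s\n' "${LDFLAGS}" | sed 's/-Wl,-Bsymbolic-functions/-Wl,-Bsymbolic-non-weak-functions/')"

echo "LDFLAGS=${LDFLAGS}"
```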
WARNING: LLD-specific flag! Don't use it with BFD!
Use zlib to compress the final code output. There are two useful levels: level 1 and level 2. Level 0 obviously disables size optimization. Level 1 is the fastest compression and level 2 is high compression, equal to zlib level 6.
Changes the default linker behavior from lazy to eager binding. This makes the code resolve all symbols at load.
Force relocation read-only. Define it here in case some builds try to override it.
Set DT_NEEDED only for shared libraries that are actually needed. If libraries aren't needed during link-time, the linker skips them, saving code size and unnecessary executions.
Collect garbage during link-time, removing unused symbols that can bloat the code. This helps keep code size smaller and more memory efficient.
WARNING: LLD-specific flag! Don't use it with BFD!
Fold identical code during link-time. This helps keep code size smaller and memory efficient. There are three levels: none, safe, and all. If all causes failures, try safe, and then try disabling the flag.
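The fallback order above (all, then safe, then off) can be sketched with a helper variable. ICF_LEVEL is a hypothetical name. The sketch assumes the flag is LLD's --icf= option passed through -Wl, and that LLD is selected with -fuse-ld=lld.

```shell
# Hypothetical helper variable: "all", "safe", or "none".
ICF_LEVEL="all"

LDFLAGS="-fuse-ld=lld -Wl,--icf=${ICF_LEVEL}"
echo "default:  ${LDFLAGS}"

# If a package misbehaves with "all", step down to "safe" before giving up on folding.
ICF_LEVEL="safe"
LDFLAGS_FALLBACK="-fuse-ld=lld -Wl,--icf=${ICF_LEVEL}"
echo "fallback: ${LDFLAGS_FALLBACK}"
```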
WARNING: LLD-specific flag! Don't use it with BFD!
Set the linker optimization pipeline level during link-time. There are four levels: level 0, level 1, level 2, and level 3. Level 3 is the maximum level; you can't rice beyond it. Higher levels add more passes and make some passes more aggressive.