-
-
Notifications
You must be signed in to change notification settings - Fork 4
Optimize Code Mode
Optimize Code Mode (OCM) is an experimental new mode added into ESBMC-AI that optimizes functions into smaller, more efficient forms. The mode attempts to optimize the function as best as possible without changing the layout of the code, so ideally, it is a drop-in optimizer for C/C++ functions. OCM relies on the ClangAST backend developed for interacting with source code in order to ensure that the optimized code is partially equivalent with the original source code. OCM can be summarized as the following steps:
- Initialize a ClangAST instance for the old source code.
- Get all functions of the old source code.
- Split the task into each function, so each function will be optimized separately.
- For each function to optimize:
- Optimize the function.
- Initialize a ClangAST instance for the new source code.
- Compare function signatures to ensure that they are equal. That is, the optimized new function should have the exact same signature as the old function signature.
- Perform partial equivalence checking to ensure that the original and optimized functions are partially equivalent.
- If equivalent, then perform step i for the next function but with the optimized source code, such that the source code gets optimized as each function is optimized. If not, then retry from step i for the same function without changing the original source code.
- Once all functions have been optimized, the whole source code should be optimized, the results are returned.
The partial equivalence checking process uses ESBMC as the backend, along with a function equivalence script. The function equivalence script is responsible for programming ESBMC to check for the partial equivalence of the original and optimized functions. The term “partial equivalence” is used because it refers to the fact that checking for full equivalence in the input/output space is undecidable with certain programs (loops). The equivalence script is populated by the old and new source code (original source code and optimized source code, respectively). In order to not have any identifier and type clashes in the code, any declarations of types and identifiers in the two source code files are renamed, using the ClangAST backend. The terms "_old" and "_new" appended at the end. This change makes the code able to be included in a single C source code file that compiles.
The following fields are inserted into the equivalence script in order to check the partial equivalence of the optimized function:
-
function_old
: The original source code is inserted. -
function_new
: The new optimized source code is inserted. -
parameters_list
: Using ClangAST, the parameters for the functions are initialized and placed here. The same__VERIFIER_nondet_X
is used for both function by assigning it to a common variable first. -
function_call_old
andfunction_call_new
: The old function and new function invocation code is inserted here, respectively. The variables fromparameter_list
are also inserted. The results from the functions are inserted into a new variable. -
function_assert_old
andfunction_assert_new
: The results from the functions are compared in this assert statement. Due to the nature of__VERIFIER_nondet_X
, the entire input space of the function is explored, this means that if an input doesn't match an output, ESBMC will not successfully verify the source code.
The mode can be configured with the following options.
array_expansion
defines the continuous memory that arrays and pointers should be initialized to. The current system is very limited, in the future, work needs to be done to resolve the limitations of function parameter pointers being only continuous memory.
init_max_depth
defines the max depth that structs will be initialized into. After a pointer is encountered beyond depth, a NULL
value will be assigned.
partial_equivalence_check
defines the mode that the resulting value of the optimized and original result are checked. The following options are available:
-
basic
: Performs a basic equality check of the original and optimized result –assert(result_old == result_new);
. This method is basic in its nature, however, it is the fastest method. -
deep
: Will traverse the original and optimized result depth first and perform equality checks. This ensures that the entire struct is checked. When a pointer is encountered, it will dereference it and explore its elements. If it is an array, then the array elements will be explored depth first as well. The depth of initialization and width of array exploration are defined in the config by the valuesinit_max_depth
andarray_expansion
respectively.
TBD
Currently, the following C/C++ language features are not supported by Optimize Code Mode. The list might be incomplete.
- Typedefs: They are not renamed.
- No C++ Features supported: Classes, Templates, etc.
ESBMC-AI made by Yiannis Charalambous
• https://yiannis-charalambous.com • https://github.com/Yiannis128 •