PreCA performs precondition inference relying on constraint acquisition. PreCA does not need the source code of the function under analysis but only the binary. PRECA is part of the BINSEC toolbox for binary-level program analysis and is build on top of the constraint acquisition plateform.
- Requirements
- Installation
- Usage
- Experiments
- Run your own example
- VM artifact (OUTDATED)
- Cite us
- References
The PreCA framework depends on:
- The Java Runtime Environment
- Maven if you want to recompile PreCA
- Binsec (>= 0.8.1) with Unisim*
- Python 3 (to run the experiments)
*We recommend to install Binsec through opam, by following these steps:
cd path/to/preca
opam switch create . <OCAML_VERSION> # OCAML_VERSION < 5.0.0
opam pin ./binsecplugin/concrete/ # pin the libconcrete package and install its dependencies (Binsec, Unisim ...)
To install the python dependencies, run:
python3 -m venv path/to/venv
source path/to/venv/bin/activate
pip install -r requirements.txt
To use PreCA the PRECA_PATH
environment variable should be set:
# You can put the following line in your .bashrc file (or equivalent file)
export PRECA_PATH=/path/to/preca/directory
The preca.jar
file is available to use PreCA without compiling it. See the usage section to found out how to use it.
You can compile the code using maven (the generated jar file is then in the target
directory):
mvn clean compile assembly:single
cp target/<jarfile> preca.jar
You can get PreCA's available options:
java -jar preca.jar -help
PreCA takes as input the path to the configuration file as follows:
java -jar preca.jar -file <config>
Such a configuration file enables to specify which binary and which function is analyzed. You can get the documentation for the configuration file format with:
java -jar preca.jar -helpconf
To check that your install works correctly, you can run PreCA over the given strcat example:
java -ea -jar preca.jar -file ./examples/strcat_conacq.txt
The output should end with:
*************Learned Network CL example ******
network var=4 cst=3
-------------------------
CONSTRAINTS:
Valid(var0)
Valid(var1)
NotOverlap(var0, var1)_or_StrlenEq0(var1)[0, 1, 2, 3]
-------------------------
*************Learned Network CL example (SMTLIB) ******
(and (valid v0) (valid v1) (or (not (overlap v0 v1)) (strleneq v1 #x00000000)))
We provide the needed datasets in the datasets
directory and the script scripts/bench.py
to replicate the results from our IJCAI'22 and KR'23 papers.
The help is available through
python3 ./scripts/bench.py -h
Moreover, after running your experiments, you can always recompute the statistics as follows:
python3 ./scripts/recompute_stats.py --file <json-file> --timeout <seconds>
Finally, since our IJCAI'22 paper, we added new constraint to our constraint language. Still, as both our IJCAI'22 and KR'23 papers consider the original set of constraint, it is possible retrict to it through the --ijcai22
option.
To replicate experiments from our IJCAI'22 paper [1] run the following commands:
python3 ./scripts/bench.py --dataset ./datasets/ijcai22/nopost --timeout 3600 --emulto 5 --out <out json> --disj auto --biaslvl max --ijcai22
python3 ./scripts/bench.py --dataset ./datasets/ijcai22/post --timeout 3600 --emulto 5 --out <out json> --disj auto --biaslvl max --ijcai22
To replicate results from our KR'23 paper [2], run the following commands:
python3 ./scripts/bench.py --dataset ./datasets/kr2023 --timeout 3600 --emulto 5 --out <out json> --disj <disj> --biaslvl <lvl> --ijcai22
The <disj>
option states which methods is used to handle disjunction (see the following table).
The <lvl>
states the size of the bias considered and takes value among min
, avg
and max
.
Abbreviation | Description |
---|---|
auto | We rely on the heuristics from [1] to generate the considered disjunctions |
mss | we use DCA [2] to handle disjunctions |
n : int | add disjunctions of size up to n |
{ i1, ..., in } | add disjunctions of size i1, ..., in |
To run your own example, you must create a config file and a Binsec script associated to the function you want to analyze.
You can take example to the given examples/strcat_conacq.txt
and examples/strcat_dca.txt
configuration files,
as well as their corresponding Binsec script (datasets/binsec_ini/ijcai22/nopost/strcat.ini
).
For more examples, check the datasets
directory.
The PreCA artifact for the paper [1] was first published as a a virtual machine (~2.6G), which is available here. It contains the PreCA jar file, the datasets, and scripts to exercise PreCA and replay major experiments. The user and root password is "password".
@inproceedings{DBLP:conf/ijcai/MenguyBLG22,
author = {Gr{\'{e}}goire Menguy and
S{\'{e}}bastien Bardin and
Nadjib Lazaar and
Arnaud Gotlieb},
editor = {Luc De Raedt},
title = {Automated Program Analysis: Revisiting Precondition Inference through
Constraint Acquisition},
booktitle = {Proceedings of the Thirty-First International Joint Conference on
Artificial Intelligence, {IJCAI} 2022, Vienna, Austria, 23-29 July
2022},
pages = {1873--1879},
publisher = {ijcai.org},
year = {2022},
url = {https://doi.org/10.24963/ijcai.2022/260},
doi = {10.24963/ijcai.2022/260},
timestamp = {Wed, 27 Jul 2022 16:43:00 +0200},
biburl = {https://dblp.org/rec/conf/ijcai/MenguyBLG22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}