This is a package of GraphSPD, a graph-based model to detect security patches with their code property graphs. More information can be found in our paper "GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics", which will appear in the 44th IEEE Symposium on Security and Privacy (IEEE S&P 2023), San Francisco, CA, May 22-26, 2023.
If you are using GraphSPD for work that will result in a publication (thesis, dissertation, paper, article), please use the following citation.
@inproceedings{wang2022graphspd,
title = {GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics},
author = {Shu Wang and Xinda Wang and Kun Sun and Sushil Jajodia and Haining Wang and Qi Li},
booktitle = {2023 IEEE Symposium on Security and Privacy (SP)},
year = {2023},
pages = {2409-2426},
doi = {10.1109/SP46215.2023.00035},
url = {https://doi.ieeecomputersociety.org/10.1109/SP46215.2023.00035},
publisher = {IEEE Computer Society},
address = {Los Alamitos, CA, USA},
month = {May}
}
or
Shu Wang, Xinda Wang, Kun Sun, Sushil Jajodia, Haining Wang, and Qi Li, "GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics," 2023 44th IEEE Symposium on Security and Privacy (S&P), San Francisco, CA, US, 2023 pp. 2409-2426. doi: 10.1109/SP46215.2023.00035
We developed GraphSPD and released the source code, aiming to help admins and developers identify silent security patches and prioritize their development. We hold no liability for any undesirable consequences of using the software package.
GraphSPD is released under the Apache License 2.0 (refer to the LICENSE file for details)
The ./joern/
library is based on Jeorn, which is also under the Apache License 2.0.
Download Ubuntu 20.04.4 LTS (Focal Fossa) desktop version .iso
file and install the OS.
Note: To avoid potential version conflicts of Python and Java on existing OS, we suggest installing a new Ubuntu virtual machine on VMM (e.g., VirtualBox 5.2.24, VMware, etc.).
Suggested system configurations:
(Our package can run on pure CPU environments)
RAM: >2GB
Disk: >30GB
CPU: >1 core
Install git
tool if you have not intalled it.
sudo apt install git
Download the project folder into the user's HOME
directory.
cd ~
git clone https://github.com/SunLab-GMU/GraphSPD.git
cd ~/GraphSPD/
chmod +x install_dep.sh
./install_dep.sh
Note: All commands are executed under the main project folder:
~/GraphSPD/
.
We provide 10 patch as test examples in ~/GraphSPD/raw_patch/
.
To retrive their pre- and post-patch files, run the following commands under ~/GraphSPD/
:
python3 get_ab_file.py nginx nginx 02cca547
python3 get_ab_file.py nginx nginx 661e4086
python3 get_ab_file.py nginx nginx 9a3ec202
python3 get_ab_file.py nginx nginx dac90a4b
python3 get_ab_file.py nginx nginx fc785b12
python3 get_ab_file.py nginx nginx 60a8ed26
python3 get_ab_file.py nginx nginx bd7dad5b
python3 get_ab_file.py nginx nginx 4c5a49ce
python3 get_ab_file.py nginx nginx 71eb19da
python3 get_ab_file.py nginx nginx 56f53316
In the above, the third column refers to the owner (i.e., nginx
), the fourth column refers to the repository (i.e., nginx
), and the last column refers to the commit ID (e.g., 02cca547
).
As the result, the pre-patch and post-patch files will be stored in ~/GraphSPD/ab_file/
.
You can pre-process your own patches by running:
python3 get_ab_file.py [owner] [repository] [commitID]
where [owner]
, [repository]
, and [commitID]
are the owner name, repository name, and commit ID of the patch hosted on GitHub.
To generate PatchCPGs for all the patches processed by the last step, please run the following commands under ~/GraphSPD/
:
chmod -R +x ./joern
sudo python3 gen_cpg.py
python3 merge_cpg.py
Here, gen_cpg.py
will generate two CPGs for pre- and post-patch files, respectively.
merge_cpg.py
will generate a merged PatchCPG from the two CPGs.
The output PatchCPGs will be saved in ~/GraphSPD/testdata/
.
In ~/GraphSPD/
, run the command:
python3 test.py
The prediction results are saved in file ~/GraphSPD/logs/test_results.txt
.
See the results by running:
cat logs/test_results.txt
The prediction results contains the PatchCPG file path and the predictions, where 1 represents security patch and 0 represents non-security patch.
filename,prediction
./testdata/fc785b12/out_slim_ninf_noast_n1_w.log,0
./testdata/dac90a4b/out_slim_ninf_noast_n1_w.log,1
./testdata/60a8ed26/out_slim_ninf_noast_n1_w.log,1
./testdata/71eb19da/out_slim_ninf_noast_n1_w.log,0
./testdata/9a3ec202/out_slim_ninf_noast_n1_w.log,1
./testdata/bd7dad5b/out_slim_ninf_noast_n1_w.log,1
./testdata/661e4086/out_slim_ninf_noast_n1_w.log,1
./testdata/02cca547/out_slim_ninf_noast_n1_w.log,0
./testdata/4c5a49ce/out_slim_ninf_noast_n1_w.log,0
./testdata/56f53316/out_slim_ninf_noast_n1_w.log,0
[1] PatchDB: A Large-Scale Security Patch Dataset
[2] PatchRNN: A Deep Learning-Based System for Security Patch Identification
[3] Detecting "0-Day" Vulnerability: An Empirical Study of Secret Security Patch in OSS
The GraphSPD package is built by Sun Security Laboratory (SunLab) at George Mason University, Fairfax, VA.
Last Updated Date: Aug, 2022