Broken Promises: Measuring Confounding Effects in Learning-based Vulnerability Discovery

This repository contains the source code for our paper "Broken Promises: Measuring Confounding Effects in Learning-based Vulnerability Discovery", which was accepted at AISec '23.

Repository Structure

Experiments with the Causal Graph Model reside in CGIN, and experiments with the StackLSTM are in StackLSTM. Experiments with CodeT5+ and LineVul are in LLM. The Perturbations directory contains scripts that apply obfuscation and styling to produce the perturbed training data. The experiments using the graph-based model ReVeal were performed using this repository. We used and modified the original code from both LineVul and CodeT5 to fine-tune all our LLMs.

Requirements

For LLM:

Python
PyTorch
transformers
datasets
scikit-learn
numpy
matplotlib
tqdm
tree_sitter
sacrebleu==1.2.11

For CGIN:

Python
PyTorch
PyTorch Geometric
torch_scatter
numpy
networkx
scikit-learn
tqdm
gensim

For StackLSTM:

Python
PyTorch
tqdm
sctokenizer
scikit-learn
pickle (part of the Python standard library, no separate installation needed)
torchray
stacknn

For Perturbations (generating the perturbed dataset):

Download the file from here and move it to the folder Perturbations
Download the file from here and also move it to the folder Perturbations
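
Before running any experiment, it can help to confirm that the packages listed above are importable in the active environment. The following is a minimal sketch, not part of the repository: the import names (for example `torch` for PyTorch, `sklearn` for scikit-learn, `torch_geometric` for PyTorch Geometric) are assumptions based on common usage, and the `REQUIREMENTS` table and `missing_modules` helper are hypothetical names introduced for illustration. The sketch does not verify pinned versions such as sacrebleu==1.2.11.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed in the Requirements section.
# "Python" itself and the standard-library module pickle are omitted.
REQUIREMENTS = {
    "LLM": ["torch", "transformers", "datasets", "sklearn", "numpy",
            "matplotlib", "tqdm", "tree_sitter", "sacrebleu"],
    "CGIN": ["torch", "torch_geometric", "torch_scatter", "numpy",
             "networkx", "sklearn", "tqdm", "gensim"],
    "StackLSTM": ["torch", "tqdm", "sctokenizer", "sklearn", "torchray",
                  "stacknn"],
}


def missing_modules(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    # Report, per component, which assumed import names are not installed.
    for component, modules in REQUIREMENTS.items():
        missing = missing_modules(modules)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{component}: {status}")
```

Running the script prints one line per component; a component reported as "ok" has all of its assumed import names available, while any other line names the modules that still need to be installed.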

How to obtain support

Create an issue in this repository if you find a bug or have questions about the content.

For additional support, ask a question in SAP Community.

Contributing

If you wish to contribute code or offer fixes or improvements, please send a pull request. For legal reasons, contributors will be asked to accept a DCO when they create their first pull request to this project. This happens in an automated fashion during the submission process. SAP uses the standard DCO text of the Linux Foundation.

