pwr-pbr23/M7

M7

PBR23M7

Group members: Nikodem Kropielnicki, Bogusława Tlołka, Joanna Wdziękońska

Project based on VulCurator: A Vulnerability-Fixing Commit Detector

Missing files

Overleaf

Team Policies and Team Expectations Agreement

GoogleColab

Progress is tracked in GitHub Projects

Reproduction of results:

You can reproduce the results manually (instructions below) or via GoogleColab by running the notebook commands in sequence. After each classifier is trained, its result is the F1 score reported at epoch 19 (for the message, patch and issue classifiers); for the ensemble, it is the F1 score printed at the end of the run.
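For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall for the positive class. The sketch below is illustrative only: it is not taken from the project scripts, the function name `f1_score` is a hypothetical helper, and the labels are made-up toy data rather than project results.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1 score for the positive class from two label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, invented for illustration (1 = vulnerability-fixing commit).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(f"F1 = {f1_score(y_true, y_pred):.3f}")
```

When comparing against the numbers printed by the training scripts, use the value they report; this helper only shows how such a score is derived from predictions.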

Note: Some files were too large to store on GitHub; they are available through the Missing files link above. The GoogleColab notebook has a section that uploads them to the project.

For the TensorFlow dataset:

To train message classifier:

python message_classifier.py --dataset_path tf_vuln_dataset.csv --model_path model/tf_message_classifier.sav

To train issue classifier:

python issue_classifier.py --dataset_path tf_vuln_dataset.csv --model_path model/tf_issue_classifier.sav

To finetune CodeBERT for patch classifier:

python vulfixminer_finetune.py --dataset_path tf_vuln_dataset.csv --finetune_model_path model/tf_patch_vulfixminer_finetuned_model.sav

To train patch classifier:

python vulfixminer.py --dataset_path tf_vuln_dataset.csv --model_path model/tf_patch_vulfixminer.sav --finetune_model_path model/tf_patch_vulfixminer_finetuned_model.sav --train_prob_path probs/tf_patch_vulfixminer_train_prob.txt --test_prob_path probs/tf_patch_vulfixminer_test_prob.txt

To run ensemble classifier:

python variant_ensemble.py --config_file tf_dataset.conf
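The five TensorFlow steps above can also be chained from one script. The following is a minimal sketch, not part of the repository: `TF_STEPS` and `run_steps` are hypothetical names, the arguments are copied from the commands above, and it assumes the script is started from the repository root. It defaults to a dry run that only prints the commands.

```python
import subprocess
import sys

# Arguments copied from the TensorFlow commands in this README, in order.
TF_STEPS = [
    ["message_classifier.py", "--dataset_path", "tf_vuln_dataset.csv",
     "--model_path", "model/tf_message_classifier.sav"],
    ["issue_classifier.py", "--dataset_path", "tf_vuln_dataset.csv",
     "--model_path", "model/tf_issue_classifier.sav"],
    ["vulfixminer_finetune.py", "--dataset_path", "tf_vuln_dataset.csv",
     "--finetune_model_path", "model/tf_patch_vulfixminer_finetuned_model.sav"],
    ["vulfixminer.py", "--dataset_path", "tf_vuln_dataset.csv",
     "--model_path", "model/tf_patch_vulfixminer.sav",
     "--finetune_model_path", "model/tf_patch_vulfixminer_finetuned_model.sav",
     "--train_prob_path", "probs/tf_patch_vulfixminer_train_prob.txt",
     "--test_prob_path", "probs/tf_patch_vulfixminer_test_prob.txt"],
    ["variant_ensemble.py", "--config_file", "tf_dataset.conf"],
]


def run_steps(steps, dry_run=True):
    """Print each command; when dry_run is False, also run it and stop on failure."""
    commands = [[sys.executable, *step] for step in steps]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a step exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_steps(TF_STEPS, dry_run=True)
```

Pass `dry_run=False` to actually train. The order matters because the patch classifier reads the fine-tuned model written by the previous step.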

Similarly, for the SAP dataset (some classifiers take about 1.5 hours on GoogleColab):

To train message classifier:

python message_classifier.py --dataset_path sub_enhanced_dataset_th_100.txt --model_path model/sap_message_classifier.sav

To train issue classifier:

python issue_classifier.py --dataset_path sub_enhanced_dataset_th_100.txt --model_path model/sap_issue_classifier.sav

To finetune CodeBERT for patch classifier:

python vulfixminer_finetune.py --dataset_path sap_patch_dataset.csv --finetune_model_path model/sap_patch_vulfixminer_finetuned_model.sav

To train patch classifier:

python vulfixminer.py --dataset_path sap_patch_dataset.csv --model_path model/sap_patch_vulfixminer.sav --finetune_model_path model/sap_patch_vulfixminer_finetuned_model.sav --train_prob_path probs/sap_patch_vulfixminer_train_prob.txt --test_prob_path probs/sap_patch_vulfixminer_test_prob.txt

To run ensemble classifier:

python variant_ensemble.py --config_file sap_dataset.conf

For the MSR dataset:

To train message classifier:

python message_classifier.py --dataset_path partycje.json --model_path model/msr_message_classifier.sav

To train issue classifier:

python issue_classifier.py --dataset_path partycje.json --model_path model/msr_issue_classifier.sav

To finetune CodeBERT for patch classifier:

python vulfixminer_finetune.py --dataset_path partycje.json --finetune_model_path model/msr_patch_vulfixminer_finetuned_model.sav

To train patch classifier:

python vulfixminer.py --dataset_path partycje.json --model_path model/msr_patch_vulfixminer.sav --finetune_model_path model/msr_patch_vulfixminer_finetuned_model.sav --train_prob_path probs/msr_patch_vulfixminer_train_prob.txt --test_prob_path probs/msr_patch_vulfixminer_test_prob.txt

To run ensemble classifier:

python variant_ensemble.py --config_file msr_dataset.conf
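Because some input files have to be downloaded separately, it can help to check that everything is in place before starting a long training run. The sketch below is an assumption-laden helper, not part of the repository: `DATASET_FILES`, `OUTPUT_DIRS` and `prepare` are hypothetical names, the file names are taken from the commands above, and it assumes the scripts may not create the `model` and `probs` directories themselves. Which of these files actually come from the Missing files link is not stated here, so the helper simply reports anything absent.

```python
from pathlib import Path

# Input files referenced by the commands in this README, grouped by dataset.
DATASET_FILES = {
    "tensorflow": ["tf_vuln_dataset.csv", "tf_dataset.conf"],
    "sap": ["sub_enhanced_dataset_th_100.txt", "sap_patch_dataset.csv",
            "sap_dataset.conf"],
    "msr": ["partycje.json", "msr_dataset.conf"],
}

# Directories the commands write models and probabilities into.
OUTPUT_DIRS = ["model", "probs"]


def prepare(dataset, root="."):
    """Create the output directories and return the missing input files."""
    root = Path(root)
    for name in OUTPUT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return [f for f in DATASET_FILES[dataset] if not (root / f).is_file()]


if __name__ == "__main__":
    for dataset in DATASET_FILES:
        missing = prepare(dataset)
        status = "ready" if not missing else "missing: " + ", ".join(missing)
        print(f"{dataset}: {status}")
```

Run it from the repository root; a dataset listed as "ready" has all the inputs its commands refer to.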
