Given the limitations of existing vulnerability prediction systems, our project aims to develop an engineered ML pipeline for training, validating, and exporting a vulnerability prediction model that could be employed within DevOps, with a particular focus on the usability, accessibility, and quality of the product.
A possible version of the configuration file follows:
---
repo: https://path/to/Github/repository
dataset: "path/to/dataset.csv"
configurations:
  - 0:
      Feature Scaling: zscore
      Data Balancing: smote
      Classifier: svm
      Validation: ttsplit
      Explanation Method: permutation
  - 1:
      Feature Scaling: minmax
      Feature Selection: kbest
      K: 9
      Data Balancing: smote
      Classifier: randomforest
      Validation: ttsplit
      Explanation Method: permutation
  - 2:
      Feature Scaling: minmax
      Feature Selection: pearsoncorrelation
      Data Balancing: oversampling
      Classifier: randomforest
      Validation: kfold
      Explanation Method: confusionmatrix
statistical test:
  - 0: 1, 0
  - 1: 1, 2
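As a rough illustration of how such a file might be consumed, the sketch below works on a hypothetical in-memory form of the configuration, similar to what a YAML parser such as PyYAML's `yaml.safe_load` would be expected to return. It assumes that each `statistical test` entry names a pair of configuration ids to compare, which the file format suggests but does not state. The helper names `index_configurations` and `comparison_pairs` are illustrative only and are not part of the project.

```python
# Hypothetical parsed form of the configuration file shown above.
# Assumption: each "statistical test" entry lists two configuration ids.
parsed = {
    "repo": "https://path/to/Github/repository",
    "dataset": "path/to/dataset.csv",
    "configurations": [
        {0: {"Feature Scaling": "zscore", "Data Balancing": "smote",
             "Classifier": "svm", "Validation": "ttsplit",
             "Explanation Method": "permutation"}},
        {1: {"Feature Scaling": "minmax", "Feature Selection": "kbest", "K": 9,
             "Data Balancing": "smote", "Classifier": "randomforest",
             "Validation": "ttsplit", "Explanation Method": "permutation"}},
        {2: {"Feature Scaling": "minmax",
             "Feature Selection": "pearsoncorrelation",
             "Data Balancing": "oversampling", "Classifier": "randomforest",
             "Validation": "kfold", "Explanation Method": "confusionmatrix"}},
    ],
    "statistical test": [{0: "1, 0"}, {1: "1, 2"}],
}


def index_configurations(parsed):
    """Flatten the list of single-key mappings into {config_id: settings}."""
    by_id = {}
    for entry in parsed["configurations"]:
        for config_id, settings in entry.items():
            by_id[int(config_id)] = settings
    return by_id


def comparison_pairs(parsed):
    """Return one (first_id, second_id) tuple per statistical test entry."""
    known = index_configurations(parsed)
    pairs = []
    for entry in parsed.get("statistical test", []):
        for test_id, raw in entry.items():
            # A value such as "1, 0" is split into two integer ids.
            ids = [int(part) for part in str(raw).split(",")]
            if len(ids) != 2:
                raise ValueError(f"test {test_id}: expected two ids, got {raw!r}")
            missing = [i for i in ids if i not in known]
            if missing:
                raise ValueError(f"test {test_id}: unknown configuration {missing}")
            pairs.append(tuple(ids))
    return pairs


print(comparison_pairs(parsed))  # [(1, 0), (1, 2)]
```

Rejecting unknown ids early means a typo in the `statistical test` section surfaces before any model is trained.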
The possible values for each node are:
- Data Cleaning (not exclusive)
  - dataimputation
  - shuffling
  - duplicatesremoval
- Feature Scaling
  - zscore
  - minmax
- Feature Selection
  - kbest (requires the K parameter)
    - K
  - variancethreshold
  - pearsoncorrelation
- Data Balancing
  - smote
  - nearmiss
  - undersampling
  - oversampling
- Classifier
  - svm
  - randomforest
  - kneighbors
- Validation
  - ttsplit
  - kfold
  - stratifiedfold
- Explanation Method (not exclusive)
  - confusionmatrix
  - permutation
  - partialdependence
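The allowed values listed above lend themselves to a simple validation step before the pipeline runs. The following is a minimal sketch, not the project's actual implementation: the names `ALLOWED`, `NON_EXCLUSIVE`, and `validate` are hypothetical, and it assumes that a non-exclusive node may be given either as a single string or as a list of strings.

```python
# Allowed values per node, copied from the list above.
ALLOWED = {
    "Data Cleaning": {"dataimputation", "shuffling", "duplicatesremoval"},
    "Feature Scaling": {"zscore", "minmax"},
    "Feature Selection": {"kbest", "variancethreshold", "pearsoncorrelation"},
    "Data Balancing": {"smote", "nearmiss", "undersampling", "oversampling"},
    "Classifier": {"svm", "randomforest", "kneighbors"},
    "Validation": {"ttsplit", "kfold", "stratifiedfold"},
    "Explanation Method": {"confusionmatrix", "permutation",
                           "partialdependence"},
}

# Nodes marked "not exclusive" may carry several values (assumed: a list).
NON_EXCLUSIVE = {"Data Cleaning", "Explanation Method"}


def validate(settings):
    """Return a list of error messages; an empty list means valid."""
    errors = []
    for node, value in settings.items():
        if node == "K":
            continue  # K is a parameter of kbest, checked separately below
        if node not in ALLOWED:
            errors.append(f"unknown node: {node}")
            continue
        values = value if isinstance(value, list) else [value]
        if len(values) > 1 and node not in NON_EXCLUSIVE:
            errors.append(f"{node} accepts a single value")
        for item in values:
            if item not in ALLOWED[node]:
                errors.append(f"{node}: unsupported value {item!r}")
    # kbest is the only option that needs an extra parameter.
    if settings.get("Feature Selection") == "kbest":
        k = settings.get("K")
        if not (isinstance(k, int) and k > 0):
            errors.append("kbest requires a positive integer K")
    return errors


print(validate({"Feature Selection": "kbest", "Classifier": "svm"}))
# ['kbest requires a positive integer K']
```

Collecting every problem into a list, rather than raising on the first one, lets a user fix a whole configuration in a single pass.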