Code and presentation for the oral exam in Machine Learning and Statistics I in January 2019, at the University of Manchester.
The data on which the project is based can be downloaded from https://www.kaggle.com/mlg-ulb/creditcardfraud. Note that a smaller subsample of the data was used to produce the results presented here.
Three versions of a logistic regression model were built:
- Model 1 (Simple): feature selection based on correlation
- Model 2: feature selection based on correlation, with undersampling to handle class imbalance
- Model 3: feature selection using recursive feature elimination (RFE) and Lasso, with undersampling to handle class imbalance
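
The two preprocessing ideas shared by the models above, correlation-based feature selection and random undersampling of the majority class, can be illustrated with a minimal standard-library sketch. This is not the project's actual code: the function names, the 0.1 correlation threshold, and the toy data are all assumptions made for illustration, and the real project may well use a library such as scikit-learn instead.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(rows, labels, threshold=0.1):
    """Return indices of features whose |correlation| with the label
    exceeds `threshold` (the threshold value is an assumption)."""
    selected = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        if abs(pearson(column, labels)) > threshold:
            selected.append(j)
    return selected


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data (hypothetical): 10% positives; feature 0 tracks the label,
# feature 1 is constant and should be discarded.
labels = [1 if i % 10 == 0 else 0 for i in range(100)]
rows = [[float(y) + 0.01 * (i % 3), 1.0] for i, y in enumerate(labels)]

selected = select_by_correlation(rows, labels)
balanced_rows, balanced_labels = undersample(rows, labels)
```

In a setup like Model 2, the selected feature indices and the balanced rows would then be passed to a logistic regression fit. Undersampling should be applied to the training split only, so that the test set keeps the original class imbalance.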