The objective of this project is to use machine learning techniques such as resampling, SMOTEENN, and ensemble classifiers to analyze LoanStats data and identify the model that best predicts credit risk.
- Balanced Accuracy Score: 63.39%
- Precision Score - High Risk: 0.01
- Precision Score - Low Risk: 1.00
- Recall Score - High Risk: 69%
- Recall Score - Low Risk: 61%
- Balanced Accuracy Score: 63.07%
- Precision Score - High Risk: 0.01
- Precision Score - Low Risk: 1.00
- Recall Score - High Risk: 60%
- Recall Score - Low Risk: 66%
- Balanced Accuracy Score: 52.95%
- Precision Score - High Risk: 0.01
- Precision Score - Low Risk: 1.00
- Recall Score - High Risk: 61%
- Recall Score - Low Risk: 45%
- Balanced Accuracy Score: 63.76%
- Precision Score - High Risk: 0.01
- Precision Score - Low Risk: 1.00
- Recall Score - High Risk: 70%
- Recall Score - Low Risk: 57%
- Balanced Accuracy Score: 78.37%
- Precision Score - High Risk: 0.01
- Precision Score - Low Risk: 1.00
- Recall Score - High Risk: 67%
- Recall Score - Low Risk: 89%
- Balanced Accuracy Score: 91.78%
- Precision Score - High Risk: 0.01
- Precision Score - Low Risk: 1.00
- Recall Score - High Risk: 89%
- Recall Score - Low Risk: 94%
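
The scores listed above can all be derived from a confusion matrix. The sketch below shows one way to compute them in plain Python. The counts are hypothetical, chosen only to mimic the pattern in these results (very few high-risk loans among many low-risk ones), and are not the project's actual confusion matrix. The function name `risk_metrics` is also made up for illustration.

```python
def risk_metrics(tp, fn, fp, tn):
    """Compute per-class precision and recall plus balanced accuracy.

    "Positive" means high risk here:
      tp = high-risk loans predicted high risk
      fn = high-risk loans predicted low risk
      fp = low-risk loans predicted high risk
      tn = low-risk loans predicted low risk
    """
    recall_high = tp / (tp + fn)
    recall_low = tn / (tn + fp)
    return {
        "precision_high": tp / (tp + fp),
        "precision_low": tn / (tn + fn),
        "recall_high": recall_high,
        "recall_low": recall_low,
        # Balanced accuracy is the mean of the per-class recalls.
        "balanced_accuracy": (recall_high + recall_low) / 2,
    }


# Hypothetical counts: 100 high-risk loans and 100,000 low-risk loans.
metrics = risk_metrics(tp=89, fn=11, fp=6000, tn=94000)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With counts this imbalanced, even a small share of misclassified low-risk loans outnumbers the correctly flagged high-risk loans, which is one way a model can combine high recall with a high-risk precision that rounds to 0.01.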
Of all the machine learning models tested, the Easy Ensemble Classifier produced the highest balanced accuracy and the highest recall for both high-risk and low-risk loans. The second best is the Random Forest Classifier.
- Easy Ensemble Classifier
  - Balanced Accuracy Score: 91.78%
  - Recall Score - High Risk: 89%
  - Recall Score - Low Risk: 94%
- Random Forest Classifier
  - Balanced Accuracy Score: 78.37%
  - Recall Score - High Risk: 67%
  - Recall Score - Low Risk: 89%
I recommend using the Easy Ensemble Classifier, as it achieved the highest balanced accuracy and the highest recall for both risk classes of all the models tested.