Credit Risk Analysis

Overview of the analysis:

The objective of this project is to use machine learning techniques such as resampling, SMOTEENN, and ensemble classifiers to analyze LoanStats data and determine which model best predicts credit risk.

Results:

Random Oversampling

  • Balanced Accuracy Score: 63.39%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 69%
  • Recall Score - Low Risk: 61%
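
As a rough illustration of what random oversampling does, the sketch below duplicates randomly chosen minority-class rows until both classes are the same size. It is a standard-library toy, not the project's code: the function `random_oversample` and the sample data are hypothetical, and the real analysis would typically rely on a resampling library instead.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class. Hypothetical helper."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_res, y_res = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Sample with replacement to fill the gap to the majority count.
        for row in rng.choices(rows, k=target - count):
            X_res.append(row)
            y_res.append(label)
    return X_res, y_res


# Hypothetical toy data: 6 low-risk rows and 2 high-risk rows.
X = [[1.0], [1.2], [0.9], [1.1], [1.3], [0.8], [5.0], [5.5]]
y = ["low_risk"] * 6 + ["high_risk"] * 2

X_res, y_res = random_oversample(X, y)
class_counts = Counter(y_res)
print(class_counts)
```

Only the training split should be resampled this way; the test split keeps its original imbalance so the reported scores reflect real-world class proportions.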

SMOTE Oversampling

  • Balanced Accuracy Score: 63.07%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 60%
  • Recall Score - Low Risk: 66%
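
Unlike random oversampling, SMOTE creates new minority-class points by interpolating between existing ones. The following is a minimal sketch of that idea under simplifying assumptions (Euclidean distance, a tiny `k`, made-up data); `smote_sketch` is a hypothetical name and this is not the implementation used to produce the scores above.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority points.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical high-risk points with two features each.
high_risk = [[5.0, 1.0], [5.5, 1.2], [6.0, 0.9], [5.2, 1.1]]
new_points = smote_sketch(high_risk, n_new=4)
print(new_points)
```

Each synthetic point lies on the line segment between two real high-risk points, so the new rows stay inside the region the minority class already occupies.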

Cluster Centroids

  • Balanced Accuracy Score: 52.95%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 61%
  • Recall Score - Low Risk: 45%

SMOTEENN

  • Balanced Accuracy Score: 63.76%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 70%
  • Recall Score - Low Risk: 57%
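
A high-risk recall of 70% alongside a high-risk precision of 0.01 can look contradictory. The worked example below shows how heavy class imbalance can produce exactly that pattern. The test-set sizes (100 high-risk and 17,000 low-risk loans) are hypothetical values chosen for illustration; only the two recall figures come from the results above.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Hypothetical test set sizes (not taken from the LoanStats data).
n_high, n_low = 100, 17_000

tp = n_high * 70 // 100   # high-risk loans correctly flagged (recall 70%)
fn = n_high - tp          # high-risk loans missed
tn = n_low * 57 // 100    # low-risk loans correctly cleared (recall 57%)
fp = n_low - tn           # low-risk loans wrongly flagged as high risk

precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these assumed counts, 70 true positives are swamped by 7,310 false positives, so precision rounds to 0.01 even though most high-risk loans are caught.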

Random Forest Classifier

  • Balanced Accuracy Score: 78.37%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 67%
  • Recall Score - Low Risk: 89%

Easy Ensemble Classifier

  • Balanced Accuracy Score: 91.78%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 89%
  • Recall Score - Low Risk: 94%
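
Balanced accuracy is the average of the per-class recall scores, which is why it is a better fit for imbalanced data than plain accuracy. As a quick check, the sketch below recomputes it from the two rounded recall values listed for the Easy Ensemble Classifier; the helper name `balanced_accuracy` is just for illustration.

```python
def balanced_accuracy(recalls):
    """Mean of per-class recall scores."""
    return sum(recalls) / len(recalls)


# Rounded recall values reported above: high risk 0.89, low risk 0.94.
easy_ensemble = balanced_accuracy([0.89, 0.94])
print(f"{easy_ensemble:.3f}")
```

The result is close to the reported 91.78%; the small gap is expected because the recall values used here are rounded to two decimals.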

Summary:

Of all the machine learning models tested, the Easy Ensemble Classifier produced the highest balanced accuracy and the highest recall for both high-risk and low-risk loans. The Random Forest Classifier was second best by balanced accuracy.

  1. Easy Ensemble Classifier

Balanced Accuracy Score: 91.78%

Recall Score - High Risk: 89%

Recall Score - Low Risk: 94%

  2. Random Forest Classifier

Balanced Accuracy Score: 78.37%

Recall Score - High Risk: 67%

Recall Score - Low Risk: 89%

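I would recommend the Easy Ensemble Classifier, as it has the highest balanced accuracy (91.78%) and the highest recall for both risk classes of all the models tested. Note that high-risk precision is 0.01 for every model, so the large majority of loans flagged as high risk are false positives and would need further review.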
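
The ranking behind this recommendation can be reproduced directly from the balanced accuracy scores reported in the Results section. The snippet below is only a convenience sketch; the dictionary simply restates those six figures.

```python
# Balanced accuracy scores (%) as reported in the Results section.
results = {
    "Random Oversampling": 63.39,
    "SMOTE Oversampling": 63.07,
    "Cluster Centroids": 52.95,
    "SMOTEENN": 63.76,
    "Random Forest Classifier": 78.37,
    "Easy Ensemble Classifier": 91.78,
}

# Sort model names from highest to lowest balanced accuracy.
ranking = sorted(results, key=results.get, reverse=True)

for rank, name in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {results[name]:.2f}%")
```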