The objective of this analysis was to use machine learning models to accurately predict credit risk.

Lsuantah/Credit_Risk_Analysis

Credit Risk Analysis

Overview of the analysis:

The objective of this project is to use machine learning techniques, including resampling (random oversampling, SMOTE, Cluster Centroids), SMOTEENN and ensemble classifiers, to analyze LoanStats data and identify the model that best predicts credit risk.

Results:

Random Oversampling

  • Balanced Accuracy Score: 63.39%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 69%
  • Recall Score - Low Risk: 61%
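
As a rough illustration of what random oversampling does to the training data, here is a minimal pure-Python sketch. It is not the code used in this repository: the `random_oversample` helper and the toy data are hypothetical, and in practice a library class such as imbalanced-learn's `RandomOverSampler` would normally be used instead.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows belonging to this class; extra copies are drawn from here.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: 2 high-risk loans among 8 low-risk ones.
rows = [[i] for i in range(10)]
labels = ["high_risk"] * 2 + ["low_risk"] * 8

X_res, y_res = random_oversample(rows, labels)
balanced_counts = Counter(y_res)
print(balanced_counts)
```

Resampling of this kind is usually applied to the training split only, so that the test scores still reflect the original class imbalance.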

SMOTE Oversampling

  • Balanced Accuracy Score: 63.07%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 60%
  • Recall Score - Low Risk: 66%
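
The high-risk precision stays near 0.01 even when recall is 60% or more. The sketch below shows how that can happen on a heavily imbalanced test set. The confusion-matrix counts are hypothetical (this README does not state the test-set sizes); they were chosen only so that the recalls match the SMOTE figures above.

```python
# Hypothetical test set: 100 high-risk loans and 17,000 low-risk loans.
tp = 60      # high-risk loans correctly flagged
fn = 40      # high-risk loans missed
fp = 5780    # low-risk loans wrongly flagged as high risk
tn = 11220   # low-risk loans correctly passed

recall_high = tp / (tp + fn)        # share of high-risk loans that are caught
recall_low = tn / (tn + fp)         # share of low-risk loans that are passed
precision_high = tp / (tp + fp)     # share of "high risk" flags that are right
precision_low = tn / (tn + fn)      # share of "low risk" calls that are right

print(f"recall high/low:    {recall_high:.2f} / {recall_low:.2f}")
print(f"precision high/low: {precision_high:.2f} / {precision_low:.2f}")
```

Because low-risk loans vastly outnumber high-risk ones in this assumed split, even a moderate false-positive rate produces far more false alarms than true high-risk hits, which pushes high-risk precision toward zero while low-risk precision rounds to 1.00.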

Cluster Centroids

  • Balanced Accuracy Score: 52.95%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 61%
  • Recall Score - Low Risk: 45%

SMOTEENN

  • Balanced Accuracy Score: 63.76%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 70%
  • Recall Score - Low Risk: 57%

Random Forest Classifier

  • Balanced Accuracy Score: 78.37%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 67%
  • Recall Score - Low Risk: 89%

Easy Ensemble Classifier

  • Balanced Accuracy Score: 91.78%
  • Precision Score - High Risk: 0.01
  • Precision Score - Low Risk: 1.00
  • Recall Score - High Risk: 89%
  • Recall Score - Low Risk: 94%

Summary:

Of all the machine learning models tested, the Easy Ensemble Classifier produced the highest balanced accuracy and the highest recall for both high-risk and low-risk loans. The Random Forest Classifier was second best.

  1. Easy Ensemble Classifier

Balanced Accuracy Score: 91.78%

Recall Score high risk: 89%

Recall Score low risk: 94%

  2. Random Forest Classifier

Balanced Accuracy Score: 78.37%

Recall Score high risk: 67%

Recall Score low risk: 89%

I recommend the Easy Ensemble Classifier, as it has the highest balanced accuracy and the highest recall for both risk classes of all the models tested.
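
For a two-class problem, balanced accuracy is the mean of the two per-class recalls, so the ranking in this summary can be roughly re-derived from the recall figures listed under Results. The sketch below does that; the `balanced_accuracy` helper is illustrative, and its outputs are approximations because the inputs are the rounded recalls from this README, so they need not match the reported scores exactly.

```python
def balanced_accuracy(recalls):
    """Mean of per-class recalls (illustrative helper)."""
    return sum(recalls) / len(recalls)


# (high-risk recall, low-risk recall) as listed in the Results section.
reported_recalls = {
    "Random Oversampling": (0.69, 0.61),
    "SMOTE Oversampling": (0.60, 0.66),
    "Cluster Centroids": (0.61, 0.45),
    "SMOTEENN": (0.70, 0.57),
    "Random Forest Classifier": (0.67, 0.89),
    "Easy Ensemble Classifier": (0.89, 0.94),
}

scores = {name: balanced_accuracy(r) for name, r in reported_recalls.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.1%}")
```

The top two entries of `ranking` are the Easy Ensemble Classifier and the Random Forest Classifier, in line with the recommendation above.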
