Estimating the probability of financial distress for borrowers using different models in Machine Learning

This was a free-choice assignment in an introductory Machine Learning course. We chose the Give Me Some Credit competition on kaggle.com.

We built three estimators with sklearn:

Logistic Regression
Support Vector Machine
Random Forest
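
As a rough illustration of how the three estimators can be set up side by side, here is a minimal sketch. It is not the code from the notebooks: the synthetic data from `make_classification`, the hyperparameters, and the dictionary names are assumptions made for the example, and it uses the import layout of a recent scikit-learn release rather than the versions listed further down.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the competition data (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters are illustrative defaults, not the tuned values.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    # probability=True makes SVC expose predict_proba.
    "support_vector_machine": SVC(probability=True, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Estimated probability of the positive ("distress") class per model.
distress_probability = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    distress_probability[name] = model.predict_proba(X_test)[:, 1]
```

Keeping the models in one dictionary makes it easy to run the same preprocessing and scoring over all three.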

We tested different data preprocessing methods: NaN values were either filled with the column mean or discarded. Outliers were eliminated manually by checking values against multiples of the standard deviation. Because the dataset is imbalanced, we tried to balance the classes by either oversampling or undersampling.
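
The preprocessing options described above could look roughly like the following pandas sketch. The function names, the `income` and `distress` columns, and the toy values are hypothetical and chosen only for illustration; the actual thresholds and column handling are in the notebooks.

```python
import numpy as np
import pandas as pd


def fill_with_column_means(df):
    """Replace NaN values with the mean of their column."""
    return df.fillna(df.mean())


def drop_missing_rows(df):
    """Alternative strategy: discard every row that contains a NaN."""
    return df.dropna()


def remove_outliers(df, columns, n_std=3.0):
    """Keep rows whose values lie within n_std standard deviations of the
    column mean. Rows with NaN in a checked column are dropped as well,
    so this is best applied after the missing values have been handled."""
    keep = pd.Series(True, index=df.index)
    for col in columns:
        deviation = (df[col] - df[col].mean()).abs()
        keep &= deviation <= n_std * df[col].std()
    return df[keep]


def undersample(df, target, random_state=0):
    """Randomly shrink every class to the size of the smallest class."""
    n_minority = df[target].value_counts().min()
    parts = [g.sample(n=n_minority, random_state=random_state)
             for _, g in df.groupby(target)]
    return pd.concat(parts)


def oversample(df, target, random_state=0):
    """Randomly resample every class (with replacement) up to the size
    of the largest class."""
    n_majority = df[target].value_counts().max()
    parts = [g.sample(n=n_majority, replace=True, random_state=random_state)
             for _, g in df.groupby(target)]
    return pd.concat(parts)


# Toy data (hypothetical): one missing value, one extreme value,
# and far fewer positive than negative labels.
toy = pd.DataFrame({
    "income": [10.0, 12.0, np.nan, 11.0, 9.0, 13.0,
               10.0, 12.0, 11.0, 10.0, 12.0, 500.0],
    "distress": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
})

filled = fill_with_column_means(toy)
cleaned = remove_outliers(filled, ["income"], n_std=2.0)
balanced_down = undersample(cleaned, "distress")
balanced_up = oversample(cleaned, "distress")
```

In practice, resampling should be applied to the training split only, so that the test set keeps the original class ratio.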

Because of the class imbalance, we also found the ROC-AUC score to be more informative than plain accuracy.
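
A small example shows why accuracy can mislead here. The labels below are made up (95 negatives, 5 positives), and the "model" is a hypothetical one that always predicts the majority class:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Made-up imbalanced labels: 95 borrowers without distress, 5 with.
y_true = np.array([0] * 95 + [1] * 5)

# A useless model that assigns everyone the same score / label.
always_zero_labels = np.zeros_like(y_true)
constant_scores = np.zeros(len(y_true), dtype=float)

accuracy = accuracy_score(y_true, always_zero_labels)
auc = roc_auc_score(y_true, constant_scores)

print("accuracy:", accuracy)  # 0.95, although no distress case is found
print("ROC-AUC:", auc)        # 0.5, i.e. no better than chance
```

Accuracy rewards the model for the majority class alone, while ROC-AUC measures how well positives are ranked above negatives and is therefore unaffected by the class ratio.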

The preprocessing steps and the three models can be found in the separate IPython notebooks.

As of writing this, the code runs with the following versions:

Python 2.7.10
IPython Notebook 3.2.1
scikit-learn 0.16.1
pandas 0.16.2
