This project addresses a binary classification problem: predicting which customers will churn, that is, which customers the company is about to lose. The data set consists of 51,047 instances and 58 attributes. From the company's point of view, the underlying problem is customer loss. By identifying likely churners in advance, the company can act to retain them and avoid losing profit and revenue.
The data is the open Cell2Cell data set, provided by the Teradata Center for Customer Relationship Management at Duke University.
The general library requirements for the project are listed below.
- Numpy
- Pandas
- Scikit-learn
- Matplotlib
- Seaborn
- Machine Learning Libraries
Some methods additionally require the following classifiers:
- DecisionTreeClassifier
- RandomForestClassifier
- GradientBoostingClassifier
- XGBClassifier
- LGBMClassifier
- CatBoostClassifier
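As a rough illustration of how the classifiers listed above might be collected for comparison, the following sketch builds a dictionary of candidate models. The dictionary name `models`, the seed value, and the decision to treat XGBoost, LightGBM, and CatBoost as optional imports are assumptions made for this example, not part of the project itself.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

SEED = 42  # assumed seed, chosen only to make runs repeatable

# Classifiers that ship with scikit-learn
models = {
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=SEED),
    "RandomForestClassifier": RandomForestClassifier(random_state=SEED),
    "GradientBoostingClassifier": GradientBoostingClassifier(random_state=SEED),
}

# The remaining classifiers live in separate packages, so each one is
# added only when its package is installed.
try:
    from xgboost import XGBClassifier
    models["XGBClassifier"] = XGBClassifier(random_state=SEED)
except ImportError:
    pass

try:
    from lightgbm import LGBMClassifier
    models["LGBMClassifier"] = LGBMClassifier(random_state=SEED)
except ImportError:
    pass

try:
    from catboost import CatBoostClassifier
    models["CatBoostClassifier"] = CatBoostClassifier(random_seed=SEED, verbose=0)
except ImportError:
    pass

print(sorted(models))
```

Every entry exposes the same `fit` / `predict_proba` interface, so the same training and evaluation loop can be applied to each model in turn.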
The project consists of the following steps:
- Data Understanding
- Exploratory Data Analysis (EDA)
- Train Test Validation Split
- Outlier Handling
- Missing Value Handling
- Feature Generation
- Encoding
- Normalization
- Train Test Validation (X, Y) Split
- Feature Importance
- Create Model
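The steps above can be sketched end to end on a small synthetic table. This is a minimal sketch under stated assumptions, not the project's actual code: the column names, the 60/20/20 split, percentile clipping for outliers, median imputation, one-hot encoding, min-max scaling, and the random forest are all illustrative choices, and the data is randomly generated rather than taken from Cell2Cell.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
n = 1000
# Synthetic stand-in for the churn data; all column names are hypothetical.
df = pd.DataFrame({
    "monthly_minutes": rng.gamma(2.0, 250.0, n),
    "monthly_revenue": rng.normal(60.0, 20.0, n),
    "credit_rating": rng.choice(["high", "medium", "low"], n),
})
df.loc[rng.choice(n, 50, replace=False), "monthly_revenue"] = np.nan
df["churn"] = (rng.random(n) < 0.3).astype(int)

# Separate features and target, then split 60/20/20 (assumed ratios).
X, y = df.drop(columns="churn"), df["churn"]
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0)

num_cols = ["monthly_minutes", "monthly_revenue"]
# Statistics come from the training split only, to avoid leakage.
low = X_train[num_cols].quantile(0.01)
high = X_train[num_cols].quantile(0.99)
medians = X_train[num_cols].median()
scaler = MinMaxScaler()

def prepare(part, fit=False):
    part = part.copy()
    # Outlier handling: clip to the 1st-99th percentile range.
    part[num_cols] = part[num_cols].clip(lower=low, upper=high, axis=1)
    # Missing value handling: median imputation.
    part[num_cols] = part[num_cols].fillna(medians)
    # Feature generation: one derived ratio feature.
    part["revenue_per_minute"] = part["monthly_revenue"] / (part["monthly_minutes"] + 1.0)
    # Encoding: one-hot encode the categorical column.
    part = pd.get_dummies(part, columns=["credit_rating"])
    # Normalization: scale numeric features using the training range.
    scaled = num_cols + ["revenue_per_minute"]
    values = scaler.fit_transform(part[scaled]) if fit else scaler.transform(part[scaled])
    part[scaled] = values
    return part

X_train_p = prepare(X_train, fit=True)
# Align columns in case a category is absent from a smaller split.
X_val_p = prepare(X_val).reindex(columns=X_train_p.columns, fill_value=0)
X_test_p = prepare(X_test).reindex(columns=X_train_p.columns, fill_value=0)

# Create model and inspect feature importance.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train_p, y_train)
val_auc = roc_auc_score(y_val, model.predict_proba(X_val_p)[:, 1])
importance = pd.Series(model.feature_importances_, index=X_train_p.columns)
print(f"validation AUC: {val_auc:.3f}")
print(importance.sort_values(ascending=False))
```

Because the synthetic target is random, the validation score carries no meaning here; the point is only the order of operations and the fact that every preprocessing statistic is learned from the training split.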
Team Members |
---|
Sezen Duygu CEREN |
Furkan KARAKUZ |
Dila YAPICI |