Course-2

The task

Hello,

Over the past year or so, Credit One has seen an increase in the number of customers who have defaulted on loans they have secured from various partners, and Credit One, as their credit scoring service, could risk losing business if the problem is not solved right away. The bottom line is that they need a much better way to understand how much credit to allow someone to use or, at the very least, whether someone should be approved at all. They have enlisted the help of our Data Science team to design and implement a creative, empirically sound solution. It is very important that we all understand from the start that this is not a typical data analytics problem, as we have been given full authority to solve it with whatever tools and methods we need. As such, we've elected to use Python and a few main libraries to do the heavy lifting for us, but you should not limit yourselves to those. Feel free to investigate other libraries if you think they will contribute to the best solution.

Our first need is to define the problem within a data science framework and understand the differences between what we have been doing with data analytics and what we're going to be doing in this project with data science. Then you'll set up your local programming environment so that you can work without needing to be in one space or another to have access to the tools you need, before finally starting the analysis and solving this problem. I have attached some historical data that you'll be using for this task, so for now you can focus on understanding the problem and getting your environment ready. I'll be expecting a report on your experience and understanding of the problem in a few days.

Thanks,

GR

Second part of the task

Hello,

Now that you have established your process, you're ready to begin your work by preparing and exploring the data. Before we dive in, let's review some notes about the project:

Problem: Increase in customer default rates

- This is bad for Credit One since we approve the customers for loans in the first place.
- Revenue and customer loss for clients and, eventually, loss of clients for Credit One.

Investigative Questions:

- How do you ensure that customers can/will pay their loans?
- Can we do this?

As you progress through the tasks at hand, begin thinking about how to solve this problem. Here are some lessons we learned from a similar problem we addressed last year:

- We cannot control customer spending habits.
- We cannot always go from what we find in our analysis to the underlying "why".
- We must focus on the problem(s) we can solve:
  - What attributes in the data can we deem to be statistically significant to the problem at hand?
  - What concrete information can we derive from the data we have?
  - What proven methods can we use to uncover more information and why?

I'll be expecting a report on your experience in a few days.

Thanks,

GR
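
One possible way to approach the question of which attributes are statistically significant is a chi-squared test of independence between a categorical attribute and the default flag. The sketch below is illustrative only: the data is synthetic, the column names `education` and `default` are hypothetical rather than taken from the Credit One data, and SciPy is an assumed extra dependency.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Synthetic stand-in for the historical data; column names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
education = rng.choice(["graduate", "university", "high_school"], size=n)

# Assumption for illustration: default probability differs by education level.
base_rate = pd.Series(education).map(
    {"graduate": 0.10, "university": 0.20, "high_school": 0.35}
).to_numpy()
default = (rng.random(n) < base_rate).astype(int)
df = pd.DataFrame({"education": education, "default": default})

# Contingency table: attribute levels (rows) against the default flag (columns).
table = pd.crosstab(df["education"], df["default"])

# Chi-squared test of independence; a small p-value suggests an association.
chi2_stat, p_value, dof, expected = chi2_contingency(table.to_numpy())
print(f"chi2={chi2_stat:.1f}, dof={dof}, p={p_value:.3g}")
```

A small p-value indicates an association between the attribute and default, not a cause, which is consistent with the lesson that analysis does not always lead to the underlying "why".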

Skills Used for Course 2

- Applying Python to Data Science problems.
- Preprocessing data (e.g., feature engineering, addressing missing data).
- Using data mining tools and different classifiers to develop predictive models.
- Using the NumPy package for scientific computing with Python.
- Applying machine learning techniques to classification problems.
- Using the pandas open source library for Python.
- Optimizing classifiers by adjusting and testing classifier parameters.
- Using the Matplotlib Python 2D plotting library.
- Applying cross-validation methods.
- Assessing the predictive performance of classifiers by examining key error metrics.
- Using the scikit-learn machine learning library for Python.
- Comparing and selecting different predictive models.
- Applying predictive models to test sets.
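
As a rough illustration of how several of these skills fit together, the sketch below imputes missing values, tunes a classifier with cross-validation, and evaluates it on a held-out test set. It is a minimal example under stated assumptions, not the project's actual solution: the data is synthetic (generated with `make_classification`), the `feature_*` column names are hypothetical, and the random forest and its parameter grid are arbitrary choices.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the credit data (assumption: binary default label).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, random_state=0
)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(8)])

# Blank out about 5% of the values to mimic missing data.
rng = np.random.default_rng(0)
X = X.mask(rng.random(X.shape) < 0.05)

# Hold out a test set that is never seen during tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing and model in one pipeline so imputation is fit per CV fold.
pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("model", RandomForestClassifier(random_state=0)),
])

# Small, arbitrary grid of classifier parameters, scored by 5-fold CV.
param_grid = {"model__n_estimators": [50, 100], "model__max_depth": [3, None]}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Apply the best model to the test set and examine error metrics.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("best parameters:", search.best_params_)
print(f"test accuracy: {test_accuracy:.3f}")
print("confusion matrix:\n", cm)
```

Keeping the imputer inside the pipeline means it is refit on the training portion of each fold, so no information from the validation folds leaks into preprocessing. Other classifiers can be compared by swapping the `model` step and rerunning the same search.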