- Train the model
- Generate predictions
- Deploy the model to AWS using SageMaker
    |-- data
    |   |-- submission.csv
    |-- sagemaker
    |   |-- data
    |   |-- models
    |   |-- AWS - SageMaker.ipynb
    |   |-- featurizer_local.py
    |   |-- featurizer_remote.py
    |-- src
    |   |-- models
    |       |-- baseline_model.py: class for training the binary classification model
    |       |-- transformers.py: ColumnSelector transformer for preprocessing
    |-- api_request_exmpl.json
    |-- pickled_model.pickle
    |-- Model.ipynb: model evaluation notebook
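The `ColumnSelector` transformer in `src/models/transformers.py` lets the preprocessing pipeline pick a subset of DataFrame columns. A minimal sketch of such a transformer (the exact interface in the repo may differ; the example data is illustrative):

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ColumnSelector(BaseEstimator, TransformerMixin):
    """Select a subset of DataFrame columns inside an sklearn pipeline."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        return X[self.columns]


# Hypothetical example frame; the real dataset is private
df = pd.DataFrame({"age": [25, 40], "income": [30_000, 52_000], "name": ["a", "b"]})
selected = ColumnSelector(["age", "income"]).fit_transform(df)
```

Because it implements `fit`/`transform`, the selector composes with any downstream estimator in an sklearn `Pipeline`.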
- python3.8
- aws-cli
- A detailed explanation of the model training and validation can be found here.
- A CSV file with the resulting predictions can be found here.
- Read and explore the dataset (excluded from this repo for privacy reasons)
- Train a baseline default prediction model
- Evaluate and compare different models (Logistic Regression and tree-based ensemble models) with a set of metrics
- Estimate feature importance
- Generate predictions (`/data/submission.csv`)
- Dump the best estimator (`/pickled_model.pickle`)
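The evaluate-compare-dump steps above can be sketched as follows. The dataset is private, so synthetic data, the candidate models, and the ROC-AUC metric here are illustrative assumptions, not the notebook's exact setup:

```python
import pickle

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the private default-prediction dataset
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 8))
y = (X[:, 0] - X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=42)

# Candidate models: a linear baseline and a tree-based ensemble
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "gbm": GradientBoostingClassifier(random_state=42),
}
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

# Dump the best estimator, as the notebook does with pickled_model.pickle
best = candidates[max(scores, key=scores.get)]
with open("pickled_model.pickle", "wb") as f:
    pickle.dump(best, f)
```

In practice the notebook also compares several metrics rather than a single score; ROC-AUC stands in here for brevity.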
- Define Sagemaker session and role
- Preprocess the data and train the model
- Create SageMaker Scikit Estimator
- Batch transform training data
- Fit a Tree-based Model with the preprocessed data
- Serial Inference Pipeline with Scikit preprocessor and classifier
- Deploy model
- Make a request to the pipeline endpoint
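The serial inference pipeline chains the Scikit preprocessor and the classifier into one deployable model. Its local analogue is an sklearn `Pipeline`; the sketch below shows that composition with synthetic data and assumed component choices (the SageMaker version wraps the same idea in a `PipelineModel` of two fitted models):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed training data
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Preprocessor and classifier chained serially, mirroring the
# SageMaker serial inference pipeline
pipeline = Pipeline([
    ("preprocessor", StandardScaler()),
    ("classifier", RandomForestClassifier(n_estimators=50, random_state=0)),
])
pipeline.fit(X, y)
acc = pipeline.score(X, y)
```

A request to the deployed endpoint passes raw features through the same two stages, so clients never need to replicate the preprocessing.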
Besides AWS SageMaker, I tested other AWS deployment options, including the AWS Serverless Application Model (SAM).
It allows deploying ML models behind a serverless API (AWS Lambda).
To try AWS Lambda, I exposed an API endpoint backed by a hello-world application.
- ECR: Container & Registry
- AWS Lambda: Serving API
- SAM: Serverless Framework
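The hello-world application behind the Lambda endpoint boils down to a handler like the one below. This is a minimal sketch; the actual handler name and SAM template in the deployment are assumptions:

```python
import json


def lambda_handler(event, context):
    """Minimal AWS Lambda handler returning the hello-world payload."""
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "hello world"}),
    }
```

API Gateway invokes this handler for each request and relays the `body` back to the caller, which produces the response shown below.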
GET request:
https://x3jp27x3t3.execute-api.eu-central-1.amazonaws.com/test/hello
Expected response:
{
"message": "hello world"
}
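The same request can be issued from Python with the standard library alone. The sketch only builds the request object; the actual call (commented out) requires the endpoint to be deployed and live:

```python
from urllib import request

URL = "https://x3jp27x3t3.execute-api.eu-central-1.amazonaws.com/test/hello"

# Build the GET request against the API Gateway test stage
req = request.Request(URL, method="GET")

# Uncomment once the endpoint is live:
# with request.urlopen(req) as resp:
#     print(resp.read().decode())  # expected: {"message": "hello world"}
```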