added a virtual environment to make running simpler; README contains instructions on running
Roni Lazimi authored and Roni Lazimi committed Apr 25, 2021
1 parent b93eeea commit 158b70d
Showing 4 changed files with 97 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1 +1,3 @@
archive
.vscode
venv
16 changes: 15 additions & 1 deletion README.md
@@ -3,6 +3,20 @@
## Goal
build a classification model to determine whether or not a loan applicant will receive a loan given some data about the loan and some objective data about an individual

## Running This Notebook
### Steps
```
python3 -m venv ./venv
source venv/bin/activate
pip install -r requirements.txt
# open this notebook in a Jupyter processor
```
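To confirm that the environment created in the steps above is the one actually in use, a small check can be run from a terminal or from the first notebook cell. This is a minimal sketch; `in_virtualenv` is a hypothetical helper name, not something the repository defines.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints `False`, the notebook kernel is probably using the system interpreter rather than `venv`.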
### Links
- download [python3.9.4](https://www.python.org/downloads/release/python-394/)
- download [VSCode and its Jupyter processing extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter)


## Presenting
- a 6-minute live presentation will be given to the class

@@ -22,7 +36,7 @@ build a classification model to determine whether or not a loan applicant will r
## Implementation
- clean data
- transform our target dimension into a binary number (simplifying from multiple values to a single value)
-- for each dimension that has multiple possible string values, we will create a dimension; across each of these new dimensions, only one dimension will have a value of 1, and the rest will have 0's
+- for each dimension that has multiple possible string values, we will create a dimension using one-hot encoding
- create a class that represents a test
- each test contains a training algorithm and arguments to run the training algorithm
- results of run will be presented as a single-row numpy array (that can be hstacked onto the other results)
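
The test class described in the list above could look roughly like the sketch below. The names `ModelTest` and `MajorityClassifier` are hypothetical, and the stand-in classifier exists only so the example runs with NumPy alone. Any estimator class exposing `fit` and `score`, as scikit-learn classifiers do, could be passed in its place.

```python
import numpy as np


class MajorityClassifier:
    """Stand-in training algorithm (an assumption): always predicts the most common label."""

    def __init__(self, default=0):
        self.default = default

    def fit(self, X, y):
        values, counts = np.unique(y, return_counts=True)
        # Fall back to the default label if no training labels were given.
        self.prediction_ = values[np.argmax(counts)] if len(values) else self.default
        return self

    def score(self, X, y):
        # Accuracy of the constant prediction.
        return float(np.mean(np.asarray(y) == self.prediction_))


class ModelTest:
    """One test: a training algorithm plus the arguments used to run it."""

    def __init__(self, algorithm, **kwargs):
        self.algorithm = algorithm
        self.kwargs = kwargs

    def run(self, X_train, y_train, X_test, y_test):
        model = self.algorithm(**self.kwargs).fit(X_train, y_train)
        # Single-row array: [train accuracy, test accuracy].
        return np.array([[model.score(X_train, y_train), model.score(X_test, y_test)]])


# Toy data, purely illustrative.
X = np.zeros((6, 2))
y = np.array([1, 1, 1, 0, 1, 0])

tests = [ModelTest(MajorityClassifier), ModelTest(MajorityClassifier, default=1)]
rows = [t.run(X[:4], y[:4], X[4:], y[4:]) for t in tests]

# Each test contributes one single-row array; hstack joins them side by side.
results = np.hstack(rows)
print(results.shape)
```

Because every run returns an array of shape `(1, 2)`, stacking horizontally keeps a single row and adds two columns per test.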
51 changes: 51 additions & 0 deletions model-selection.ipynb
@@ -0,0 +1,51 @@
{
"metadata": {
"orig_nbformat": 2,
"kernelspec": {
"name": "python394jvsc74a57bd0daf19b0280700678d1b30283c052a0b4f397f6934ace6c653c7a54482fc82c11",
"display_name": "Python 3.9.4 64-bit ('venv')"
}
},
"nbformat": 4,
"nbformat_minor": 2,
"cells": [
{
"source": [
"# Imports"
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import sklearn as skl"
]
},
{
"source": [
"# Data Cleaning"
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"## Selecting Relevant Fields"
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"## One-Hot Encoding"
],
"cell_type": "markdown",
"metadata": {}
}
]
}
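
The notebook's "One-Hot Encoding" section has a heading but no code in this commit. A minimal sketch of that step, together with the binary target transformation from the README, might look like the following. The DataFrame and its column names (`loan_status`, `home_ownership`, `income`) are invented for illustration and are not taken from the project's data.

```python
import pandas as pd

# Hypothetical sample; column names and values are illustrative only.
loans = pd.DataFrame(
    {
        "loan_status": ["Approved", "Denied", "Approved"],
        "home_ownership": ["RENT", "OWN", "MORTGAGE"],
        "income": [52000, 38000, 91000],
    }
)

# Binary target: 1 when the loan was approved, 0 otherwise.
target = (loans["loan_status"] == "Approved").astype(int)

# One-hot encode the string-valued column: each category becomes its own
# 0/1 column, and every row has exactly one 1 across those new columns.
features = pd.get_dummies(
    loans.drop(columns="loan_status"),
    columns=["home_ownership"],
    dtype=int,
)

print(features)
```

Numeric columns such as `income` pass through unchanged, since only the columns named in `columns` are expanded.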
29 changes: 29 additions & 0 deletions requirements.txt
@@ -0,0 +1,29 @@
appnope==0.1.2
backcall==0.2.0
decorator==5.0.7
ipykernel==5.5.3
ipython==7.22.0
ipython-genutils==0.2.0
jedi==0.18.0
joblib==1.0.1
jupyter-client==6.1.12
jupyter-core==4.7.1
numpy==1.20.2
pandas==1.2.4
parso==0.8.2
pexpect==4.8.0
pickleshare==0.7.5
prompt-toolkit==3.0.18
ptyprocess==0.7.0
Pygments==2.8.1
python-dateutil==2.8.1
pytz==2021.1
pyzmq==22.0.3
scikit-learn==0.24.1
scipy==1.6.2
six==1.15.0
sklearn==0.0
threadpoolctl==2.1.0
tornado==6.1
traitlets==5.0.5
wcwidth==0.2.5
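
After `pip install -r requirements.txt`, the installed versions can be compared against pins like those above. The sketch below uses the standard library's `importlib.metadata`; `parse_pins` and `find_mismatches` are hypothetical helper names, and the parser assumes only simple `name==version` lines.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed)} for differing packages; installed is None if missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


# Two pins copied from the file above; pass the full file contents in practice.
sample = "numpy==1.20.2\npandas==1.2.4\n"
print(find_mismatches(parse_pins(sample)))
```

An empty dictionary means every listed package is installed at its pinned version.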
