This repository contains the Machine Learning model used by TipTracker's Analytics feature.
- A script was added to generate datasets.
- Performance metrics for various Machine Learning algorithms have been compiled for comparisons.
- The Machine Learning algorithm was developed to make predictions for a given dataset.
- A Flask Application was created to access the Machine Learning algorithm to make predictions.
- MongoDB has been integrated to load the dataset for users.
- A REST API has been developed to allow requesting shift predictions for a given user.
- An access token has been added on the API routes to prevent unauthorized access to the resources.
- CI/CD for the application has been configured to automatically deploy changes to the source code.
- Validations have been added to the dataset before making predictions.
- API routes return meaningful error messages.
The following tools, software, and technologies are needed to run the application:

- Python

  For this project, we are using Python 3.8+. To check if you already have Python installed, run the following command:

  ```
  $ python --version
  ```

  If successful, it will display the currently installed Python version (e.g. `Python 3.8.1`). If you need to install Python, visit here.

- MongoDB

  We are using MongoDB to store data for our Machine Learning model. If you do not have a MongoDB database, MongoDB provides a free cloud database service here. It is hosted remotely, so no additional download is required.
The source code can be downloaded in any one of the following ways:
- Cloning the repository

  To clone the repository using HTTPS, run the following command from the desired directory:

  ```
  $ git clone https://github.com/JIA-0302/Analytics.git
  ```

  This will download all the source code from the repository.

- Download the ZIP file from here
- Setup virtual environment

  Run the following commands if the virtual environment has not been set up:

  ```
  $ pip install virtualenv
  $ virtualenv venv
  ```

  This will create a virtual environment directory `venv`. To activate the virtual environment, run the following command:

  ```
  $ . venv/Scripts/activate
  ```

  (On macOS and Linux, the activation script is at `venv/bin/activate` instead.)

  This allows us to manage and resolve dependencies easily.

- Install dependencies

  To install all the required dependencies, run the following command:

  ```
  $ pip install -r requirements.txt
  ```
- Setup environment variables

  - Create a new file `.env` in the root directory
  - Copy all the contents of `.env.example` into `.env`
  - Replace all the required fields (`xxxxxxx`) specific to your configurations.

  ```
  # Database Configurations
  MONGODB_URL=xxxxxxxxxx
  DATABASE_NAME=test
  DATASET_COLLECTION_NAME=future_trends

  # Token to be able to access protected routes
  ACCESS_TOKEN=xxxxxxxx

  # Parameters for Tip Prediction
  SHIFT_INTERVAL_MINUTES=30
  SHIFT_START_TIME=10
  SHIFT_END_TIME=23
  ```

  You can use any value for the access token. However, make sure any other application making a request to this service uses the same access token.
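For illustration, this is roughly how `KEY=value` pairs in a `.env` file are read (a hypothetical stdlib-only parser, not necessarily what the application uses — many Python projects rely on `python-dotenv` for this):

```python
def parse_env(text):
    """Parse .env-style KEY=value lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """\
# Parameters for Tip Prediction
SHIFT_INTERVAL_MINUTES=30
SHIFT_START_TIME = 10
SHIFT_END_TIME = 23
"""
config = parse_env(sample)
print(config["SHIFT_START_TIME"])
```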
No additional steps are required for installation. Simply run the following command to start the server:

```
$ python main.py
```

The terminal will display the URL where the application is running. By default, it runs on http://127.0.0.1:5000/. If other applications need to use this service, use the provided URL.
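For reference, a minimal sketch of what a Flask entry point such as `main.py` looks like (the `/health` route and its response here are hypothetical, added only for illustration):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")  # hypothetical route for illustration
def health():
    return jsonify({"status": "ok"})

if __name__ == "__main__":
    # app.run() binds to http://127.0.0.1:5000/ by default
    app.run()
```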
- ModuleNotFoundError: No module named '...'

  This usually means the script could not find the dependent libraries. Make sure the virtual environment is activated first. If the problem persists, re-install all the dependencies:

  ```
  $ pip install -r requirements.txt
  ```

- socket.error: [Errno 98] Address already in use

  By default, the Flask application starts on port 5000. Since only one process can listen on a given port, the new process cannot start while the port is occupied. There are two ways to fix this issue:

  - Kill the other process running on port 5000

  - Run the application on a different port

    To run the application on a different, unoccupied port, run the following command:

    ```
    $ flask run -h localhost -p <PORT_NUMBER>
    ```

    Note that the application will now be available on the specified port number.
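To find an unoccupied port for the second approach, one option (a small stdlib helper, not part of the project) is to ask the operating system for a free port:

```python
import socket

def free_port():
    """Bind to port 0 so the OS assigns an unused ephemeral port."""
    with socket.socket() as s:
        s.bind(("localhost", 0))
        return s.getsockname()[1]

print(free_port())  # pass this value to `flask run -p`
```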
Please see the installation guide to set up the project.
Use the `test-data-gen.py` script to generate a test dataset in CSV format. It accepts command line arguments to generate data as needed:

```
usage: test-data-gen.py [-h] [-s START_DATE] [-e END_DATE] [-f FILE_NAME]

Generate test dataset for a user

optional arguments:
  -h, --help            show this help message and exit
  -s START_DATE, --start-date START_DATE
                        First shift date in the dataset. Default: 1/1/2020
  -e END_DATE, --end-date END_DATE
                        Last shift date in the dataset. Default: 3/16/2021
  -f FILE_NAME, --file-name FILE_NAME
                        Name of the file to save the dataset. Default: shift_data.csv
```

An example usage would be:

```
$ python test-data-gen.py -s 1/1/2021 -e 3/1/2021 -f test_dataset.csv
```
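The usage text above corresponds to an `argparse` definition roughly like the following (a sketch reconstructed from the help output, not necessarily the script's exact code):

```python
import argparse

parser = argparse.ArgumentParser(description="Generate test dataset for a user")
parser.add_argument("-s", "--start-date", default="1/1/2020",
                    help="First shift date in the dataset")
parser.add_argument("-e", "--end-date", default="3/16/2021",
                    help="Last shift date in the dataset")
parser.add_argument("-f", "--file-name", default="shift_data.csv",
                    help="Name of the file to save the dataset")

# Mirrors the example invocation above; omitted flags fall back to their defaults
args = parser.parse_args(["-s", "1/1/2021", "-e", "3/1/2021", "-f", "test_dataset.csv"])
```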
We need to continuously improve our ML model by trying different features, aggregating different data, using different algorithms, and more.
We are using Google Colab as a playground. Use the TipTracker account to access it here.
These API endpoints are protected; only authorized apps are allowed access, based on the token issued. The access token must be added using Bearer authentication:

```
$ curl https://service-url/api/... \
  -H "Authorization: Bearer ${access_token}" \
  -H "Accept: application/json"
```
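The same authenticated call can be made from Python using only the standard library. This sketch builds the request object; the `/api/...` path and the base URL are placeholders — substitute your endpoint and deployment URL:

```python
import json
import urllib.request

def build_request(path, body, access_token, base_url="http://127.0.0.1:5000"):
    """Build a POST request carrying the Bearer token, ready for urlopen()."""
    return urllib.request.Request(
        base_url + path,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# req = build_request("/api/...", {"user_id": 18}, access_token)
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```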
The following API endpoints are available:
This endpoint is used to predict tips for the user for the given days. The body parameters to request the data are as follows:
| Parameter | Description | Required |
| --- | --- | --- |
| `user_id` | ID assigned to the user for whom we need to predict tips | Yes |
| `dates` | List of dates for which we need to predict tips. Each date should be formatted as `yyyy-MM-dd`. | Yes |
Example Request Body:
{
"user_id": 18,
"dates": ["2021-02-20", "2021-02-21", "2021-02-22", "2021-02-23"]
}
If the request is successful, it returns the predicted values in a dictionary with the following schema:
{
// Each day represents the date passed in the request
day: [
// For a given day, different intervals that are specified by start and end time are given with predicted tip values
{
cash_tips,
credit_card_tips,
end_time,
start_time
},
...
],
...
}
If the value for a given day is set to `null`, it means there isn't sufficient data to make predictions for that day.
Example Response:
{
"result": {
"2021-02-20": [
{
"cash_tips": 10.9,
"credit_card_tips": 17.16,
"end_time": "2021-04-01 10:30:00",
"start_time": "2021-04-01 10:00:00"
},
{
"cash_tips": 10.87,
"credit_card_tips": 17.29,
"end_time": "2021-04-01 11:00:00",
"start_time": "2021-04-01 10:30:00"
},
{
"cash_tips": 11.23,
"credit_card_tips": 20.53,
"end_time": "2021-04-01 11:30:00",
"start_time": "2021-04-01 11:00:00"
},
...
],
"2021-02-21": [
{
"cash_tips": 15.35,
"credit_card_tips": 14.4,
...
},
...
],
"2021-02-22": null,
...
}
If there are any errors, the response will contain a description of the error. An example error response is:
{
"error": "We do not have sufficient data to make accurate predictions. Please continue entering shift data."
}
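To illustrate consuming a response of this shape, here is a small hypothetical helper (not part of the project) that sums the predicted tips per day and treats `null` days as insufficient data:

```python
def daily_totals(result):
    """Sum predicted cash + credit card tips per day; None means insufficient data."""
    totals = {}
    for day, intervals in result.items():
        if intervals is None:
            totals[day] = None
        else:
            totals[day] = round(
                sum(i["cash_tips"] + i["credit_card_tips"] for i in intervals), 2
            )
    return totals

# Abbreviated response in the shape shown above
response = {
    "result": {
        "2021-02-20": [
            {"cash_tips": 10.9, "credit_card_tips": 17.16,
             "start_time": "2021-04-01 10:00:00", "end_time": "2021-04-01 10:30:00"},
            {"cash_tips": 10.87, "credit_card_tips": 17.29,
             "start_time": "2021-04-01 10:30:00", "end_time": "2021-04-01 11:00:00"},
        ],
        "2021-02-22": None,
    }
}
print(daily_totals(response["result"]))
```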