MLOps Zoomcamp Project

Overview

This project consists of an application that forecasts the mobility trends for retail & recreation in Portugal using Google's Community Mobility Reports. According to Google, this includes places like restaurants, cafes, shopping centers, theme parks, museums, libraries, and movie theaters.

This project aims to be a proof of concept for an application that helps restaurant owners better predict consumer demand.

Currently, the model only uses past mobility data. In the future, the objective is to integrate other data sources such as holidays and weather forecasts.
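To make "only uses past mobility data" concrete, here is a minimal sketch of one common way to turn a daily series into training rows with lagged values. The function name `make_lag_features` and the default window of 7 days are assumptions for illustration; the project's actual features and model may differ.

```python
def make_lag_features(series, n_lags=7):
    """Build (features, target) rows from a daily series.

    Each row uses the previous `n_lags` values as features and the
    current value as the target. Hypothetical helper, not project code.
    """
    rows = []
    for i in range(n_lags, len(series)):
        features = list(series[i - n_lags:i])  # the n_lags days before day i
        target = series[i]                     # the value to predict
        rows.append((features, target))
    return rows
```

With this layout, any regression model can be trained on the rows, and extra sources such as holidays or weather could later be appended as additional feature columns.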

The main focus of the project is on the operations side, which is what the course mainly covers. As can be seen in the diagram below, the project uses Prefect as an orchestration engine to run predictions every day for the following days; the predictions are stored in a bucket. The model is automatically retrained on a monthly basis to adjust for distribution shifts. Model training is integrated with MLflow for tracking and registry. All of these operations run on GCP.
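The daily-prediction and monthly-retraining cadence can be sketched in plain Python. This is only an illustration of the logic, not the Prefect schedule itself. It assumes retraining happens on the first day of each month and that each run forecasts the next 7 days; neither detail is stated above, and all function names are hypothetical.

```python
from datetime import date, timedelta


def should_retrain(today: date) -> bool:
    """Assumption: the monthly retraining runs on the first day of the month."""
    return today.day == 1


def forecast_dates(today: date, horizon_days: int = 7) -> list:
    """Dates covered by a daily run: the `horizon_days` days after `today`."""
    return [today + timedelta(days=offset) for offset in range(1, horizon_days + 1)]


def plan_run(today: date, horizon_days: int = 7) -> dict:
    """Summarize what a run on `today` would do."""
    return {
        "retrain": should_retrain(today),
        "predict_for": forecast_dates(today, horizon_days),
    }
```

In the real setup, the orchestrator's schedule decides when each flow runs; a helper like `plan_run` is just a convenient way to reason about or test that cadence.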

Guide

MLflow

MLflow is used to store the models as well as their corresponding metadata. This is especially important given the monthly retraining: the models trained over time need to be tracked in a standard location, with an easy way to roll back if necessary.
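As a rough sketch of what a rollback could look like with the MLflow Model Registry, the helper below promotes an earlier registered version back to the Production stage and archives whatever is currently there. The helper name `roll_back` and the model name in the usage note are assumptions, and the sketch assumes the registry uses stages, which may not match how this project is configured.

```python
def roll_back(client, model_name: str, version: str) -> None:
    """Move an earlier registered model version back to Production.

    `client` is expected to be an `mlflow.tracking.MlflowClient` pointed at
    the tracking server. Hypothetical helper, not part of the project code.
    """
    client.transition_model_version_stage(
        name=model_name,
        version=version,
        stage="Production",
        archive_existing_versions=True,  # archive the version being replaced
    )
```

Usage would look like `roll_back(MlflowClient(), "mobility-forecaster", "3")`, where the model name and version are placeholders. Recent MLflow releases favor model version aliases over stages, so an alias-based variant may be a better fit on newer servers.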

I set it up on Cloud Run, mostly following this blog post. My implementation is in the mlflow submodule.

Prefect

Prefect is the brain of our app: it's where the monthly retraining and the daily inference occur.

Let's start by creating a VM on GCP and setting up the environment:

conda create -n project python=3.9
conda activate project
pip install -r requirements.txt

Afterwards, we need to spin up the Prefect server with:

prefect orion start

Then we need to build the deployment YAMLs:

prefect deployment build train.py:train_flow --name model_train
prefect deployment build predict.py:train_flow --name model_predict

Then we can deploy the flows:

prefect deployment apply model_train-deployment.yaml 
prefect deployment apply model_predict-deployment.yaml

Finally, we start an agent that will run the deployed flows:

prefect agent start -q 'forecasting'

Future Improvements

Add model monitoring (e.g., EvidentlyAI or WhyLabs)
Add infrastructure as code (IaC) with Terraform
Add a UI to visualize data and predictions (e.g., Streamlit)
Improve best practices with tests and CI/CD
