A project used to compare open source vs paid technologies when using weather data to predict train disruptions.
The comparison is currently between Databricks and Kubernetes.
(Bonus: Azure Machine Learning Studio.)
For the Databricks approach, see the azure_databricks_version folder.
- Clone the repository
git clone [email protected]:cinqict/paid-vs-opensource.git
- Install dependencies
pip install -r requirements.txt
- Run the app
streamlit run app.py
- Clone the repository
git clone [email protected]:cinqict/paid-vs-opensource.git
- Build the docker image
docker compose --env-file=./.streamlit/secrets.toml build
- Run the docker image
docker run -p 8080:8080 weather-dash-i
- mysql, upload sql data (sql_upload.py), middleware, dashboard
- Start minikube
minikube start
- Create the persistent volume
kubectl apply -f kubectl_deploy/mysql-pv.yaml
- Create the persistent volume claim
kubectl apply -f kubectl_deploy/mysql-pvc.yaml
- Create the deployment
kubectl apply -f kubectl_deploy/mysql-deployment.yaml
- Create the service
kubectl apply -f kubectl_deploy/mysql-service.yaml
- Check the deployment is running
kubectl describe deployment mysql
- Check the pod is running
kubectl get pods -l app=mysql
- Check the persistent volume claim is running
kubectl describe pvc mysql-pv-claim
- Optional: connect to the mysql pod
kubectl run -it --rm --image=mysql:8.0 --restart=Never mysql-client -- mysql -h mysql -ppassword
  - Check the status and what databases are running
  STATUS; SHOW DATABASES;
  - Check the available tables
  SELECT table_name FROM information_schema.tables;
  - Exit with ctrl + d
- Expose the pod as a service
kubectl expose pod mysql-**********-****** --type=LoadBalancer --port=3306
- Run the tunnel in a separate terminal (optional; necessary if you are using minikube)
minikube tunnel
- Check the service is running
kubectl get services
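Once kubectl get services lists the service, it can be useful to confirm that port 3306 actually accepts connections from your machine. The following is a minimal sketch using only the Python standard library. The helper name port_is_open is hypothetical, and the assumption that the service answers on 127.0.0.1 depends on minikube tunnel being active; check the EXTERNAL-IP column if it does not.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake and nothing more;
        # it does not log in to MySQL or send any query.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

For example, port_is_open("127.0.0.1", 3306) returning False usually means the tunnel is not running or the service was not exposed.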
- Start minikube
minikube start
- Change to the model API directory
cd model_api
- Set the docker environment to minikube
eval $(minikube docker-env)
- Build the docker image
docker compose build
- Create the pod
kubectl run model-api --image=xgboost-api-i --image-pull-policy=Never
- Check the pod is running
kubectl get pods
- Expose the pod as a service
kubectl expose pod model-api --type=LoadBalancer --port=8000
- Check the service is running
kubectl get services
- Because minikube tunnel is running from before, you can now access the API at http://localhost:8000/docs
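To check from a script rather than a browser that the API docs page responds, a small standard-library probe is enough. This is a sketch under assumptions: the helper name http_status is hypothetical, and it deliberately ignores any proxy settings because the target is a local service.

```python
import urllib.error
import urllib.request
from typing import Optional


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    # An empty ProxyHandler bypasses proxies configured in the environment,
    # which would otherwise intercept requests to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except OSError:
        # Nothing is listening, or the connection timed out
        # (URLError is a subclass of OSError).
        return None
```

Calling http_status("http://localhost:8000/docs") should return 200 when the tunnel and the model-api service are both up; None suggests the service is not reachable at all.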
- Install minikube
brew install minikube
- Start minikube
minikube start
- (Optional) Open the minikube dashboard; needs to be run in a separate terminal
minikube dashboard
- Create a pod from the above docker image
eval $(minikube docker-env)  # set the docker environment to minikube
docker compose --env-file=./.streamlit/secrets.toml build
kubectl run weather-dash --image=weather-dash-i --image-pull-policy=Never
kubectl get pods  # check the pod is running
- Expose the pod as a service
kubectl expose pod weather-dash --type=LoadBalancer --port=8080
kubectl get services  # check the service is running
- Because minikube tunnel is running from before, you can now access the dashboard at http://localhost:8080
- If the dashboard cannot call the model, the IP address or port of the middleware service has likely changed. To fix this:
  - Go to the function utils.get_disruption_prediction() and update the offending URIs
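One way to avoid editing URIs by hand each time the middleware address changes is to read the base URL from an environment variable. The sketch below is not the project's actual implementation of utils.get_disruption_prediction(); the variable name MODEL_API_URL, the helper names, and the /predict path are all hypothetical placeholders.

```python
import os
from urllib.parse import urljoin

# Default matches the port used when exposing the model-api pod.
DEFAULT_MODEL_API_URL = "http://localhost:8000"


def model_api_base_url(environ=os.environ) -> str:
    """Return the middleware base URL, preferring MODEL_API_URL if set."""
    # MODEL_API_URL is a hypothetical name, not one the project defines.
    return environ.get("MODEL_API_URL", DEFAULT_MODEL_API_URL).rstrip("/")


def prediction_endpoint(path: str = "/predict", environ=os.environ) -> str:
    """Join the base URL with an endpoint path ("/predict" is a placeholder)."""
    return urljoin(model_api_base_url(environ) + "/", path.lstrip("/"))
```

With this in place, the address can be changed without touching the code, for instance by setting MODEL_API_URL before starting the dashboard.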
- Delete all pods
kubectl delete pods --all
- Delete all services
kubectl delete services --all
- Delete all deployments
kubectl delete deployments --all
- Stop minikube
minikube stop
- Delete minikube
minikube delete --all

