This repository contains code and data files for projects completed throughout the Galvanize Immersive Data Science program.

kristiewirth/galvanize-portfolio

Galvanize Portfolio

Project Goals

"You are a contract data scientist/consultant hired by a new e-commerce site to try to weed out fraudsters. The company unfortunately does not have much data science expertise, so you must properly scope and present your solution to the manager before you embark on your analysis. Also, you will need to build a sustainable software project that you can hand off to the company's engineers by deploying your model in the cloud."

Technologies Used

  • Python
  • Neural network (MLP)
  • Flask
  • HTML
  • Pandas
  • NumPy
  • Seaborn

Collaborators

Project Goals

"You and a team of talented data scientists are working for the company, Items-Legit, who use several production recommenders that provide a significant revenue stream. The issue is that these systems have been around a long time and your head of data science has asked you and your team members to explore new solutions. The main goal here is to improve the RMSE; however, another equally important goal is to present your model and the details of your methods in a clear but technically sound manner. We would also like you to include some discussion about how you would move from prototype to production."
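
For readers unfamiliar with the metric named in the brief, RMSE (root-mean-square error) is the square root of the average squared difference between predicted and actual ratings. The snippet below is a minimal sketch of that definition using only the standard library. It is not the project's own code; the function name `rmse` and the sample ratings are hypothetical and chosen purely for illustration.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences of ratings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one rating is required")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical ratings (not taken from the project's data).
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 2.5]
score = rmse(actual_ratings, predicted_ratings)
```

A lower score means the predicted ratings sit closer to the observed ones, which is why the brief frames improvement as reducing RMSE.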

Technologies Used

  • Python
  • Spark

Collaborators

Project Goals

"A ride-sharing company (Company X) is interested in predicting rider retention. To help explore this question, we have provided a sample dataset of a cohort of users who signed up for an account in January 2014. The data was pulled on July 1, 2014; we consider a user retained if they were 'active' (i.e., took a trip) in the preceding 30 days (from the day the data was pulled). In other words, a user is 'active' if they have taken a trip since June 1, 2014. We would like you to use this data set to help understand what factors are the best predictors for retention, and offer suggestions to operationalize those insights to help Company X. Therefore, your task is not only to build a model that minimizes error, but also a model that allows you to interpret the factors that contributed to your predictions."
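
The retention rule in the brief can be expressed as a simple date comparison. The following is a minimal sketch under stated assumptions: the pull date and the 30-day window come from the brief, while the function name `is_retained`, the idea of a per-user "last trip date" value, and the choice to treat June 1 as inclusive are assumptions made for illustration, not details confirmed by the project.

```python
from datetime import date, timedelta

PULL_DATE = date(2014, 7, 1)        # date the data was pulled (from the brief)
ACTIVE_WINDOW = timedelta(days=30)  # "preceding 30 days" (from the brief)
CUTOFF = PULL_DATE - ACTIVE_WINDOW  # works out to June 1, 2014


def is_retained(last_trip_date):
    """Return True if the user's last trip falls on or after the cutoff.

    `last_trip_date` is a hypothetical per-user value; None means the
    user has no recorded trip. Treating the cutoff as inclusive is an
    assumption based on the phrase "since June 1, 2014".
    """
    if last_trip_date is None:
        return False
    return CUTOFF <= last_trip_date <= PULL_DATE


# Hypothetical users, for illustration only.
labels = {
    "user_a": is_retained(date(2014, 6, 15)),
    "user_b": is_retained(date(2014, 3, 2)),
    "user_c": is_retained(None),
}
```

A boolean label like this could serve as the target variable for the classifiers listed under Technologies Used.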

Technologies Used

  • Python
  • Regression analysis
  • Decision trees
  • K-nearest neighbors
  • Pandas
  • Seaborn

Collaborators
