lyme-disease-classifier

Lyme Disease is the (second) fastest-growing contagious disease in the world. For this project, I built a climate-based classification model with a ROC AUC of 0.96 to predict which US counties will have a high incidence of Lyme Disease. This was done by:

  • Engineering a dataset from scratch by merging Centers for Disease Control and Prevention (CDC) data with National Oceanic and Atmospheric Administration (NOAA) climate data parsed from 78,000 CSV files.
  • Using K-Nearest Neighbors, Logistic Regression, Support Vector Machines, and Random Forest algorithms optimized with grid search.
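
The two steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes pandas and scikit-learn, uses synthetic data, and the column names (`fips`, `mean_temp`, `precip`, `high_incidence`) and parameter grids are hypothetical placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins for the two sources, keyed by a hypothetical county code.
fips = [f"{i:05d}" for i in range(n)]
climate = pd.DataFrame({
    "fips": fips,
    "mean_temp": rng.normal(12, 5, n),
    "precip": rng.normal(90, 20, n),
})
# Synthetic label loosely tied to climate so the models have signal to find.
signal = (0.3 * (climate["mean_temp"] - 12)
          + 0.05 * (climate["precip"] - 90)
          + rng.normal(0, 1, n))
cases = pd.DataFrame({"fips": fips, "high_incidence": (signal > 0).astype(int)})

# Step 1: merge incidence labels with climate features, one row per county.
data = cases.merge(climate, on="fips", how="inner", validate="one_to_one")
X = data[["mean_temp", "precip"]]
y = data["high_incidence"]

# Step 2: grid-search each of the four model families, scored by ROC AUC.
# The grids are illustrative, not the ones used in the project.
candidates = {
    "knn": (KNeighborsClassifier(), {"clf__n_neighbors": [3, 7, 15]}),
    "logreg": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1, 10]}),
    "svm": (SVC(), {"clf__C": [0.1, 1, 10]}),
    "rf": (RandomForestClassifier(random_state=0),
           {"clf__n_estimators": [50, 100]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # Scaling lives inside the pipeline so it is refit on each CV fold.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", model)])
    search = GridSearchCV(pipe, grid, scoring="roc_auc", cv=5)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

for name, (auc, params) in results.items():
    print(f"{name}: cross-validated ROC AUC = {auc:.3f}, best params = {params}")
```

Keeping the scaler inside the pipeline means each cross-validation fold is scaled using only its own training split, which avoids leaking validation data into the KNN, logistic regression, and SVM fits.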

Note: This project was completed in 2019, before the emergence of COVID-19. At the time, Lyme Disease was the fastest-growing contagious disease. The wording above has been amended to account for this.