COVID-19-Confirmed-Death-and-Recovered-Case-Predictions-for-US

COVID-19 confirmed, death, and recovered case predictions for the US (completed as part of the assignments for the Data and Knowledge Management course at the University of Waterloo).

Steps Implemented -

Technologies: Python, Keras, Scikit-learn, Pandas, Numpy
Data preprocessing steps:
- Checking for missing values in columns.
- Checking for duplicate records and dropping any that are found.
- Removing features that are highly dependent on each other. The COVID dataset already has [State ID], so [State, Long, Lat] are redundant and are dropped.
- Casting the [Resident Population 2020 Census] and [Population Density 2020 Census] columns to the float data type.
- Adding the relative difference of specific quantitative attributes with respect to the state.
- Checking for outliers, i.e. data points that differ significantly from other observations. Plotting histograms shows the distribution of each variable and reveals values that fall outside it.
- Computing Z-scores (standardization). A Z-score shows whether a value is greater or smaller than the mean and how many standard deviations away from the mean it lies. A data point whose absolute Z-score is above 3 is quite different from the other data points and may be an outlier.
- Removing outliers: rows whose [Incident_Rate] Z-score is greater than 2.5 or less than -2.5 are removed, as are rows whose [Case_Fatality_Ratio] Z-score is greater than 3 or less than -3.
- Applying PCA to the COVID features and creating a hybrid dataset that combines the original features with the first five principal components.
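The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's actual code: the column names are taken from this README and may differ from the real CSV headers, the census columns are assumed to be stored as strings with thousands separators, the features are standardized before PCA (an assumption), and the list `pca_features` is hypothetical. The "relative difference" step is left out because the README does not specify how it is computed.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

CENSUS_COLS = ["Resident Population 2020 Census", "Population Density 2020 Census"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicates and redundant columns, then cast census columns to float."""
    df = df.drop_duplicates()
    # [State ID] already identifies the state, so these columns are redundant.
    df = df.drop(columns=["State", "Long", "Lat"], errors="ignore")
    for col in CENSUS_COLS:
        # Assumption: values may be strings such as "1,234,567".
        df[col] = df[col].astype(str).str.replace(",", "", regex=False).astype(float)
    return df


def zscore(s: pd.Series) -> pd.Series:
    """Number of (population) standard deviations each value lies from the mean."""
    return (s - s.mean()) / s.std(ddof=0)


def remove_outliers(df: pd.DataFrame, limits: dict) -> pd.DataFrame:
    """Keep rows whose absolute Z-score is within the limit for every listed column."""
    keep = pd.Series(True, index=df.index)
    for col, limit in limits.items():
        keep &= zscore(df[col]).abs() <= limit
    return df[keep]


def add_pca_components(df: pd.DataFrame, feature_cols: list, n_components: int = 5) -> pd.DataFrame:
    """Append the first n_components principal components to the original columns."""
    scaled = StandardScaler().fit_transform(df[feature_cols])  # scaling is an assumption
    comps = PCA(n_components=n_components).fit_transform(scaled)
    names = [f"PC{i + 1}" for i in range(n_components)]
    return pd.concat([df, pd.DataFrame(comps, index=df.index, columns=names)], axis=1)


# Synthetic stand-in for dkmacovid_train.csv (all values are made up).
rng = np.random.default_rng(0)
n = 200
raw = pd.DataFrame({
    "State ID": rng.integers(1, 51, n),
    "State": "X",
    "Long": rng.normal(size=n),
    "Lat": rng.normal(size=n),
    "Resident Population 2020 Census": [f"{int(v):,}" for v in rng.integers(500_000, 40_000_000, n)],
    "Population Density 2020 Census": rng.uniform(1, 1200, n).round(1).astype(str),
    "Incident_Rate": rng.normal(8000, 500, n),
    "Case_Fatality_Ratio": rng.normal(1.8, 0.3, n),
    "Confirmed": rng.normal(5e5, 1e5, n),
    "Deaths": rng.normal(9e3, 2e3, n),
})
raw.loc[0, "Incident_Rate"] = 50_000                      # planted outlier
raw = pd.concat([raw, raw.iloc[[1]]], ignore_index=True)  # planted duplicate

missing_per_column = raw.isna().sum()  # missing-value check
clean = remove_outliers(preprocess(raw), {"Incident_Rate": 2.5, "Case_Fatality_Ratio": 3.0})

# Hypothetical choice of features to feed into PCA.
pca_features = ["Incident_Rate", "Case_Fatality_Ratio", "Confirmed", "Deaths"] + CENSUS_COLS
hybrid = add_pca_components(clean, pca_features)
print(hybrid.shape)
```

Here both Z-scores are computed on the same de-duplicated frame before any rows are removed; filtering one column first and recomputing the second Z-score afterwards would be an equally plausible reading of the steps.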
Applied hyperparameter selection.
The modeling is split into two parts:

Part 1:
Applied machine learning algorithms (Decision Tree, Naive Bayes, Random Forest, XGBoost, and Gradient Boosting) and compared their performance.
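One way such a comparison could look is sketched below. It is only an illustration: the data is synthetic, the task is assumed to be classification (the README does not state the target or the metric), 5-fold cross-validated accuracy is an arbitrary choice, and XGBoost is left out so that the example depends on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features and labels (assumption).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5, random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

# Print the models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Evaluating every model on the same folds and metric keeps the comparison fair; for a regression target, the classifiers and the scoring argument would need to be swapped for regression counterparts.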
Part 2:
Applied deep learning techniques (a deep neural network model and a custom LSTM).

Repository files:
- DeepLearningModelsImplementation.ipynb
- MachineLearningModelsImplementation.ipynb
- README.md
- dkma_output_submit.csv
- dkmacovid_kaggletest_features.csv
- dkmacovid_train.csv

snigdhakakkar/COVID-19-Confirmed-Death-and-Recovered-Case-Predictions-for-US