The COVID-19 pandemic affects hundreds of thousands of people and causes thousands of deaths each day. Predicting the number of new cases (the same approach also works for new deaths) can be a useful step toward estimating the costs and facilities that will be required in the future. The data come from https://www.data.gouv.fr and are supplied by Santé Publique France (Public Health France). This project evaluates and compares linear regression, several non-linear regression techniques, and neural network architectures for predicting the COVID-19 new-case rate: linear regression, support-vector regression (SVR), Random Forest Regressor, LSTM, and RNN. Prediction performance is measured by mean squared error (MSE).
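For reference, MSE can be written as follows, where $y_i$ is the observed value, $\hat{y}_i$ the predicted value, and $n$ the number of evaluated time steps (the symbols are chosen here for illustration and are not taken from the project code):

$$
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
$$

Lower values indicate predictions that are closer to the observed series.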
Time series forecasting problems should be re-framed as supervised learning problems before these models are applied. Concepts worth knowing first:
- Time series as Supervised Learning
- Univariate / Multivariate
- Single-step / Multi-step
- Stationary / Non-Stationary
- Cross Validation in Time Series
- ACF Plot
- Lag Observation
- Feature Selection
- Handling Missing Values
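The first concept in the list, re-framing a series as a supervised learning problem with lag observations, can be illustrated with a minimal sketch. The function name `series_to_supervised`, its parameters, and the sample values are assumptions made for illustration; the project notebooks may do this differently.

```python
import numpy as np

def series_to_supervised(series, n_lags=3, n_ahead=1):
    """Turn a 1-D series into (X, y) pairs using a sliding window.

    X[i] holds n_lags consecutive observations (the lag features) and
    y[i] holds the n_ahead values that immediately follow them.
    """
    values = np.asarray(series, dtype=float)
    n_samples = len(values) - n_lags - n_ahead + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window")
    X = np.array([values[i:i + n_lags] for i in range(n_samples)])
    y = np.array([values[i + n_lags:i + n_lags + n_ahead] for i in range(n_samples)])
    return X, y

# Made-up daily counts, for illustration only (not real data).
new_cases = [10, 12, 15, 20, 26, 30, 41]
X, y = series_to_supervised(new_cases, n_lags=3, n_ahead=1)

# Keep time order when splitting: the test set is the most recent samples.
split = int(len(X) * 0.75)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```

Setting `n_ahead=1` gives a single-step problem; a larger value gives a multi-step target. Splitting without shuffling keeps future observations out of the training set.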
Datacleaning
- Import raw dataset / Missing values / Correlation / Feature selection / Fill missing values with KNN
best-parameters
- Find best parameters for ML models with RandomizedSearchCV()
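A hedged sketch of the parameter search with scikit-learn's `RandomizedSearchCV` follows. The synthetic data, the parameter grid, and the use of `TimeSeriesSplit` for cross validation are assumptions for illustration, not the settings used in the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
# Synthetic stand-in for lag features and next-step targets.
X = rng.normal(size=(120, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.1, size=120)

# Hypothetical search space; adjust to the model being tuned.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 2, 4],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,                          # number of sampled parameter settings
    scoring="neg_mean_squared_error",  # scikit-learn maximizes, so MSE is negated
    cv=TimeSeriesSplit(n_splits=3),    # each fold validates on later samples
    random_state=0,
)
search.fit(X, y)

best_params = search.best_params_
best_mse = -search.best_score_         # flip the sign back to a plain MSE
```

`TimeSeriesSplit` is used here instead of the default K-fold so that each validation fold comes after its training fold in time.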
Forecasting_DL
- Import dataset / Check for stationarity / Split train-test / Convert to supervised / Scale data / Define models and fit / MSE / Predict next step / Result comparison
Forecasting_ML
- Import dataset / Check for stationarity / Split train-test / Convert to supervised / Scale data / Define models and fit / MSE / Predict next step
learning_curve
- plot of train and test scores for each model
- Learning Curve
- Long Short-Term Memory Layer
- Recurrent Neural Networks
- Linear Regression
- Support Vector Regression
- Random Forest Regressor
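The train and test scores per model described under learning_curve can be computed along the following lines for the three scikit-learn models. This is a sketch under stated assumptions: the data are synthetic, the hyperparameters are defaults or arbitrary, and the scaling pipeline around SVR is an illustrative choice rather than the project's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit, learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Synthetic stand-in for lag features and next-step targets.
X = rng.normal(size=(150, 3))
y = X @ np.array([0.5, 0.3, 0.2]) + rng.normal(scale=0.1, size=150)

models = {
    "linear_regression": LinearRegression(),
    "svr": make_pipeline(StandardScaler(), SVR()),  # SVR is sensitive to feature scale
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

curves = {}
for name, model in models.items():
    sizes, train_scores, test_scores = learning_curve(
        model, X, y,
        train_sizes=np.linspace(0.2, 1.0, 5),
        cv=TimeSeriesSplit(n_splits=3),
        scoring="neg_mean_squared_error",
    )
    # Average over folds and flip the sign to get plain MSE per training size.
    curves[name] = {
        "sizes": sizes,
        "train_mse": -train_scores.mean(axis=1),
        "test_mse": -test_scores.mean(axis=1),
    }
```

Each entry in `curves` can then be plotted (for example with matplotlib) as training size against train and test MSE. A test curve that stays well above the train curve suggests overfitting.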