The purpose of this project is to analyze the spread of Covid-19 in countries around the world and to group them into clusters based on the similarity of their daily case patterns.
Make sure you have installed Python 3.8 and all the libraries used in this project.
$ python3.8 main.py
You will be asked to choose a method for creating the distance matrix. After that, you will be asked to choose the clustering algorithm you want to apply. At the end, the output will be:
- List of all the countries with their labels
- Silhouette score of the result
- Graph of the result
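The distance matrix and the silhouette score mentioned above can be illustrated with a minimal sketch. It is not the project's own code: the Euclidean distance, the function names, the sample series, and the hand-assigned labels (standing in for the output of a clustering algorithm) are all assumptions made for the example.

```python
import math

def euclidean_distance_matrix(series):
    """Pairwise Euclidean distances between equal-length series (assumed metric)."""
    n = len(series)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(series[i], series[j])))
            dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist

def mean_silhouette(dist, labels):
    """Mean silhouette coefficient computed from a precomputed distance matrix."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Made-up daily case series for four hypothetical regions.
sample_series = [
    [0, 1, 2, 3],
    [0, 1, 2, 4],
    [9, 8, 7, 6],
    [9, 8, 7, 5],
]
# Hand-assigned labels standing in for a clustering algorithm's output.
sample_labels = [0, 0, 1, 1]

dist = euclidean_distance_matrix(sample_series)
score = mean_silhouette(dist, sample_labels)
print(f"Silhouette score: {score:.3f}")
```

A score close to 1 means each series is much closer to its own cluster than to the nearest other cluster. If scikit-learn is installed, its `silhouette_score` function with `metric="precomputed"` computes the same quantity from a distance matrix.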
CSSE COVID-19 Time Series - This project uses the time series table of global confirmed cases from the JHU CSSE COVID-19 Dataset.
Pandemic-Insights works with any version of this dataset. To use a different version, simply replace the current file with it.
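As an illustration of how such a table can be turned into daily case series, here is a minimal sketch. It assumes the usual layout of the JHU CSSE global confirmed-cases table (four metadata columns, then one cumulative-count column per date). The sample rows, the region names, and the `load_daily_cases` helper are made up for the example and are not taken from this project.

```python
import csv
import io

# Tiny made-up sample in the assumed layout of the JHU CSSE table.
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20
,Examplestan,0.0,0.0,0,2,5,9
North,Sampleland,1.0,1.0,1,1,4,4
South,Sampleland,2.0,2.0,0,3,3,6
"""

FIRST_DATE_COL = 4  # columns 0-3: Province/State, Country/Region, Lat, Long

def load_daily_cases(handle):
    """Return {region label: list of daily new cases} from a JHU-style CSV."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    series = {}
    for row in reader:
        province, country = row[0], row[1]
        # Rows with a province get a combined label so they stay distinct.
        label = f"{country} ({province})" if province else country
        cumulative = [int(float(value)) for value in row[FIRST_DATE_COL:]]
        # Daily cases are the differences between consecutive cumulative counts.
        series[label] = [b - a for a, b in zip(cumulative, cumulative[1:])]
    return series

daily_cases = load_daily_cases(io.StringIO(SAMPLE))
for label, values in daily_cases.items():
    print(label, values)
```

To read a real file instead of the sample, pass an open file object, for example `open(path, newline="")`, where `path` points to whichever version of the dataset you downloaded.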
One of the best results obtained is shown below, with graphs of 5 representatives of each cluster. Most of the clusters consist mainly of countries that are geographically close to each other. Cluster 1 consists mostly of Western European countries, cluster 2 of mostly South American countries, cluster 3 of mostly African countries, cluster 4 of mostly Eastern European countries, cluster 5 of regions of China, and cluster 6 of mostly states of Australia.