Clustering analysis of countries based on the COVID-19 cases

morarez/Pandemic-Insights

Pandemic-Insights

This project analyzes the spread of COVID-19 in countries around the world and groups them into clusters based on the similarity of their daily case patterns.

Usage

Prerequisites

Make sure you have installed Python 3.8 and all the libraries used in this project.

Run

$ python3.8 main.py

You will first be asked to choose a method for creating the distance matrix, and then to choose the clustering algorithm to apply. When the run finishes, the output is:

  • List of all the countries with their labels
  • Silhouette score of the result
  • Graph of the result
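The three steps above (distance matrix, clustering, silhouette score) can be sketched in plain Python. This is a minimal illustration, not the code in main.py: the README does not say which distance methods or clustering algorithms the program offers, so Euclidean distance on peak-normalized series and average-linkage agglomerative clustering are assumptions, and every function name and the toy series are hypothetical.

```python
import math

def normalize(series):
    """Scale a series by its peak so that regions of different size are comparable."""
    peak = max(series)
    return [v / peak if peak else 0.0 for v in series]

def distance_matrix(series_list):
    """Pairwise Euclidean distances between equal-length series (assumed metric)."""
    n = len(series_list)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(series_list[i], series_list[j])
    return dist

def agglomerative(dist, n_clusters):
    """Average-linkage agglomerative clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]

    def avg_link(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters; y > x, so popping y keeps x valid.
        x, y = min(
            ((p, q) for p in range(len(clusters)) for q in range(p + 1, len(clusters))),
            key=lambda pair: avg_link(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters.pop(y)
        clusters[x].extend(merged)

    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def silhouette_score(dist, labels):
    """Mean silhouette coefficient; expects at least two clusters."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    total = 0.0
    for i, label in enumerate(labels):
        own = groups[label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in groups.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

# Toy daily-case series (made up): two early-peaking and two late-peaking regions.
toy_series = [
    [0, 5, 10, 5, 0, 0],
    [0, 4, 8, 4, 0, 0],
    [0, 0, 0, 3, 6, 3],
    [0, 0, 1, 4, 8, 4],
]
dist = distance_matrix([normalize(s) for s in toy_series])
labels = agglomerative(dist, n_clusters=2)
score = silhouette_score(dist, labels)
print(labels, round(score, 3))
```

A silhouette score close to 1 means each series sits much closer to its own cluster than to the nearest other cluster, which is why it is a useful single number for comparing different method and algorithm choices.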

Dataset

CSSE COVID-19 Time Series - This project uses the time series table of global confirmed cases from the JHU CSSE COVID-19 Dataset.

Note

Pandemic-Insights works with any version of this dataset. To use a different version, simply replace the current file with it.
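Because the JHU CSSE table stores cumulative confirmed counts, daily cases have to be derived by differencing. The sketch below shows one way to do that with the standard library. It assumes the usual layout of the global confirmed-cases table (Province/State, Country/Region, Lat, Long, then one column per date). The function name load_daily_cases, the clamping of negative corrections to zero, and the sample rows are illustrative assumptions, not taken from this project.

```python
import csv
import io

def load_daily_cases(csv_text):
    """Parse a JHU CSSE style time series table into {region name: daily new cases}."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    n_meta = 4    # Province/State, Country/Region, Lat, Long
    daily = {}
    for row in reader:
        if len(row) <= n_meta:
            continue  # skip blank or malformed rows
        province, country = row[0], row[1]
        name = f"{country} ({province})" if province else country
        cumulative = [int(float(v)) for v in row[n_meta:]]
        # The first day keeps its cumulative value; later days are differences.
        # Negative differences (data corrections) are clamped to 0 by assumption.
        daily[name] = [cumulative[0]] + [
            max(cur - prev, 0) for prev, cur in zip(cumulative, cumulative[1:])
        ]
    return daily

# Illustrative sample in the assumed layout (values are for demonstration only).
sample = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20
,Italy,41.87,12.56,0,2,5,9
Hubei,China,30.97,112.27,444,444,549,761
"""
daily = load_daily_cases(sample)
print(daily)
```

Since the parser relies only on the four leading metadata columns, a newer version of the table with more date columns should load without changes.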

Results

One of the best results obtained is shown below, with graphs of 5 representatives of each cluster. Most clusters consist mainly of countries that are geographically close to each other:

  • Cluster 1: mostly Western European countries
  • Cluster 2: mostly South American countries
  • Cluster 3: mostly African countries
  • Cluster 4: mostly Eastern European countries
  • Cluster 5: regions of China
  • Cluster 6: mostly states of Australia

[Figures 1 to 7: result graphs]
