This project uses unsupervised machine learning to analyze a database of cryptocurrencies and produce a report that groups the traded cryptocurrencies according to their features. An investment bank could use this classification report to propose a new cryptocurrency investment portfolio to its clients. We use the following methods for the analysis:
- Preprocessing the database.
- Reducing the data dimensions using Principal Component Analysis (PCA).
- Clustering cryptocurrencies using K-Means.
- Visualizing classification results with 2D and 3D scatter plots.
After the preprocessing and cleaning phase, a total of 532 tradable cryptocurrencies remained.
After preprocessing, PCA was used to reduce the dimensionality of the data; in this case the number of components was 90.
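The reduction step can be illustrated with a minimal, dependency-free sketch of PCA. This is not the project's own code: in practice a library implementation would normally be used, and every name here (`center`, `covariance`, `leading_eigenpair`, `pca`, `n_components`, `toy_rows`) is hypothetical. The data is synthetic with only three features, so the example keeps 2 components rather than the 90 reported above. It assumes the standard recipe: center the data, build the covariance matrix, extract the leading eigenvectors (here by power iteration with deflation), and project onto them.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (features x features) of centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, vector):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def leading_eigenpair(matrix, iterations=200):
    """Power iteration: eigenvalue and unit eigenvector of largest magnitude."""
    v = [float(i + 1) for i in range(len(matrix))]  # arbitrary non-zero start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def pca(rows, n_components):
    """Return (projected rows, explained-variance ratio per component)."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_variance = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for _ in range(n_components):
        value, vector = leading_eigenpair(cov)
        components.append(vector)
        ratios.append(value / total_variance)
        # Deflation: remove the direction just found before the next pass.
        cov = [[cov[i][j] - value * vector[i] * vector[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
                 for row in centered]
    return projected, ratios


# Synthetic toy data (assumption): two nearly redundant features plus one
# independent feature. This is NOT the cryptocurrency dataset.
rng = random.Random(0)
toy_rows = []
for _ in range(200):
    a = rng.gauss(0, 5)
    toy_rows.append([a, a + rng.gauss(0, 0.1), rng.gauss(0, 1)])

reduced, explained = pca(toy_rows, n_components=2)
print(len(reduced), len(reduced[0]))
print([round(r, 3) for r in explained])
```

The explained-variance ratios show how much of the original variation each retained component carries, which is one way to judge whether a chosen number of components is reasonable.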
The K-Means algorithm was used to group the cryptocurrencies into K clusters. To help choose K, an elbow curve was produced by running K-Means for K values from 1 to 10.
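As an illustration of how an elbow curve over K = 1 to 10 can be computed, here is a minimal sketch using a plain implementation of Lloyd's K-Means algorithm with only the standard library. It is not the project's own code: the function names (`squared_distance`, `nearest`, `kmeans_inertia`) and the synthetic `toy_points` are assumptions made for the example, and a library implementation would normally be used instead. The quantity plotted on an elbow curve is the inertia, the sum of squared distances from each point to its nearest centroid.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans_inertia(points, k, seed=0, max_iter=100):
    """Run Lloyd's K-Means once and return the final inertia."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return sum(squared_distance(p, centroids[nearest(p, centroids)])
               for p in points)


# Synthetic toy data (assumption): four 2D blobs of 25 points each.
# This is NOT the cryptocurrency dataset.
rng = random.Random(42)
blob_centers = [(0, 0), (10, 0), (0, 10), (10, 10)]
toy_points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
              for cx, cy in blob_centers for _ in range(25)]

# Elbow curve: inertia for each K from 1 to 10.
elbow = {k: kmeans_inertia(toy_points, k) for k in range(1, 11)}
for k, inertia in elbow.items():
    print(k, round(inertia, 1))
```

Plotting inertia against K and looking for the point where the curve stops dropping sharply gives a candidate value for K. A single run per K can settle in a poor local optimum, so library implementations typically repeat each run with several initialisations and keep the best result.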
Figure: 2D scatter plot with clusters.
After the cluster analysis with K-Means, the tradable cryptocurrencies were grouped into four clusters.