K-Means Clustering on Stocks

This project is a from-scratch implementation of the K-Means clustering algorithm, applied to stocks. I built it as personal research, to gain a deeper understanding of clustering techniques and how they apply to financial data analysis.

Try it out here!

Project Overview

K-Means is a popular unsupervised learning method that partitions data into k distinct groups based on the similarity of their features. In this project, I implemented the algorithm from scratch, without relying on a pre-built clustering implementation, to better understand its inner workings and nuances.

Features

Interactive Stock Selection:
- Users can select multiple stocks from the NIFTY 50 index and add custom stocks for analysis
- Flexible timeframe selection to analyze different time periods
- Ability to specify the number of clusters (k) for the analysis
Real-time Clustering:
- Dynamic clustering of selected stocks based on user parameters
- Interactive visualization showing cluster assignments and centroids
Data Analysis Tools:
- Detailed view of financial metrics for each stock
- Cluster-wise analysis showing common characteristics
- Visual tracking of clustering iterations and convergence

Implementation Details

Data Collection:
- Historical stock price data was collected for NIFTY 50 stocks using the yfinance library.
- The data includes daily closing prices, which were used to calculate various financial metrics.
Data Preprocessing:
- The collected data was cleaned to handle missing values, and the features were normalized.
- Three features were derived from the data for each stock: Mean Returns, Volatility, and Sharpe Ratio.
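
As a rough illustration of the preprocessing step, the three features can be sketched from a list of daily closing prices. This is a minimal sketch, not the code in this repository. The function names, the use of simple (not log) returns, the 252-day annualization factor, the zero default risk-free rate, and z-score normalization are all assumptions made for the example.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor, not taken from the project


def daily_returns(closes):
    """Simple day-over-day returns from a list of closing prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def stock_features(closes, risk_free_rate=0.0):
    """Return (mean return, volatility, Sharpe ratio), annualized.

    Hypothetical helper: the exact definitions used in the project may differ.
    """
    rets = daily_returns(closes)
    mean_return = statistics.mean(rets) * TRADING_DAYS
    volatility = statistics.stdev(rets) * math.sqrt(TRADING_DAYS)
    # Guard against a flat price series, where volatility is zero.
    sharpe = (mean_return - risk_free_rate) / volatility if volatility > 0 else 0.0
    return mean_return, volatility, sharpe


def zscore_columns(rows):
    """Normalize each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Normalizing matters here because K-Means relies on distances: without it, whichever feature has the largest numeric range would dominate the cluster assignments.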
Manual K-Means Implementation:
- The centroids were initialized by picking random data points.
- In each iteration, the algorithm assigned every data point to its nearest centroid, then moved each centroid to the mean of its assigned points.
- This was repeated until the centroids stabilized or a maximum number of iterations was reached.
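
The K-Means steps above can be sketched in plain Python. This is an illustrative version under stated assumptions, not the code in KMeans.py. The function names, the Euclidean distance metric, the `tol` and `seed` parameters, and the choice to leave an empty cluster's centroid where it is are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iters=100, tol=1e-6, seed=None):
    """Cluster `points` (a list of feature lists) into k groups.

    Returns (labels, centroids). Hypothetical signature for illustration.
    """
    rng = random.Random(seed)
    # Initialization: pick k distinct data points as the starting centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)

    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]

        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                # Assumed policy: an empty cluster keeps its old centroid.
                new_centroids.append(centroids[j])

        # Stop once no centroid has moved more than `tol`.
        shift = max(euclidean(c, n) for c, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break

    return labels, centroids
```

For example, `kmeans([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]], k=2)` separates the two lower-left points from the two upper-right ones. Because initialization is random, results on real data can vary between runs unless `seed` is fixed.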

How to Run

1. Clone the repository.
2. Install the required packages using pip install -r requirements.txt.
3. Run the Jupyter notebook KMeans.ipynb to see the implementation and results.
4. Launch the Streamlit application using the command streamlit run KMeans.py to interact with the clustering results.
