This repository contains from-scratch implementations of Machine Learning algorithms and models. The purpose is to provide simplified versions of these algorithms and compare them against their counterparts in standard Python libraries.
Statistical Models
- Linear Regression – Basic regression model and the foundation for learning (see the sketch after this list).
- Logistic Regression – Basic classification model.
- Ridge and Lasso Regression – Regularized versions of Linear Regression.
- K-Nearest Neighbors (KNN) – Simple non-parametric model for classification/regression.
- Naive Bayes – Probabilistic model based on Bayes' theorem.
- K-Means – Basic clustering algorithm.
- t-SNE – t-Distributed Stochastic Neighbor Embedding, a dimensionality reduction technique.
- Singular Value Decomposition (SVD) – Matrix factorization for dimensionality reduction.
- Principal Component Analysis (PCA) – Dimensionality reduction.
- Decision Trees – Simple interpretable model for classification/regression.
- Random Forests – Ensemble of Decision Trees that reduces variance and improves accuracy.
- Support Vector Machines (SVM) – Powerful classification/regression model.
- XGBoost – Gradient-boosted tree algorithm for classification/regression, widely used in practice.
- Bayesian Networks – Probabilistic graphical model.
- Markov Decision Processes (MDPs) – Framework for modeling decisions, often used in RL.
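As a rough illustration of the flavor of these implementations, here is a minimal from-scratch linear regression built with nothing but NumPy. This is a simplified sketch with illustrative names (e.g., `LinearRegressionScratch`), not the code from the notebooks themselves:

```python
import numpy as np

# Simplified sketch (not the repository's actual code): ordinary least squares
# fit via the normal equation, to show the flavor of a from-scratch model.
class LinearRegressionScratch:
    def fit(self, X, y):
        # Prepend a column of ones so the intercept is learned with the weights.
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        # Solve the least-squares problem min ||X_b w - y||^2 for w.
        self.weights_, *_ = np.linalg.lstsq(X_b, y, rcond=None)
        return self

    def predict(self, X):
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        return X_b @ self.weights_


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=100)
model = LinearRegressionScratch().fit(X, y)
print(model.weights_)  # roughly [3.0, 2.0, -1.0, 0.5] (intercept first)
```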
Neural Network-based Models
- Neural Networks (NNs) – Foundation of deep learning; layers of learned weights with nonlinear activations.
- Backpropagation – Essential algorithm for training NNs.
- Gradient Descent – Optimization method used to train NNs (see the sketch after this list).
- Convolutional Neural Networks (CNNs) – Specialized NNs for image data.
- Recurrent Neural Networks (RNNs) – NNs for sequential data.
- Long Short-Term Memory Networks (LSTMs) – RNN variant with gating that handles long sequences better.
- Gated Recurrent Units (GRUs) – Another variant of RNNs, simpler than LSTMs.
- Transformer Networks – Attention-based model for sequential data (e.g., NLP).
- Autoencoders – NNs for unsupervised learning and dimensionality reduction.
- Generative Adversarial Networks (GANs) – NNs for generating new data.
- Reinforcement Learning – Learning to take actions in an environment.
- Q-Learning – Used in Reinforcement Learning; can involve neural nets.
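To illustrate what backpropagation and gradient descent look like when written by hand, here is a minimal sketch of a two-layer network trained on XOR. The architecture, names, and hyperparameters are illustrative choices, not the ones used in the notebooks:

```python
import numpy as np

# Minimal sketch only (not the repository's actual code): a tiny two-layer
# network trained on XOR with hand-written backpropagation and plain
# gradient descent.
rng = np.random.default_rng(42)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=1.0, size=(2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(scale=1.0, size=(8, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)      # hidden activations
    out = sigmoid(h @ W2 + b2)    # predictions

    # Backward pass: gradients of binary cross-entropy w.r.t. each parameter
    d_logits = out - y                          # gradient at the output pre-activation
    d_hidden = (d_logits @ W2.T) * h * (1 - h)  # chain rule back through the hidden layer

    # Gradient-descent updates (averaged over the batch)
    W2 -= lr * h.T @ d_logits / len(X)
    b2 -= lr * d_logits.mean(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_hidden / len(X)
    b1 -= lr * d_hidden.mean(axis=0, keepdims=True)

print(np.round(out, 2))  # typically approaches [[0], [1], [1], [0]]
```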
Getting Started
- Clone the repository
- Install the required dependencies
- Run the desired notebook
I am doing this to learn more about the inner workings of Machine Learning algorithms and models. While libraries like scikit-learn and tensorflow are great for building models quickly, they abstract away many of the details of how those models work. By implementing them from scratch, I hope to gain a deeper understanding of how they work and how they can be improved.
I will be comparing the results of my implementations to those of the corresponding Python libraries; a sketch of this kind of check is shown below. If the results match, I can be confident that my implementation is correct.
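For example, a comparison for linear regression might look like the following sketch. It assumes scikit-learn and NumPy are installed, and the data and tolerances are illustrative rather than taken from the notebooks:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical check (names and data are illustrative, not the repository's
# actual modules): compare a from-scratch normal-equation fit against
# scikit-learn's LinearRegression.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.5, -2.0, 0.0, 4.0]) + 0.7 + rng.normal(scale=0.05, size=200)

# From-scratch fit: prepend an intercept column and solve the least-squares system.
X_b = np.c_[np.ones((X.shape[0], 1)), X]
w_scratch, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Library fit.
sk = LinearRegression().fit(X, y)

print(np.allclose(w_scratch[0], sk.intercept_, atol=1e-6))  # True
print(np.allclose(w_scratch[1:], sk.coef_, atol=1e-6))      # True
```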
Send me an email; I can tell you faster ways to get a headache :)
If you would like to contribute to this repository, please open an issue or a pull request.
This repository is licensed under the MIT License. Do whatever you want with it!