Merge pull request #214 from priyansh985/main
enhancement Add Optimization Techniques with Intel Specific Optimization in Intel_Optimization.md
Mayureshd-18 authored Nov 8, 2024
2 parents ff16e1b + 446b5ec commit 1464ab1
Showing 7 changed files with 163 additions and 14 deletions.
39 changes: 39 additions & 0 deletions .github/workflows/python-app.yml
@@ -0,0 +1,39 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python application

on:
  push:
    branches: [ "main" ]
  pull_request:
    branches: [ "main" ]

permissions:
  contents: read

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4
    - name: Set up Python 3.10
      uses: actions/setup-python@v3
      with:
        python-version: "3.10"
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install flake8 pytest
        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    - name: Lint with flake8
      run: |
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
    - name: Test with pytest
      run: |
        pytest
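
One caveat about the final step: `pytest` exits with code 5 when it collects no tests, which fails the job. A minimal sketch of a test module that would give the workflow something to run is shown here. The file name `tests/test_metrics.py` and the `mape` helper are hypothetical and not part of this commit; the helper is modelled on the Mean Absolute Percentage Error metric that the README reports.

```python
# tests/test_metrics.py (hypothetical file, not part of this commit)
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (illustrative helper)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def test_mape_is_zero_for_perfect_predictions():
    assert mape([10.0, 20.0], [10.0, 20.0]) == 0.0


def test_mape_known_value():
    # Relative errors are 10% and 5%, so the mean is 7.5%.
    assert math.isclose(mape([100.0, 200.0], [110.0, 190.0]), 7.5)
```

In a real test module the helper would be imported from the project rather than defined alongside the tests.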
92 changes: 92 additions & 0 deletions Intel_Optimized/Intel_Optimization.md
@@ -0,0 +1,92 @@
# Optimization with Intel oneAPI AI Analytics Toolkit

## 📝 **Overview**

This repository's stock price prediction model has been optimized using Intel's oneAPI AI Analytics Toolkit, specifically leveraging:
- **Intel Extension for Scikit-learn** (`scikit-learn-intelex`)
- **Intel Distribution of Modin for parallelized pandas operations**
- **Intel oneDAL (oneAPI Data Analytics Library) for accelerated machine learning algorithms**

## 🚀 **Key Optimizations**

### **Intel Extension for Scikit-learn**

Optimized implementations are used for:
- **Linear Regression**
- **Decision Trees**
- **Random Forest**

![Intel scikit-learn vs Normal scikit-learn](./images/scikit-learn-acceleration.png)

### **Intel Distribution of Modin**

Parallelized pandas operations for:
- **Data loading and preprocessing**
- **Data transformation and feature engineering**

![Pandas vs Modin](./images/modin-and-pandas-performance.png)

### **Intel oneDAL Library**

Accelerated machine learning algorithms for:
- **Principal Component Analysis (PCA)**
- **K-Means Clustering**
- **Linear Regression**

## 🎯 **Benefits**

- **Improved Performance**: **30% or more** reduction in training and inference time.
- **Enhanced Scalability**: Efficiently handle large datasets and complex models.
- **Increased Accuracy**: Optimized algorithms for improved prediction accuracy.
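
A figure like the one under **Improved Performance** is easiest to verify by timing the same workload with and without the optimizations. The sketch below shows only a timing helper and the percentage arithmetic. The names `time_call` and `percent_reduction` and the example timings are illustrative assumptions, not measurements from this repository.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Return the best wall-clock time in seconds over `repeats` calls of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def percent_reduction(baseline_s, optimized_s):
    """Percentage by which the optimized time is lower than the baseline."""
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return 100.0 * (baseline_s - optimized_s) / baseline_s


# Made-up timings: 10.0 s with stock scikit-learn, 7.0 s after patching.
print(f"{percent_reduction(10.0, 7.0):.1f}% reduction")
```

Taking the best of several repeats reduces noise from other processes; a real comparison should also use identical data and hardware for both runs.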

## 📋 **Requirements**

- **Intel oneAPI AI Analytics Toolkit installed**
- **Compatible Intel hardware** (e.g., Intel Core processors, Intel Xeon Scalable processors)

## 🛠️ **Usage**

1. **Clone the repository.**
2. **Install [Intel oneAPI AI Analytics Toolkit](https://github.com/intel/aikit-operator).**
3. **Install [Intel Distribution for scikit-learn](https://intel.github.io/scikit-learn-intelex/) and [Modin](https://modin.readthedocs.io/en/latest/).**
4. **Build and run the optimized model using the provided instructions.**
5. **Alternatively, you can install individual components using pip:**
```bash
pip install scikit-learn-intelex
pip install "modin[all]"
```
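
After installation, it can help to confirm which components are importable before running the notebooks. The following is a minimal sketch using only the standard library. `sklearnex`, `modin`, and `daal4py` are the import names of the packages above; the helper name `check_components` is hypothetical.

```python
from importlib.util import find_spec

# Import names of the optional Intel-optimized components.
COMPONENTS = ("sklearnex", "modin", "daal4py")


def check_components(names=COMPONENTS):
    """Map each top-level module name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, available in check_components().items():
        print(f"{name}: {'installed' if available else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check is quick and has no side effects.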

## 💻 **Code Snippets**
```python
# Import the patching helper and Modin's drop-in replacement for pandas
from sklearnex import patch_sklearn
import modin.pandas as pd

# Patch scikit-learn before importing any estimators so that the
# Intel-optimized (oneDAL-backed) implementations are picked up
patch_sklearn()

from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Example: Load data using Modin
df = pd.read_csv('stock_prices.csv')

# Example: Preprocess data
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

# Example: Train a Linear Regression model using Intel-optimized scikit-learn
X = df[['Open', 'High', 'Low', 'Volume']]
y = df['Close']
model = LinearRegression()
model.fit(X, y)

# Example: Perform PCA with the patched (oneDAL-backed) scikit-learn estimator
pca = PCA(n_components=2)
pca_result = pca.fit_transform(X)
print("PCA Result:", pca_result)
```
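
If the code also needs to run on machines without the Intel packages, the patch can be made optional. The following is a sketch of that pattern, not something this repository is known to do; `enable_intel_acceleration` is a hypothetical helper name.

```python
def enable_intel_acceleration():
    """Patch scikit-learn if scikit-learn-intelex is available.

    Returns True when the patch was applied and False when the package is
    missing, in which case stock scikit-learn is used unchanged.
    """
    try:
        from sklearnex import patch_sklearn
    except ImportError:
        return False
    patch_sklearn()
    return True


# Call this before importing any scikit-learn estimators.
accelerated = enable_intel_acceleration()
print("Intel optimizations enabled:", accelerated)
```

Because the patched estimators keep the scikit-learn API, the rest of the code does not need to change depending on the return value.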

### **Python**
4 changes: 4 additions & 0 deletions PROJECT_STRUCTURE.md
@@ -30,10 +30,14 @@
│ └── hacktober.png
├── Intel_Optimized/
│ ├── ARIMA_V2.ipynb
│ ├── Intel_Optimization.md
│ ├── Stock_Price_Prediction_1.ipynb
│ ├── Stock_prediction_Data_Analysis.ipynb
│ ├── buy_sell_recommendation_system.ipynb
│ ├── hybrid.ipynb
│ ├── images/
│ │ ├── modin-and-pandas-performance.png
│ │ └── scikit-learn-acceleration.png
│ ├── readme.md
│ ├── reduced_redundancy_stock_price_prediction.ipynb
│ └── requirements.txt
38 changes: 24 additions & 14 deletions README.md
@@ -38,20 +38,27 @@ Check the project structure here [Project Structure](PROJECT_STRUCTURE.md)

## 📚 Table of Contents

1. [🌟 Overview](#-overview)
2. [🛠️ Features](#️features)
3. [🔍 Algorithms Used](#-algorithms-used)
4. [📊 Dataset](#-dataset)
5. [📁 Project Structure](#-project-structure)
6. [🚀 How to Run](#-how-to-run)
7. [📈 Results](#-results)
8. [📊 Performance Metrics](#-performance-metrics)
9. [🔮 Future Work](#-future-work)
10. [🏆 Conclusion](#-conclusion)
11. [✍️ Author](#-author)
12. [🤝 Contributing](#-contributing)
13. [🌍 Our Valuable Contributors](#-our-valuable-contributors)
14. [📝 License](#-license)
- [📈 GitHub Repository Stats](#-github-repository-stats)
- [This project is now OFFICIALLY accepted for](#this-project-is-now-officially-accepted-for)
- [✨ Project Structure](#-project-structure)
- [📚 Table of Contents](#-table-of-contents)
- [🌞 Overview](#-overview)
- [🛠️ Features](#️-features)
- [🔍 Algorithms Used](#-algorithms-used)
- [📊 Dataset](#-dataset)
- [📁 Project Structure](#-project-structure-1)
- [🚀 How to Run `main.py`](#-how-to-run-mainpy)
- [📈 Results](#-results)
- [📊 Performance Metrics](#-performance-metrics)
- [🔧 Optimization Techniques](#-optimization-techniques)
- [🔮 Future Work](#-future-work)
- [🏆 Conclusion](#-conclusion)
- [✍️ Author](#️-author)
- [🤝 Contributing](#-contributing)
- [🌍 Our Valuable Contributors](#-our-valuable-contributors)
- [🎉 Thank You to All Our Amazing Contributors! 🎉](#-thank-you-to-all-our-amazing-contributors-)
- [📝 License](#-license)
- [📱 Connect with Us](#-connect-with-us)

---
## 🌞 Overview
@@ -146,6 +153,9 @@ The **Mean Absolute Percentage Error (MAPE)** of all the following 10 Regression

<img src="images/6c9ebb5b-a8ed-44de-8842-bf8f5c25990f.jpeg" alt="Performance-Metrices" width="400" height="300">

## 🔧 Optimization Techniques

We have applied several Intel-specific optimization techniques to enhance the performance of our models. For detailed information, please refer to the [Optimization Techniques](./Intel_Optimized/Intel_Optimization.md) document.

## 🔮 Future Work

4 changes: 4 additions & 0 deletions repo_structure.txt
@@ -26,10 +26,14 @@
│ └── hacktober.png
├── Intel_Optimized/
│ ├── ARIMA_V2.ipynb
│ ├── Intel_Optimization.md
│ ├── Stock_Price_Prediction_1.ipynb
│ ├── Stock_prediction_Data_Analysis.ipynb
│ ├── buy_sell_recommendation_system.ipynb
│ ├── hybrid.ipynb
│ ├── images/
│ │ ├── modin-and-pandas-performance.png
│ │ └── scikit-learn-acceleration.png
│ ├── readme.md
│ ├── reduced_redundancy_stock_price_prediction.ipynb
│ └── requirements.txt
