Skip to content

Code Repository for Federated Learning in Chemical Engineering: A Tutorial on a Framework for Privacy-Preserving Collaboration Across Distributed Data Sources. The paper can be found here:

License

Notifications You must be signed in to change notification settings

elucidator8918/Fed-ML-Chem

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

27 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Federated Learning in Chemical Engineering: A Tutorial on a Framework for Privacy-Preserving Collaboration Across Distributed Data Sources

Centered Image

Overview

Welcome to the Review on Federated Learning (FL) for Chemical Engineering repository. Federated Learning is where multiple decentralized devices collaboratively train a model without sharing their local data. Each device trains the model on its own data and only shares the model updates.

A Review paper is coming soon for the Industrial & Engineering Chemistry Research Journal! - Link - {WIP}

The comprehensive results section can be seen:

Experiment Results

Experiment Dataset Train Accuracy Test Accuracy Train Loss Test Loss Time (sec)
Centralized Learning Experiments (Num Epochs: 25)
PILL - Standard_CentralNN_PILL.ipynb PILL: Pharmaceutical Dataset 93.75% 82.39% 0.2600 0.5700 226.75
DNA - Standard_CentralNN_DNA.ipynb Human DNA Sequence 100.00% 94.50% 0.0000 0.5080 97.49
DNA (MMoE) - Standard_CentralNN_DNA+MRI.ipynb Brain MRI Scan + Human DNA Sequence 99.24% 84.64% 0.0274 0.7513 350.32
MRI (MMoE) - Standard_CentralNN_DNA+MRI.ipynb Brain MRI Scan + Human DNA Sequence 99.72% 90.63% 0.0115 0.8555
HIV - Standard_CentralNN_HIV.ipynb Human DNA Sequence 96.77% 95.45% 0.1640 0.1770 100.79
Federated Learning Experiments (Num Rounds: 20, Num Clients: 10, Num Epochs: 10)
PILL - Standard_FedNN_PILL.ipynb PILL: Pharmaceutical Dataset 93.54% 94.79% 0.2800 0.2070 2215.40
DNA - Standard_FedNN_DNA.ipynb Human DNA Sequence 100.00% 94.09% 0.0000 1.2030 3921.43
DNA (MMoE) - Standard_FedNN_DNA+MRI.ipynb Brain MRI Scan + Human DNA Sequence 99.00% 94.75% 0.0776 0.4167 5543.29
MRI (MMoE) - Standard_FedNN_DNA+MRI.ipynb Brain MRI Scan + Human DNA Sequence 99.38% 85.56% 0.1997 1.0720
HIV - Standard_FedNN_HIV.ipynb Human DNA Sequence 96.31% 95.34% 0.1790 0.1870 1042.82

Results of Central and FL Experiments. In FL, both train loss and accuracy refer to a single client.

Repository Structure

.
├── LICENSE
├── README.md
├── run-cpu.sh
├── run-gpu.sh
├── run-tff-cpu.sh
├── run-tff-gpu.sh
└── src
    ├── README.md
    ├── Standard_CentralNN_DNA.ipynb
    ├── Standard_CentralNN_DNA+MRI.ipynb
    ├── Standard_CentralNN_HIV.ipynb
    ├── Standard_CentralNN_PILL.ipynb
    ├── Standard_CentralNN_Wafer.ipynb
    ├── Standard_FedNN_DNA.ipynb
    ├── Standard_FedNN_DNA+MRI.ipynb
    ├── Standard_FedNN_HIV.ipynb
    ├── Standard_FedNN_PILL.ipynb
    ├── Standard_FedNN_Wafer.ipynb
    ├── Standard_FedQNN_DNA.ipynb
    ├── Standard_FedQNN_DNA+MRI.ipynb
    ├── Standard_FedQNN_HIV.ipynb
    ├── Standard_FedQNN_PILL.ipynb
    ├── Standard_FedQNN_Wafer.ipynb
    ├── TFF_FedNN_PILL.ipynb
    └── utils
        ├── common.py
        ├── data_setup.py
        ├── engine.py
        └── __init__.py

Installation

Clone the Repository

git clone https://github.com/elucidator8918/Fed-ML-Chem.git
cd Fed-ML-Chem

Install Dependencies

For CPU
conda create -n fed python=3.10.12 anaconda
conda init
conda activate fed
bash run-cpu.sh
For GPU
conda create -n fed python=3.10.12 anaconda
conda init
conda activate fed
bash run-gpu.sh

Running Experiments

Choose the appropriate notebook based on your dataset and encryption preference:

Standard Federated Learning
  • DNA Sequence Dataset:

    • Notebook: src/Standard_FedNN_DNA.ipynb
    • Description: This dataset includes DNA sequences used for various biological and genetic studies, focusing on sequence classification and pattern recognition.
  • DNA+MRI Multimodal Dataset:

    • Notebook: src/Standard_FedNN_DNA+MRI.ipynb
    • Description: It is used as a MoE with Multimodaility leveraging both DNA Sequence and MRI scans data to develop and evaluate models for detecting and interpreting tumors and dna classes.
  • PILL Dataset:

    • Notebook: src/Standard_FedNN_PILL.ipynb
    • Description: This dataset includes images of pharmaceutical pills from the Pill Dataset used for various pharmaceutical studies, focusing on classification and pattern recognition.
Standard Quantum Federated Learning
  • DNA Sequence Dataset:

    • Notebook: src/Standard_FedQNN_DNA.ipynb
    • Description: This dataset includes DNA sequences used for various biological and genetic studies, focusing on sequence classification and pattern recognition.
  • DNA+MRI Multimodal Dataset:

    • Notebook: src/Standard_FedQNN_DNA+MRI.ipynb
    • Description: It is used as a MoE with Multimodaility leveraging both DNA Sequence and MRI scans data to develop and evaluate models for detecting and interpreting tumors and dna classes.
  • PILL Dataset:

    • Notebook: src/Standard_FedQNN_PILL.ipynb
    • Description: This dataset includes images of pharmaceutical pills from the Pill Dataset used for various pharmaceutical studies, focusing on classification and pattern recognition.

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

Code Repository for Federated Learning in Chemical Engineering: A Tutorial on a Framework for Privacy-Preserving Collaboration Across Distributed Data Sources. The paper can be found here:

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages