ECM: Electrocardiogram Classification Model and ATML: An Unsupervised Training Framework for Automated Model Structure and Hyperparameter Optimization
Training Framework Functional Structure | Directory Structure | Usage | Installation | Project Owner | Disclaimer
The Electrocardiogram Classification Model (ECM) is a deep learning model designed to classify ECG signals into two categories: Normal and Atrial Fibrillation (AFIB).
AutoTuneML (ATML) is an unsupervised learning framework that automatically adapts to both CPU and CUDA. It aims to simplify and accelerate the development of machine learning models, primarily through the following two features:
- Automatically searching for and optimizing model structures
- Automatically tuning hyperparameters (currently the learning rate)
This helps users achieve optimal model performance without manual intervention, while maintaining flexibility and ease of use through modular design.
For LoongArch deployment, please visit deploy_loongarch.
- Loop Search:
  - First Phase:
    Store the candidate model structures in `model.py` and train them with `train_muti.py`, which saves its results in `multi-result`. The framework automatically tries different learning rates, iterates through the models, and saves the optimal model. This phase uses performance metrics as the iteration criterion, but it also records the size of the exported ONNX files to approximate inference speed (a minimal sketch of this loop follows below).
  - Second Phase:
    Select a suitable model structure from `multi-result` based on its performance metrics and file size, and import it into `ss_model.py` as the base model structure for this phase. Then set the expected performance metrics and file size and train with `train_final.py`. The framework automatically adjusts the number of channels in the convolutional and fully connected layers within the specified range until both performance and file size meet the criteria. This phase uses both performance metrics and file size as iteration criteria.
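The repository does not show this loop verbatim; as a rough illustration, the first-phase search over `model_classes` and learning rates might look like the sketch below. The candidate learning rates, the input shape, the no-argument constructors, and the `train_and_evaluate` helper are assumptions for illustration only.

```python
# Hypothetical sketch of the first-phase loop in train_muti.py:
# try every registered model at several learning rates, track the best
# result, and record the exported ONNX size as a proxy for inference
# speed. Names, shapes, and helpers here are illustrative assumptions.
import os
import torch
from public.model import model_classes  # registry at the end of model.py

def train_and_evaluate(model, lr):
    """Stub standing in for the real training/evaluation loop."""
    return 0.0  # placeholder performance metric

learning_rates = [1e-2, 1e-3, 1e-4]  # illustrative candidates
best_metric, best_path = -1.0, None

os.makedirs("multi-result", exist_ok=True)
for ModelClass in model_classes:
    for lr in learning_rates:
        model = ModelClass()
        metric = train_and_evaluate(model, lr)
        path = f"multi-result/{ModelClass.__name__}_lr{lr}.onnx"
        dummy = torch.randn(1, 1, 3000)  # assumed single-lead ECG input
        torch.onnx.export(model, dummy, path)
        size_kb = os.path.getsize(path) / 1024  # proxy for inference speed
        print(f"{ModelClass.__name__} lr={lr}: metric={metric}, {size_kb:.0f} KB")
        if metric > best_metric:
            best_metric, best_path = metric, path
```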
Advantages:
- Given multiple base models, the first phase of Loop Search helps users find the model structure best suited to the task.
- In the second phase of Loop Search, the framework automatically finds the balance between performance metrics and inference speed by adjusting the model's complexity (its stopping criterion is sketched after this list).
- Saves a significant amount of time on structural tuning, with very fast model evolution.
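For instance, the second phase's dual stopping criterion could be expressed as below. The channel range, the targets, and the helper stubs are illustrative assumptions; the real training and export logic lives in `train_final.py`, and the targets are set by the user.

```python
# Hypothetical sketch of the second-phase criterion in train_final.py:
# sweep channel widths until BOTH the performance target and the ONNX
# size target are met. All names, ranges, and targets are assumptions.

def build_model(width):
    """Stub: build the ss_model.py base structure at this channel width."""
    return width  # placeholder

def train_and_evaluate(model):
    """Stub standing in for the real training loop."""
    return 0.95  # placeholder metric

def onnx_size_kb(model):
    """Stub: export the model to ONNX and return the file size in KB."""
    return 85.0  # placeholder size

TARGET_METRIC = 0.93   # minimum acceptable performance metric
TARGET_SIZE_KB = 89.0  # maximum acceptable ONNX file size

for width in range(64, 7, -4):  # assumed channel range, widest first
    model = build_model(width)
    metric = train_and_evaluate(model)
    size_kb = onnx_size_kb(model)
    if metric >= TARGET_METRIC and size_kb <= TARGET_SIZE_KB:
        print(f"accepted width={width}: metric={metric}, {size_kb} KB")
        break
```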
- Early Stop:
  To prevent overfitting, the framework adopts an early-stop strategy. Unlike traditional early stopping, which monitors only `val_loss`, this framework uses a new early-stop logic with better performance (one possible shape is sketched below).
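The exact rule is not spelled out here; as one hypothetical shape for an early stopper that looks beyond `val_loss` alone, consider a class that tracks both the validation loss and a performance metric. The class name and the rule are assumptions, not the project's actual logic.

```python
# Hypothetical sketch: early stopping on BOTH val_loss and a
# performance metric (e.g. F1). Names and rule are assumptions;
# see the framework's source for its actual logic.
class DualEarlyStopper:
    def __init__(self, patience=10, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_metric = 0.0
        self.bad_epochs = 0

    def step(self, val_loss, val_metric):
        """Return True when training should stop."""
        improved = (val_loss < self.best_loss - self.min_delta or
                    val_metric > self.best_metric + self.min_delta)
        if improved:
            self.best_loss = min(self.best_loss, val_loss)
            self.best_metric = max(self.best_metric, val_metric)
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Watching a metric alongside the loss avoids stopping on a noisy loss plateau while the metric is still improving.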
- Hyperparameter Pruning:
  To save computational resources and time, the framework employs a hyperparameter pruning strategy: ineffective hyperparameters and their neighborhoods are pruned to improve training efficiency (see the sketch below).
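Assuming `avoid.txt` lists one pruned learning-rate value per line (the real file format may differ), the pruning step might look like this:

```python
# Hypothetical sketch of hyperparameter pruning driven by avoid.txt.
# The file format and neighborhood radius are assumptions.
import os

def load_pruned(path="avoid.txt"):
    """Read pruned learning rates, one per line (assumed format)."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return [float(line) for line in f if line.strip()]

def is_pruned(lr, pruned, rel_radius=0.1):
    """Skip lr if it falls within a relative neighborhood of a pruned value."""
    return any(abs(lr - p) <= rel_radius * p for p in pruned)

candidates = [1e-2, 5e-3, 1e-3, 5e-4, 1e-4]
pruned = load_pruned()
search_space = [lr for lr in candidates if not is_pruned(lr, pruned)]
print("learning rates to try:", search_space)
```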
The project's directory structure is as follows:
├── 93-89 # Results of the second phase of Loop Search (xx-yy, where xx is the maximum value of various metrics, and yy is the minimum value)
├── avoid.txt # Definition of hyperparameter pruning
├── multi-result # Results of the first phase of Loop Search
├── params.py # Parameter settings
├── public # Classes and definitions
│ ├── dataset.py # Dataset definition
│ ├── model.py # Model definition for the first phase of Loop Search
│ ├── ss_model.py # Model definition for the second phase of Loop Search
│ └── test.py # Test function definition
├── temp # Temporary files for the second phase of Loop Search
├── test_data # Official test set
├── train_data # Official training set
├── train_final.py # Second phase of Loop Search
├── train_muti.py # First phase of Loop Search
└── ttdata # Demonstration dataset
- Data Preparation: Place the training and test data in the `train_data/` and `test_data/` directories, respectively. (The default is the demonstration dataset `ttdata`.)
- Parameter Settings: Set the hyperparameters in `params.py`.
- Model Definition: Define the model structures in `public/model.py` and register them in `model_classes` at the end of the file (a hypothetical registration example follows this list).
- Training: Run the `train_muti.py` script to start training. During training, the models and metric records are saved in the `multi-result` directory. Select an appropriate model structure and import it into `ss_model.py`. Set the desired performance metrics and file size, then run the `train_final.py` script to start the second phase of training. The results are saved in the `xx-yy` directory.
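As an illustration of the Model Definition step, registration at the end of `public/model.py` might look like the following. The network architectures and the list-shaped `model_classes` registry are assumptions; follow the convention actually used in the file.

```python
# public/model.py (end of file) -- hypothetical registration example.
# The architectures and registry shape are assumptions; this only
# illustrates the "register models in model_classes" step.
import torch.nn as nn

class ECGNetSmall(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(16, 2),  # two classes: Normal vs. AFIB
        )

    def forward(self, x):
        return self.net(x)

class ECGNetLarge(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, 2),
        )

    def forward(self, x):
        return self.net(x)

# Candidate structures picked up by the first-phase search in train_muti.py.
model_classes = [ECGNetSmall, ECGNetLarge]
```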
To install the required dependencies, you can create a new conda environment with the following command:
conda create -n "ECM&ATML" python=3.10 pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=12.1 numpy scikit-learn pandas tqdm onnx -c pytorch -c nvidia
- JokerJostar (Lin Yuqi)
[email protected]
Class of 2023 Freshman
This project was independently developed by me from scratch, starting on May 19, 2024, for the IESD-2024 competition. I welcome any discussion and exchange of ideas.