Cardiovascular signals such as photoplethysmography (PPG), electrocardiography (ECG), and blood pressure (BP) are inherently correlated and complementary, together reflecting the health of the cardiovascular system. However, their joint utilization in real-time monitoring is severely limited by diverse acquisition challenges, ranging from noisy wearable recordings to burdensome invasive procedures. Here we propose UniCardio, a multi-modal diffusion transformer that reconstructs low-quality signals and synthesizes unrecorded signals in a unified generative framework. Its key innovations include a specialized model architecture to manage the signal modalities involved in generation tasks and a continual learning paradigm to incorporate varying modality combinations. By exploiting the complementary nature of cardiovascular signals, UniCardio clearly outperforms recent task-specific baselines in signal denoising, imputation, and translation. The generated signals match the performance of ground-truth signals in detecting abnormal health conditions and estimating vital signs, even in unseen domains, while ensuring interpretability for human experts. These advantages position UniCardio as a promising avenue for advancing AI-assisted healthcare.
The official implementation code is available here.
Install dependencies:
pip install -r requirements.txt
or
conda env create -f environment.yml
Download from Cuffless BP.
Download from PTBXL.
Download from MIMIC.
Download from MIMIC PERform AF.
Download from WESAD.
Download the pretrained model and place it in:
UniCardio/base_model/no_compress799.pth
UniCardio uses DataParallel training and can be adapted to distributed training:
python train_original.py
To test a pretrained model:
python test_final.py
- Unified Generative Framework: A single model that performs versatile tasks like signal denoising, imputation, and translation across multiple cardiovascular signals (e.g., PPG, ECG, and BP).
- Multi-modal Diffusion Transformer: Leverages a transformer-based diffusion model to capture complex relationships between different cardiovascular signals within a unified latent space for flexible generation.
- Specialized Architecture: Employs modality-specific encoders and decoders to handle distinct signal types and uses task-specific attention masks to precisely control the information flow between modalities for each specific task.
- Continual Learning Paradigm: Introduces a training approach that incorporates tasks with an increasing number of conditional signals in phases, effectively overcoming catastrophic forgetting and balancing complex multi-modal relationships.
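To make the last two ideas more concrete, here is a minimal sketch in plain Python. It is not taken from the UniCardio code base: the function names, the token count per modality, and the masking rule are all assumptions chosen for illustration, and only translation-style tasks (generating one modality from the others) are enumerated. It shows one way tasks could be grouped into phases by the number of conditional signals, and how a task-specific attention mask could restrict information flow to the modalities involved in a task.

```python
from itertools import combinations

# Hypothetical names and sizes, chosen only for illustration.
MODALITIES = ("PPG", "ECG", "BP")
TOKENS_PER_MODALITY = 2


def enumerate_tasks(modalities):
    """List every (conditions, target) pair with at least one condition."""
    tasks = []
    for target in modalities:
        others = [m for m in modalities if m != target]
        for k in range(1, len(others) + 1):
            for conditions in combinations(others, k):
                tasks.append((conditions, target))
    return tasks


def group_into_phases(tasks):
    """Group tasks by number of conditional signals, fewest first."""
    phases = {}
    for conditions, target in tasks:
        phases.setdefault(len(conditions), []).append((conditions, target))
    return [phases[k] for k in sorted(phases)]


def build_attention_mask(conditions, target,
                         modalities=MODALITIES,
                         tokens=TOKENS_PER_MODALITY):
    """Return a square 0/1 mask over the tokens of all modalities.

    mask[q][k] == 1 means query token q may attend to key token k.
    Assumed rule: only modalities taking part in the task (the
    conditions and the target) may attend to each other. Tokens of
    unused modalities get all-zero rows, which a real attention
    layer would have to skip or handle separately.
    """
    active = set(conditions) | {target}
    owner = [m for m in modalities for _ in range(tokens)]
    return [[1 if (q in active and k in active) else 0 for k in owner]
            for q in owner]


if __name__ == "__main__":
    phases = group_into_phases(enumerate_tasks(MODALITIES))
    for index, phase in enumerate(phases, start=1):
        print(f"Phase {index}: {len(phase)} tasks")
        for conditions, target in phase:
            print("  ", " + ".join(conditions), "->", target)

    # Mask for generating BP from PPG only: ECG tokens are blocked.
    for row in build_attention_mask(("PPG",), "BP"):
        print(row)
```

Training on the phases in order, while continuing to sample tasks from earlier phases, is one plausible reading of the phased schedule described above; the actual schedule and mask layout are defined by the training script in the repository.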
If you encounter issues or wish to discuss collaborations, please contact Yuyang Miao ([email protected]).
