This is our PyTorch implementation for the paper:
Tinglin Huang, Syed Asad Rizvi, Rohan Krishna Thakur, Vimig Socrates, Meili Gupta, David van Dijk, R. Andrew Taylor, Rex Ying. HEART: Learning better representation of EHR data with a heterogeneous relation-aware transformer. Journal of Biomedical Informatics 159 (2024): 104741.
We have provided the preprocessing scripts in dataset/ for the MIMIC-III and eICU datasets, respectively. Please first download the datasets from the following links:
- MIMIC-III: https://mimic.mit.edu/docs/iii/
- eICU: https://eicu-crd.mit.edu/about/eicu/
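Both datasets require credentialed access through PhysioNet. As a minimal sketch (the dataset versions and download route below are assumptions, not prescribed by this repository), the raw files can be fetched with wget once access has been approved:

wget -r -N -c -np --user YOUR_PHYSIONET_USERNAME --ask-password https://physionet.org/files/mimiciii/1.4/
wget -r -N -c -np --user YOUR_PHYSIONET_USERNAME --ask-password https://physionet.org/files/eicu-crd/2.0/

After downloading, point the preprocessing notebooks in dataset/ to the directories containing the extracted CSV files.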
The code has been tested under Python 3.10.14. The required packages are as follows:
- pytorch == 2.3.0
- torch_geometric == 2.5.3
- einops == 0.8.0
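If these packages are not already available, one way to install them is shown below (a sketch assuming a plain pip setup; if you need GPU support, choose the wheels matching your CUDA version from the official PyTorch and PyG installation instructions instead):

pip install torch==2.3.0 torch_geometric==2.5.3 einops==0.8.0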
Once you have finished installing the dependencies, please install the package by running:
pip install -e .
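A quick sanity check that the environment is set up correctly (an optional step, not part of the original instructions) is to import the required packages and print their versions:

python -c "import torch, torch_geometric, einops; print(torch.__version__, torch_geometric.__version__, einops.__version__)"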
The code is organized as follows:
- app/: the main code for training and testing the model
  - pretrain.py: the pipeline for pretraining the model on the pretraining task
  - finetune.py: the pipeline for finetuning the model on downstream tasks
- dataset/: the code for data preprocessing
  - MIMIC-III.ipynb: dataset preprocessing for MIMIC-III
  - eICU.ipynb: dataset preprocessing for eICU
- models/
  - gnn.py: implementation of the graph attention for the encounter-level attention
  - HEART.py: implementation of the pretraining and finetuning model
  - transformer_rel.py: implementation of the transformer with heterogeneous relations
  - transformer.py: implementation of the transformer
- utils/: utility functions, including the data loading pipeline
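With the environment installed and the datasets preprocessed, a typical workflow is to pretrain the model and then finetune it on a downstream task. The invocations below are a hypothetical sketch based on the script names in app/; the actual command-line arguments (dataset selection, checkpoint paths, hyperparameters) are defined inside the scripts and should be checked there:

python app/pretrain.py
python app/finetune.py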