The official implementation of "ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training".
Our paper can be found here.
Some code is borrowed from MAE, Hugging Face, and MRM.
Clone this repository:
git clone https://github.com/ToniChopp/ECAMP.git
Install Python dependencies:
conda env create -f environment.yml
At present, we only release the pre-training code; the instructions below cover how to retrieve the MIMIC-CXR data.
- MIMIC-CXR: We use the MIMIC-CXR-JPG dataset for the radiographs. The paired free-text reports can be downloaded from MIMIC-CXR. See the sketch below for one way to pair images with their reports.
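The exact data layout expected by our pre-training code is defined in the scripts themselves. As a rough, hedged illustration (the paths, the `2.0.0` version folders, and the `iter_image_report_pairs` helper are assumptions, not part of this repository), the following Python sketch pairs MIMIC-CXR-JPG radiographs with their free-text reports using the standard PhysioNet directory layout:

```python
# Hypothetical sketch: pair MIMIC-CXR-JPG images with MIMIC-CXR reports.
# Paths and layout follow the standard PhysioNet releases; adjust to your setup.
from pathlib import Path

MIMIC_CXR_JPG = Path("/data/mimic-cxr-jpg/2.0.0/files")   # radiographs (.jpg)
MIMIC_CXR_REPORTS = Path("/data/mimic-cxr/2.0.0/files")   # free-text reports (.txt)

def iter_image_report_pairs():
    """Yield (image_path, report_text) pairs for every study that has a report."""
    for image_path in MIMIC_CXR_JPG.rglob("*.jpg"):
        # Images live at files/pXX/pXXXXXXXX/sXXXXXXXX/<dicom_id>.jpg;
        # the matching report is at files/pXX/pXXXXXXXX/sXXXXXXXX.txt.
        study_dir = image_path.parent
        report_path = MIMIC_CXR_REPORTS / study_dir.relative_to(MIMIC_CXR_JPG).with_suffix(".txt")
        if report_path.exists():
            yield image_path, report_path.read_text()
```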
You can download the ViT-B/16 checkpoint here for pre-training.
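How this checkpoint is consumed is handled by our pre-training scripts. As a minimal, assumed sketch (the file name `vit-b_16.pth`, the use of `timm`, and the non-strict loading are illustrative assumptions, not the repository's own loading code), this is one way to inspect the downloaded checkpoint and verify that it loads into a ViT-B/16 backbone:

```python
# Hypothetical sketch: inspect a downloaded ViT-B/16 checkpoint and load it into a timm ViT.
import torch
import timm

ckpt = torch.load("vit-b_16.pth", map_location="cpu")   # assumed file name
state_dict = ckpt.get("model", ckpt)                     # MAE-style checkpoints nest weights under "model"

model = timm.create_model("vit_base_patch16_224", num_classes=0)
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)
```

MAE-style checkpoints typically store the weights under a `model` key, which is why the sketch falls back to the raw dict when that key is absent.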
Our pre-trained model can be found here.
New: Our LLM-distilled reports have been released. You can fetch them here.
The attention weights will be released once our paper is accepted; in the meantime, you can use the original radiographs and reports for pre-training.
We pre-train ECAMP on MIMIC-CXR with the following commands:
cd ECAMP/ECAMP/Pre-training
chmod a+x run.sh
./run.sh
Note that this framework can be flexibly extended to develop other pre-training models.
If you find our work valuable for your research, please acknowledge and cite it as follows:
@misc{wang2023ecamp,
title={ECAMP: Entity-centered Context-aware Medical Vision Language Pre-training},
author={Rongsheng Wang and Qingsong Yao and Haoran Lai and Zhiyang He and Xiaodong Tao and Zihang Jiang and S. Kevin Zhou},
year={2023},
eprint={2312.13316},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Hope you enjoy!