This is the official code for the NeurIPS 2024 paper "Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model".
conda create -n Text-DiFuse python==3.9.0
conda activate Text-DiFuse
Install the appropriate PyTorch build for your system; we recommend CUDA 11.8 with torch==2.1.0, torchvision==0.16.0, and torchaudio==2.1.0, for example:
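With pip and the official PyTorch CUDA 11.8 wheel index, that is (adjust the index URL if your CUDA version differs):
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118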
pip install -r requirements.txt
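A quick sanity check that PyTorch was installed with working CUDA support (any equivalent check is fine):
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"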
Pre-trained diffusion model parameters and Fusion Control Module (FCM) parameters can be downloaded from https://drive.google.com/drive/folders/1LIcehq772Qd-3_OnaKmHWGGwkArN4MYg and should be placed in "./pretained".
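Optionally, the gdown package (an extra tool, not stated to be part of requirements.txt) can fetch the whole Drive folder from the command line:
pip install gdown
gdown --folder https://drive.google.com/drive/folders/1LIcehq772Qd-3_OnaKmHWGGwkArN4MYg -O ./pretained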
Download the infrared-visible and medical multi-modal image pairs to be fused and place them in "./data/". We also provide some example paired data that can be tested directly.
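After both downloads, the working directory should look roughly as follows (checkpoint and image file names are placeholders; keep whatever names the Drive folder and the example data use):
Text-DiFuse/
├── pretained/        (diffusion model and FCM checkpoints)
├── data/             (paired source images to fuse)
├── requirements.txt
└── test.py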
Select "task_type" (#Modify parameters to achieve different fusion tasks: VIS-IR, MRI-CT, MRI-PET, MRI-SPECT#)
python test.py
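As a sketch only (whether "task_type" is a top-level variable or a script argument in test.py is an assumption; check the script for its exact location), selecting a task might look like:
task_type = 'VIS-IR'  # hypothetical placement; one of 'VIS-IR', 'MRI-CT', 'MRI-PET', 'MRI-SPECT'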
@article{zhang2024text,
  title={Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model},
  author={Zhang, Hao and Cao, Lei and Ma, Jiayi},
  journal={arXiv preprint arXiv:2410.23905},
  year={2024}
}
If you have any questions or would like to discuss the work, please email me at [email protected].