This is the official PyTorch implementation of the NeurIPS 2018 paper Learning Hierarchical Semantic Image Manipulation through Structured Representations by Seunghoon Hong, Xinchen Yan, Thomas Huang, and Honglak Lee.
Please follow the instructions to run the code.
- Mac OS X or Linux
- NVIDIA GPU (make sure your GPU has at least 12 GB of memory) + CUDA and cuDNN
- Install PyTorch
- Note: This implementation has been tested with PyTorch 0.3.1.
conda install pytorch torchvision cudatoolkit=9.0 -c pytorch
- Install TensorFlow
- Note: This implementation has been tested with TensorFlow 1.5.
pip install tensorflow-gpu==1.5
- Install Python Dominate Library
pip install dominate
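Before installing the packages above, it may help to confirm that a GPU meets the memory requirement. The following is a minimal sketch, not part of this repository: it assumes nvidia-smi is on the PATH, and the 12000 MiB threshold is our own approximation of the "12G+" requirement (12 GB cards usually report slightly less than 12288 MiB). The function names are hypothetical.

```python
import shutil
import subprocess

# Assumption: "12G+" is approximated as 12000 MiB, because 12 GB cards
# often report a little less than 12288 MiB of total memory.
MIN_GPU_MEMORY_MIB = 12000


def parse_memory_mib(nvidia_smi_output):
    """Parse one integer (MiB) per line, skipping blank or non-numeric lines."""
    values = []
    for line in nvidia_smi_output.splitlines():
        text = line.strip()
        if text.isdigit():
            values.append(int(text))
    return values


def query_gpu_memory_mib():
    """Return total memory per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        return []
    return parse_memory_mib(result.stdout)


def has_enough_memory(memory_mib, minimum=MIN_GPU_MEMORY_MIB):
    """True if at least one GPU reports `minimum` MiB or more."""
    return any(m >= minimum for m in memory_mib)


if __name__ == "__main__":
    memory = query_gpu_memory_mib()
    if not memory:
        print("No GPU information available (is nvidia-smi installed?)")
    elif has_enough_memory(memory):
        print("OK: found a GPU with {} MiB".format(max(memory)))
    else:
        print("Largest GPU has only {} MiB".format(max(memory)))
```

Adjust MIN_GPU_MEMORY_MIB if your card reports its memory differently.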
- Please run the following script, which creates two folders, checkpoints/ and datasets/.
bash setup.sh
- Please download the Cityscapes dataset from the official website (registration required). After downloading, please put these files under the datasets/cityscape/ folder and run the following script.
python preprocess_city.py
- Please download the ADE20K dataset from the official website. After downloading, please put these files under the datasets/ade20k/ folder and run the following script.
python preprocess_ade.py
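As a sanity check before preprocessing, one could verify that the dataset folders actually contain files. This is a small sketch under our own assumptions: the helper name is hypothetical, and it only checks that each folder is non-empty, not that the downloaded files are the ones the preprocessing scripts expect.

```python
import os

# Folder and script names are taken from the setup steps above.
DATASET_SCRIPTS = {
    "ade20k": "preprocess_ade.py",
    "cityscape": "preprocess_city.py",
}


def ready_preprocess_scripts(root="datasets"):
    """Return the preprocessing scripts whose dataset folder exists and is non-empty."""
    ready = []
    for name, script in sorted(DATASET_SCRIPTS.items()):
        folder = os.path.join(root, name)
        if os.path.isdir(folder) and os.listdir(folder):
            ready.append(script)
    return ready


if __name__ == "__main__":
    scripts = ready_preprocess_scripts()
    if not scripts:
        print("No dataset files found under datasets/")
    for script in scripts:
        print("python {}".format(script))
```

Run from the repository root, this prints only the preprocessing commands for datasets that are already in place.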
- To download the pre-trained box-to-layout models, please run the following scripts.
bash scripts/download_pretrained_box2mask_city.sh
bash scripts/download_pretrained_box2mask_ade.sh
- Now, let us generate the manipulated layouts from the pre-trained models. Please check the synthesized layouts under the checkpoints/ folder.
bash scripts/test_pretrained_box2mask_city.sh
bash scripts/test_pretrained_box2mask_ade.sh
- To download the pre-trained layout-to-image models, please run the following scripts.
bash scripts/download_pretrained_mask2image_city.sh
bash scripts/download_pretrained_mask2image_ade.sh
- Now, let us generate the manipulated images from the pre-trained models. Please check the synthesized images under the checkpoints/ folder.
bash scripts/test_pretrained_mask2image_city.sh
bash scripts/test_pretrained_mask2image_ade.sh
- We provide a script to generate an image using the predicted layout. Please check the synthesized images under the results/ folder.
bash scripts/test_joint_inference_city.sh
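The pre-trained workflow above can also be driven from one place. The sketch below is not part of this repository; it only assembles the script paths listed above in the same order, and the function names and the dry_run flag are our own. It assumes it is run from the repository root, and it includes joint inference only for Cityscapes because that is the only joint-inference script listed.

```python
import subprocess

DATASETS = ("city", "ade")
STAGES = ("box2mask", "mask2image")


def pretrained_commands(dataset="city"):
    """List the download and test commands for one dataset, in the order given above."""
    if dataset not in DATASETS:
        raise ValueError("dataset must be one of {}".format(DATASETS))
    commands = []
    for stage in STAGES:
        commands.append(
            ["bash", "scripts/download_pretrained_{}_{}.sh".format(stage, dataset)])
        commands.append(
            ["bash", "scripts/test_pretrained_{}_{}.sh".format(stage, dataset)])
    if dataset == "city":
        # Only a Cityscapes joint-inference script is listed above.
        commands.append(["bash", "scripts/test_joint_inference_city.sh"])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it too when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops at the first failing script.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(pretrained_commands("city"), dry_run=True)
```

With dry_run=True the script only prints the commands, which is a cheap way to review the sequence before downloading anything.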
- If you want to train the box-to-layout generator on the Cityscapes dataset, please run the following script (this usually takes a few hours on one GPU).
bash scripts/train_box2mask_city.sh
- If you want to train the box-to-layout generator on the ADE20K dataset, please run the following script (this usually takes a few hours on one GPU).
bash scripts/train_box2mask_ade.sh
- If you want to train the layout-to-image generator on the Cityscapes dataset, please run the following script (this usually takes one day on one GPU).
bash scripts/train_mask2image_city.sh
- If you want to train the layout-to-image generator on the ADE20K dataset, please run the following script (this usually takes one day on one GPU).
bash scripts/train_mask2image_ade.sh
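To train both generators for one dataset in a single run, a small wrapper along these lines could be used. It is a sketch under our own assumptions: the helper names are hypothetical, the duration strings simply repeat the rough estimates quoted above, and it assumes the two training stages can be launched one after the other without any extra step in between.

```python
import subprocess

# (stage, rough duration on one GPU) as quoted in the training steps above.
TRAINING_STAGES = [
    ("box2mask", "a few hours"),
    ("mask2image", "about one day"),
]


def training_plan(dataset):
    """Return (script path, rough duration) pairs for 'city' or 'ade'."""
    if dataset not in ("city", "ade"):
        raise ValueError("dataset must be 'city' or 'ade'")
    return [
        ("scripts/train_{}_{}.sh".format(stage, dataset), estimate)
        for stage, estimate in TRAINING_STAGES
    ]


def train(dataset, dry_run=True):
    """Print each training command; run it too when dry_run is False."""
    for script, estimate in training_plan(dataset):
        print("bash {}  (expected: {})".format(script, estimate))
        if not dry_run:
            # check=True aborts the sequence if a stage fails.
            subprocess.run(["bash", script], check=True)


if __name__ == "__main__":
    train("city", dry_run=True)
```

Set dry_run=False only when the GPU is free for the full duration, since the second stage starts as soon as the first one finishes.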
- If you have any questions regarding our PyTorch implementation, please feel free to submit an issue. We will try to address your question as soon as possible.
If you find this useful, please cite our work as follows:
@inproceedings{hong2018learning,
title={Learning hierarchical semantic image manipulation through structured representations},
author={Hong, Seunghoon and Yan, Xinchen and Huang, Thomas E and Lee, Honglak},
booktitle={Advances in Neural Information Processing Systems},
pages={2713--2723},
year={2018}
}
We would like to thank the amazing developers and the open-source community. Our implementation has especially benefited from the following excellent repositories:
- PyTorch CycleGAN and Pix2Pix: https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
- PyTorch Pix2PixHD: https://github.com/NVIDIA/pix2pixHD
- Torch ContextEncoder: https://github.com/pathak22/context-encoder