[NEW!] GauGAN models on COCO-Stuff are released! We now support evolution search, which is much faster than the previous search! Please refer to our lite pipeline!
[NEW!] New features: search resuming and a more user-friendly progress bar during training and searching! The code documentation is updated! The lite version of Cityscapes for pix2pix is released!
[NEW!] The lite version of GauGAN is released, which produces results comparable to the full GauGAN pipeline!
[NEW!] The lite pipeline (GAN Compression Lite) is updated; it produces results comparable to the full pipeline with a much simpler procedure! The lite version of map2sat is released!
[NEW!] The lite pipeline (GAN Compression Lite) is released! Check the tutorial for the pipeline.
[NEW!] GauGAN training code and tutorial are released! Check the tutorial to compress GauGAN.
We introduce GAN Compression, a general-purpose method for compressing conditional GANs. Our method reduces the computation of widely-used conditional GAN models, including pix2pix, CycleGAN, and GauGAN, by 9-21x while preserving the visual fidelity. Our method is effective for a wide range of generator architectures, learning objectives, and both paired and unpaired settings.
GAN Compression: Efficient Architectures for Interactive Conditional GANs
Muyang Li, Ji Lin, Yaoyao Ding, Zhijian Liu, Jun-Yan Zhu, and Song Han
MIT, Adobe Research, SJTU
In CVPR 2020.
GAN Compression framework: ① Given a pre-trained teacher generator G', we distill a smaller “once-for-all” student generator G that contains all possible channel numbers through weight sharing. We choose different channel numbers for the student generator G at each training step. ② We then extract many sub-generators from the “once-for-all” generator and evaluate their performance. No retraining is needed, which is the advantage of the “once-for-all” generator. ③ Finally, we choose the best sub-generator given the compression ratio target and performance target (FID or mIoU), perform fine-tuning, and obtain the final compressed model.
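The selection logic in steps ① and ③ can be illustrated with a small sketch. This is not the repository's implementation: the channel choices, the candidate list, and the FID/MACs numbers below are made-up placeholders, and `sample_config` and `select_best` are hypothetical helper names. It only shows the idea of sampling one channel width per layer at each training step, then picking the best evaluated sub-generator that fits a computation budget.

```python
import random

# Hypothetical search space: candidate channel widths for each of four layers.
CHANNEL_CHOICES = [[16, 24, 32], [16, 24, 32], [32, 48, 64], [32, 48, 64]]


def sample_config(rng):
    """Step 1: pick one channel width per layer for the current training step."""
    return [rng.choice(options) for options in CHANNEL_CHOICES]


def select_best(candidates, macs_budget):
    """Step 3: keep candidates within the MACs budget, return the lowest FID."""
    feasible = [c for c in candidates if c["macs"] <= macs_budget]
    if not feasible:
        raise ValueError("no sub-generator fits the budget")
    return min(feasible, key=lambda c: c["fid"])  # lower FID is better


rng = random.Random(0)
config = sample_config(rng)  # one sampled student configuration

# Step 2 would evaluate each extracted sub-generator without retraining.
# The MACs (in G) and FID values here are invented for illustration.
evaluated = [
    {"channels": [16, 16, 32, 32], "macs": 2.1, "fid": 70.0},
    {"channels": [24, 24, 48, 48], "macs": 2.6, "fid": 66.0},
    {"channels": [32, 32, 64, 64], "macs": 4.0, "fid": 64.0},
]
best = select_best(evaluated, macs_budget=3.0)
print(best["channels"])  # [24, 24, 48, 48]
```

The largest candidate has the best FID but exceeds the 3.0G budget, so the middle configuration is chosen; it would then be fine-tuned to obtain the final compressed model.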
GAN Compression reduces the computation of pix2pix, CycleGAN, and GauGAN by 9-21x, and model size by 4.6-33x.
PyTorch Colab notebook: CycleGAN and pix2pix.
- Linux
- Python 3
- CPU or NVIDIA GPU + CUDA CuDNN
- Clone this repo:
  git clone git@github.com:mit-han-lab/gan-compression.git
  cd gan-compression
- Install PyTorch 1.4 and other dependencies (e.g., torchvision).
  - For pip users, run:
    pip install -r requirements.txt
  - For Conda users, we provide an installation script:
    scripts/conda_deps.sh
    Alternatively, you can create a new Conda environment with:
    conda env create -f environment.yml
- Download the CycleGAN dataset (e.g., horse2zebra).
  bash datasets/download_cyclegan_dataset.sh horse2zebra
- Get the statistical information of the ground-truth images of your dataset to compute FID. We provide precomputed real statistics for several datasets. For example,
  bash datasets/download_real_stat.sh horse2zebra A
  bash datasets/download_real_stat.sh horse2zebra B
- Download the pre-trained models.
  python scripts/download_model.py --model cycle_gan --task horse2zebra --stage full
  python scripts/download_model.py --model cycle_gan --task horse2zebra --stage compressed
- Test the original full model.
  bash scripts/cycle_gan/horse2zebra/test_full.sh
- Test the compressed model.
  bash scripts/cycle_gan/horse2zebra/test_compressed.sh
- Measure the latency of the two models.
  bash scripts/cycle_gan/horse2zebra/latency_full.sh
  bash scripts/cycle_gan/horse2zebra/latency_compressed.sh
- There may be slight differences between the results of the models above and those in the paper, since we retrained the models. We also release the compressed models from the paper. If you see such inconsistencies, you can try the following commands to test our paper models:
  python scripts/download_model.py --model cycle_gan --task horse2zebra --stage legacy
  bash scripts/cycle_gan/horse2zebra/test_legacy.sh
  bash scripts/cycle_gan/horse2zebra/latency_legacy.sh
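For readers who want to time their own models, the following is a minimal sketch of a latency measurement. It is not what the latency scripts above do internally; `measure_latency` and `dummy_generator` are hypothetical names, and the workload is a stand-in for a generator forward pass. It shows the usual pattern of warm-up runs followed by several timed runs, reporting the median.

```python
import statistics
import time


def measure_latency(fn, warmup=3, runs=10):
    """Return the median wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def dummy_generator():
    # Stand-in workload; replace with a real forward pass.
    return sum(i * i for i in range(10000))


latency_ms = measure_latency(dummy_generator)
print(f"median latency: {latency_ms:.3f} ms")
```

On a GPU, kernels launch asynchronously, so a real measurement also needs a synchronization point (in PyTorch, `torch.cuda.synchronize()`) before each clock read.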
- Download the pix2pix dataset (e.g., edges2shoes).
  bash datasets/download_pix2pix_dataset.sh edges2shoes-r
- Get the statistical information of the ground-truth images of your dataset to compute FID. We provide precomputed real statistics for several datasets. For example,
  bash datasets/download_real_stat.sh edges2shoes-r B
- Download the pre-trained models.
  python scripts/download_model.py --model pix2pix --task edges2shoes-r --stage full
  python scripts/download_model.py --model pix2pix --task edges2shoes-r --stage compressed
- Test the original full model.
  bash scripts/pix2pix/edges2shoes-r/test_full.sh
- Test the compressed model.
  bash scripts/pix2pix/edges2shoes-r/test_compressed.sh
- Measure the latency of the two models.
  bash scripts/pix2pix/edges2shoes-r/latency_full.sh
  bash scripts/pix2pix/edges2shoes-r/latency_compressed.sh
- There may be slight differences between the results of the models above and those in the paper, since we retrained the models. We also release the compressed models from the paper. If you see such inconsistencies, you can try the following commands to test our paper models:
  python scripts/download_model.py --model pix2pix --task edges2shoes-r --stage legacy
  bash scripts/pix2pix/edges2shoes-r/test_legacy.sh
  bash scripts/pix2pix/edges2shoes-r/latency_legacy.sh
- Prepare the Cityscapes dataset. See the Cityscapes preparation instructions below.
- Get the statistical information of the ground-truth images of your dataset to compute FID. We provide precomputed real statistics for several datasets. For example,
  bash datasets/download_real_stat.sh cityscapes A
- Download the pre-trained models.
  python scripts/download_model.py --model gaugan --task cityscapes --stage full
  python scripts/download_model.py --model gaugan --task cityscapes --stage compressed
- Test the original full model.
  bash scripts/gaugan/cityscapes/test_full.sh
- Test the compressed model.
  bash scripts/gaugan/cityscapes/test_compressed.sh
- Measure the latency of the two models.
  bash scripts/gaugan/cityscapes/latency_full.sh
  bash scripts/gaugan/cityscapes/latency_compressed.sh
- There may be slight differences between the results of the models above and those in the paper, since we retrained the models. We also release the compressed models from the paper. If you see such inconsistencies, you can try the following commands to test our paper models:
  python scripts/download_model.py --model gaugan --task cityscapes --stage legacy
  bash scripts/gaugan/cityscapes/test_legacy.sh
  bash scripts/gaugan/cityscapes/latency_legacy.sh
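The three walkthroughs above share one command pattern, which the sketch below makes explicit. `quickstart_commands` is a hypothetical helper, not part of the repository; it only builds the command strings shown above and does not run them. Dataset download is left out because it differs per model.

```python
def quickstart_commands(model, task, directions, stages=("full", "compressed")):
    """Build the stat-download, model-download, test, and latency commands."""
    cmds = [f"bash datasets/download_real_stat.sh {task} {d}" for d in directions]
    cmds += [
        f"python scripts/download_model.py --model {model} --task {task} --stage {s}"
        for s in stages
    ]
    cmds += [f"bash scripts/{model}/{task}/test_{s}.sh" for s in stages]
    cmds += [f"bash scripts/{model}/{task}/latency_{s}.sh" for s in stages]
    return cmds


# Reproduces the CycleGAN horse2zebra commands from the walkthrough.
commands = quickstart_commands("cycle_gan", "horse2zebra", directions=("A", "B"))
for command in commands:
    print(command)
```

Passing `stages=("legacy",)` yields the commands for the paper models instead.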
For the Cityscapes dataset, we cannot provide the data due to licensing restrictions. Please download the dataset from https://cityscapes-dataset.com and use the script prepare_cityscapes_dataset.py to preprocess it. You need to download gtFine_trainvaltest.zip and leftImg8bit_trainvaltest.zip and unzip them in the same folder. For example, you may put gtFine and leftImg8bit in database/cityscapes-origin. Then prepare the dataset with the following commands:
python datasets/get_trainIds.py database/cityscapes-origin/gtFine/
python datasets/prepare_cityscapes_dataset.py \
--gtFine_dir database/cityscapes-origin/gtFine \
--leftImg8bit_dir database/cityscapes-origin/leftImg8bit \
--output_dir database/cityscapes \
--table_path datasets/table.txt
You will get a preprocessed dataset in database/cityscapes and a mapping table (used to compute mIoU) in datasets/table.txt.
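Before running the two preparation commands, it can help to confirm that the unzipped folders are where the commands expect them. The check below is a small sketch under the example layout given above (gtFine and leftImg8bit inside database/cityscapes-origin); `missing_cityscapes_dirs` is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path

# Folder names taken from the example layout described in the text.
REQUIRED_SUBDIRS = ("gtFine", "leftImg8bit")


def missing_cityscapes_dirs(root):
    """Return the required sub-folders that are absent under root."""
    root = Path(root)
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]


missing = missing_cityscapes_dirs("database/cityscapes-origin")
if missing:
    print("missing folders:", ", ".join(missing))
else:
    print("layout looks complete")
```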
To support mIoU computation, you need to download a pre-trained DRN model drn-d-105_ms_cityscapes.pth from http://go.yf.io/drn-cityscapes-models. By default, we put the DRN model in the root directory of the repo. Then you can test our compressed models on Cityscapes after you have downloaded our models.
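As background on the metric itself, mIoU (mean intersection over union) averages the per-class overlap between predicted and ground-truth label maps. The sketch below computes it from a confusion matrix in plain Python. It illustrates the standard definition only and is not the repository's evaluation code; the 2-class confusion matrix is invented.

```python
def mean_iou(confusion):
    """confusion[i][j] = number of pixels of true class i predicted as class j."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # true class c, predicted as something else
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, wrongly
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    if not ious:
        raise ValueError("confusion matrix contains no pixels")
    return sum(ious) / len(ious)


# Invented 2-class example: IoU is 50/65 for class 0 and 35/50 for class 1.
confusion = [
    [50, 10],
    [5, 35],
]
print(f"mIoU = {100 * mean_iou(confusion):.2f}%")
```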
We follow the same COCO-Stuff dataset preparation as NVlabs/spade. Specifically, you need to download train2017.zip, val2017.zip, stuffthingmaps_trainval2017.zip, and annotations_trainval2017.zip from nightrome/cocostuff. The images, labels, and instance maps should be arranged in the same directory structure as in datasets/coco_stuff. In particular, we used an instance map that combines the boundaries of both the "things instance map" and the "stuff label map". To do this, we used a simple script, datasets/coco_generate_instance_map.py.
To support mIoU computation, you need to download a pre-trained DeeplabV2 model deeplabv2_resnet101_msc-cocostuff164k-100000.pth and also put it in the root directory of the repo.
Here we show the performance of all our released models:
Model | Dataset | Method | #Parameters | MACs | FID | mIoU
---|---|---|---|---|---|---
CycleGAN | horse→zebra | Original | 11.4M | 56.8G | 65.75 | --
CycleGAN | horse→zebra | Full Pipeline (Paper) | 0.342M | 2.67G | 65.33 | --
CycleGAN | horse→zebra | Full Pipeline (Retrained) | 0.357M | 2.55G | 65.12 | --
CycleGAN | horse→zebra | Lite Pipeline | 0.355M | 2.64G | 65.19 | --
Pix2pix | edges→shoes | Original | 11.4M | 56.8G | 24.12 | --
Pix2pix | edges→shoes | Full Pipeline (Paper) | 0.700M | 4.81G | 26.60 | --
Pix2pix | edges→shoes | Full Pipeline (Retrained) | 0.822M | 4.99G | 26.70 | --
Pix2pix | edges→shoes | Lite Pipeline | 0.756M | 4.61G | 25.26 | --
Pix2pix | Cityscapes | Original | 11.4M | 56.8G | 61.50 | 42.06
Pix2pix | Cityscapes | Full Pipeline (Paper) | 0.707M | 5.66G | 72.24 | 40.77
Pix2pix | Cityscapes | Full Pipeline (Retrained) | 0.781M | 5.59G | 73.45 | 38.63
Pix2pix | Cityscapes | Lite Pipeline | 0.867M | 5.61G | 65.23 | 39.09
Pix2pix | map→aerial photo | Original | 11.4M | 56.8G | 47.91 | --
Pix2pix | map→aerial photo | Full Pipeline | 0.746M | 4.68G | 48.02 | --
Pix2pix | map→aerial photo | Lite Pipeline | 0.708M | 4.53G | 48.38 | --
GauGAN | Cityscapes | Original | 93.0M | 281G | 57.60 | 61.04
GauGAN | Cityscapes | Full Pipeline (Paper) | 20.4M | 31.7G | 55.19 | 61.22
GauGAN | Cityscapes | Full Pipeline (Retrained) | 21.0M | 31.2G | 56.43 | 60.29
GauGAN | Cityscapes | Lite Pipeline | 20.2M | 31.3G | 56.25 | 61.17
GauGAN | COCO-Stuff | Original | 97.5M | 191G | 21.38 | 38.78
GauGAN | COCO-Stuff | Lite Pipeline | 26.0M | 35.5G | 25.06 | 35.05
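The headline ranges (9-21x less computation, 4.6-33x smaller models) can be checked against the table with simple arithmetic. The sketch below uses the Original and Full Pipeline (Paper) rows for CycleGAN horse→zebra and GauGAN Cityscapes; `reduction` is a hypothetical helper name.

```python
def reduction(original, compressed):
    """Return how many times smaller the compressed value is."""
    return original / compressed


# (parameters in M, MACs in G), copied from the table above.
ROWS = {
    "CycleGAN horse->zebra": ((11.4, 56.8), (0.342, 2.67)),
    "GauGAN Cityscapes": ((93.0, 281.0), (20.4, 31.7)),
}

for name, ((params_full, macs_full), (params_small, macs_small)) in ROWS.items():
    print(
        f"{name}: {reduction(macs_full, macs_small):.1f}x fewer MACs, "
        f"{reduction(params_full, params_small):.1f}x fewer parameters"
    )
# CycleGAN horse->zebra: 21.3x fewer MACs, 33.3x fewer parameters
# GauGAN Cityscapes: 8.9x fewer MACs, 4.6x fewer parameters
```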
Please refer to lite_pipeline.md and full_pipeline.md for how to train models on our datasets and your own.
To compute the FID score, you need to get some statistical information from the ground-truth images of your dataset. We provide a script get_real_stat.py to extract statistical information. For example, for the edges2shoes dataset, you could run the following command:
python get_real_stat.py \
--dataroot database/edges2shoes-r \
--output_path real_stat/edges2shoes-r_B.npz \
--direction AtoB
For paired image-to-image translation (pix2pix and GauGAN), we calculate the FID between generated test images and real test images. For unpaired image-to-image translation (CycleGAN), we calculate the FID between generated test images and real training+test images. This allows us to use more images for a stable FID evaluation, as done in previous unconditional GAN research. The difference between the two protocols is small: the FID of our compressed CycleGAN model increases by 4 when using real test images instead of real training+test images.
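To make the role of the real statistics concrete: FID fits a Gaussian to the features of each image set and measures the Fréchet distance between the two Gaussians, ||μ_r − μ_g||² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)). We assume here, as in pytorch-fid, that the stored statistics are the mean and covariance of Inception features; the exact contents of the .npz file are not described above. The sketch below works through the one-dimensional special case with invented scalar "features", where the formula reduces to (μ_r − μ_g)² + σ_r² + σ_g² − 2σ_rσ_g.

```python
import math
import statistics


def gaussian_stats(values):
    """Return (mean, variance) of a list of scalar features."""
    return statistics.mean(values), statistics.pvariance(values)


def fid_1d(stats_real, stats_fake):
    """Frechet distance between two 1-D Gaussians given (mean, variance)."""
    mu_r, var_r = stats_real
    mu_g, var_g = stats_fake
    return (mu_r - mu_g) ** 2 + var_r + var_g - 2.0 * math.sqrt(var_r * var_g)


# Invented scalar features; real FID uses high-dimensional Inception features.
real = [0.0, 1.0, 2.0, 3.0]
fake = [1.0, 2.0, 3.0, 4.0]

real_stats = gaussian_stats(real)  # computed once and reused, like the saved statistics
score = fid_1d(real_stats, gaussian_stats(fake))
print(score)
```

The two sets have the same spread and means that differ by 1, so the distance comes entirely from the mean term. Computing the real statistics once and reusing them is what makes repeated evaluation during search cheap.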
To help users better understand and use our code, we briefly overview the functionality and implementation of each package and each module.
If you use this code for your research, please cite our paper.
@inproceedings{li2020gan,
title={GAN Compression: Efficient Architectures for Interactive Conditional GANs},
author={Li, Muyang and Lin, Ji and Ding, Yaoyao and Liu, Zhijian and Zhu, Jun-Yan and Han, Song},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2020}
}
Our code is developed based on pytorch-CycleGAN-and-pix2pix and SPADE.
We also thank pytorch-fid for FID computation, drn for cityscapes mIoU computation and deeplabv2 for Coco-Stuff mIoU computation.