1 parent 0e3c3ee, commit 27c45eb
Showing 34 changed files with 1,179 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,182 @@ | ||
# pix2pix-tensorflow | ||
Tensorflow Port of Image-to-image translation using conditional adversarial nets https://phillipi.github.io/pix2pix/ | ||
|
||
Based on [pix2pix](https://phillipi.github.io/pix2pix/) by Isola et al. | ||
|
||
[Article about this implementation](https://affinelayer.com/pix2pix/) | ||
|
||
Tensorflow implementation of pix2pix. Learns a mapping from input images to output images, like these examples from the original paper: | ||
|
||
<img src="docs/examples.jpg" width="900px"/> | ||
|
||
This port is based directly on the Torch implementation, not on an existing Tensorflow implementation. It is meant to be a faithful implementation of the original work and so does not add anything. In testing, the processing speed on a GPU with cuDNN was equivalent to the Torch implementation. | ||
|
||
## Setup | ||
|
||
### Prerequisites | ||
- Tensorflow 0.12.1 | ||
|
||
### Recommended | ||
- Linux with Tensorflow GPU edition + cuDNN | ||
|
||
### Getting Started | ||
|
||
```sh | ||
# Clone this repo | ||
git clone https://github.com/affinelayer/pix2pix-tensorflow.git | ||
cd pix2pix-tensorflow | ||
# Download the CMP Facades dataset http://cmp.felk.cvut.cz/~tylecr1/facade/ | ||
python tools/download-dataset.py facades | ||
# Train the model (this may take 1-8 hours depending on GPU, on CPU you will be waiting for a bit) | ||
python pix2pix.py --mode train --output_dir facades_train --max_epochs 200 --input_dir facades/train --which_direction BtoA | ||
# Test the model | ||
python pix2pix.py --mode test --output_dir facades_test --input_dir facades/val --checkpoint facades_train | ||
``` | ||
|
||
The test run will output an HTML file at `facades_test/index.html` that shows input/output/target image sets. | ||
|
||
## Datasets | ||
|
||
The data format used by this program is the same as the original pix2pix format, which consists of images of input and desired output side by side like: | ||
|
||
<img src="docs/ab.png" width="256px"/> | ||
|
||
For example: | ||
|
||
<img src="docs/418.png" width="256px"/> | ||
|
||
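The side-by-side layout can be illustrated with a minimal sketch that splits a combined image back into its two halves. It assumes that image A occupies the left half and image B the right half, and it represents an image as a plain list of pixel rows so that it has no dependencies; the function name `split_pair` is hypothetical and is not part of `pix2pix.py`.

```python
def split_pair(image):
    """Split a side-by-side image into its left (A) and right (B) halves.

    `image` is a list of rows, and each row is a list of pixels.
    Assumption: A is the left half and B is the right half.
    """
    if not image:
        raise ValueError("image has no rows")
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("all rows must have the same width")
    if width % 2 != 0:
        raise ValueError("combined image width must be even")
    half = width // 2
    a = [row[:half] for row in image]
    b = [row[half:] for row in image]
    return a, b


# A 2x4 "image" whose pixels are labelled by the half they belong to.
combined = [
    ["a00", "a01", "b00", "b01"],
    ["a10", "a11", "b10", "b11"],
]
a_half, b_half = split_pair(combined)
```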
Some datasets have been made available by the authors of the pix2pix paper. To download those datasets, use the included script `tools/download-dataset.py`. | ||
|
||
| dataset | image | | ||
| --- | --- | | ||
| `python tools/download-dataset.py facades` <br> 400 images from [CMP Facades dataset](http://cmp.felk.cvut.cz/~tylecr1/facade/). (31MB) | <img src="docs/facades.jpg" width="256px"/> | | ||
| `python tools/download-dataset.py cityscapes` <br> 2975 images from the [Cityscapes training set](https://www.cityscapes-dataset.com/). (113MB) | <img src="docs/cityscapes.jpg" width="256px"/> | | ||
| `python tools/download-dataset.py maps` <br> 1096 training images scraped from Google Maps. (246MB) | <img src="docs/maps.jpg" width="256px"/> | | ||
| `python tools/download-dataset.py edges2shoes` <br> 50k training images from [UT Zappos50K dataset](http://vision.cs.utexas.edu/projects/finegrained/utzap50k/). Edges are computed by [HED](https://github.com/s9xie/hed) edge detector + post-processing. (2.2GB) | <img src="docs/edges2shoes.jpg" width="256px"/> | | ||
| `python tools/download-dataset.py edges2handbags` <br> 137K Amazon Handbag images from [iGAN project](https://github.com/junyanz/iGAN). Edges are computed by [HED](https://github.com/s9xie/hed) edge detector + post-processing. (8.6GB) | <img src="docs/edges2handbags.jpg" width="256px"/> | | ||
|
||
The `facades` dataset is the smallest and easiest to get started with. | ||
|
||
### Creating your own dataset | ||
|
||
#### Example: creating images with blank centers for [inpainting](https://people.eecs.berkeley.edu/~pathak/context_encoder/) | ||
|
||
<img src="docs/combine.png" width="900px"/> | ||
|
||
```sh | ||
# Resize source images | ||
python tools/process.py --input_dir photos/original --operation resize --output_dir photos/resized | ||
# Create images with blank centers | ||
python tools/process.py --input_dir photos/resized --operation blank --output_dir photos/blank | ||
# Combine resized images with blanked images | ||
python tools/process.py --input_dir photos/resized --b_dir photos/blank --operation combine --output_dir photos/combined | ||
# Split into train/val set | ||
python tools/split.py --dir photos/combined | ||
``` | ||
|
||
The folder `photos/combined` will now have `train` and `val` subfolders that you can use for training and testing. | ||
|
||
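The idea behind the train/val split can be sketched as follows. This is not the code of `tools/split.py`: the function name `split_train_val`, the 80/20 ratio, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def split_train_val(names, train_fraction=0.8, seed=0):
    """Shuffle file names reproducibly and divide them into two subsets.

    The 80/20 default and the seed are illustrative assumptions.
    """
    names = sorted(names)       # start from a stable order
    rng = random.Random(seed)   # private generator, so results are repeatable
    rng.shuffle(names)
    cut = int(len(names) * train_fraction)
    return {"train": names[:cut], "val": names[cut:]}


files = ["%03d.png" % i for i in range(10)]
splits = split_train_val(files)
```

Each file name lands in exactly one subset, so no image is used for both training and testing.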
#### Creating image pairs from existing images | ||
|
||
If you have two directories `a` and `b`, with corresponding images (same name, same dimensions, different data) you can combine them with `process.py`: | ||
|
||
```sh | ||
python tools/process.py --input_dir a --b_dir b --operation combine --output_dir c | ||
``` | ||
|
||
This puts the images in a side-by-side combined image that `pix2pix.py` expects. | ||
|
||
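Conceptually, the combine operation concatenates each row of image A with the matching row of image B. The sketch below shows this on plain lists of pixel rows; `combine_pair` is a hypothetical name, and the assumption that A goes on the left is not taken from `process.py`.

```python
def combine_pair(a, b):
    """Place image `a` to the left of image `b`, row by row.

    Both images are lists of pixel rows and must have equal dimensions.
    """
    if len(a) != len(b):
        raise ValueError("images must have the same height")
    combined = []
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            raise ValueError("images must have the same width")
        combined.append(list(row_a) + list(row_b))
    return combined


image_a = [["a00", "a01"], ["a10", "a11"]]
image_b = [["b00", "b01"], ["b10", "b11"]]
pair = combine_pair(image_a, image_b)
```

The result is twice as wide as either input, which is why the two source images need the same dimensions.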
#### Colorization | ||
|
||
For colorization, your images should ideally all have the same aspect ratio. You can resize and crop them with the resize command: | ||
```sh | ||
python tools/process.py --input_dir photos/original --operation resize --output_dir photos/resized | ||
``` | ||
|
||
No other processing is required; the colorization mode (see the Training section below) uses single images instead of image pairs. | ||
|
||
## Training | ||
|
||
### Image Pairs | ||
|
||
For normal training with image pairs, you need to specify which directory contains the training images and which direction to train on. The direction options are `AtoB` or `BtoA`: | ||
```sh | ||
python pix2pix.py --mode train --output_dir facades_train --max_epochs 200 --input_dir facades/train --which_direction BtoA | ||
``` | ||
|
||
### Colorization | ||
|
||
`pix2pix.py` includes special code to handle colorization with single images instead of pairs. Usage looks like this: | ||
|
||
```sh | ||
python pix2pix.py --mode train --output_dir photos_train --max_epochs 200 --input_dir photos/train --lab_colorization | ||
``` | ||
|
||
In this mode, image A is the black and white image (lightness only), and image B contains the color channels of that image (no lightness information). | ||
|
||
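The split into lightness and color channels can be sketched for a single pixel. This example uses the standard sRGB to CIE L\*a\*b\* formulas with a D65 white point, which is an assumption; the colorspace code in `pix2pix.py` may scale or normalize the channels differently. The function names are hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert one sRGB pixel (components in [0, 1]) to CIE L*a*b* (D65)."""
    def to_linear(c):
        # Undo the sRGB gamma curve.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)
    # Linear RGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3 * delta ** 2) + 4.0 / 29.0

    # Normalize by the D65 reference white before applying f.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)


def split_for_colorization(r, g, b):
    """Return (image A, image B): lightness only, then the two color channels."""
    lightness, a_chan, b_chan = srgb_to_lab(r, g, b)
    return (lightness,), (a_chan, b_chan)


input_a, target_b = split_for_colorization(1.0, 0.0, 0.0)  # a pure red pixel
```

Image A carries one channel and image B carries two, so the network only has to predict the color information that the lightness channel lacks.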
### Tips | ||
|
||
You can look at the loss and computation graph using tensorboard: | ||
```sh | ||
tensorboard --logdir=facades_train | ||
``` | ||
|
||
<img src="docs/tensorboard-scalar.png" width="250px"/> <img src="docs/tensorboard-image.png" width="250px"/> <img src="docs/tensorboard-graph.png" width="250px"/> | ||
|
||
If you wish to write in-progress pictures as the network is training, use `--display_freq 50`. This will update `facades_train/index.html` every 50 steps with the current training inputs and outputs. | ||
|
||
## Testing | ||
|
||
Testing is done with `--mode test`. Specify the checkpoint to use with `--checkpoint`; this should point to the `output_dir` that you created previously with `--mode train`: | ||
|
||
```sh | ||
python pix2pix.py --mode test --output_dir facades_test --input_dir facades/val --checkpoint facades_train | ||
``` | ||
|
||
The testing mode loads some of the configuration options from the checkpoint provided, so you do not need to specify `which_direction`, for instance. | ||
|
||
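One way such option restoring could work is sketched below. The file name `options.json`, the list of restored keys, and both function names are assumptions made for illustration and are not confirmed by this README.

```python
import json
import os
import tempfile

# Hypothetical: options that describe how the model was built, and that
# therefore have to match between training and testing.
RESTORED_KEYS = ("which_direction", "lab_colorization")


def save_options(output_dir, options):
    """Write the training options next to the checkpoint files."""
    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "options.json"), "w") as f:
        json.dump(options, f, sort_keys=True, indent=2)


def load_options(checkpoint_dir, cli_options):
    """Override model-defining options with the values used for training."""
    with open(os.path.join(checkpoint_dir, "options.json")) as f:
        saved = json.load(f)
    merged = dict(cli_options)
    for key in RESTORED_KEYS:
        if key in saved:
            merged[key] = saved[key]
    return merged


with tempfile.TemporaryDirectory() as train_dir:
    save_options(train_dir, {"which_direction": "BtoA",
                             "lab_colorization": False,
                             "max_epochs": 200})
    # The test run states a different direction; the saved value wins.
    test_options = load_options(train_dir, {"mode": "test",
                                            "which_direction": "AtoB"})
```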
The test run will output an HTML file at `facades_test/index.html` that shows input/output/target image sets: | ||
|
||
<img src="docs/test-html.png" width="300px"/> | ||
|
||
## Implementation Validation | ||
|
||
Validation of the code was performed on a Linux machine with a ~1.3 TFLOPS Nvidia GTX 750 Ti GPU. Due to a lack of compute power, validation is not extensive and only the `facades` dataset at 200 epochs was tested. | ||
|
||
```sh | ||
git clone https://github.com/affinelayer/pix2pix-tensorflow.git | ||
cd pix2pix-tensorflow | ||
python tools/download-dataset.py facades | ||
time nvidia-docker run --volume $PWD:/prj --workdir /prj --env PYTHONUNBUFFERED=x affinelayer/tensorflow:pix2pix python pix2pix.py --mode train --output_dir facades_train --max_epochs 200 --input_dir facades/train --which_direction BtoA | ||
nvidia-docker run --volume $PWD:/prj --workdir /prj --env PYTHONUNBUFFERED=x affinelayer/tensorflow:pix2pix python pix2pix.py --mode test --output_dir facades_test --input_dir facades/val --checkpoint facades_train | ||
``` | ||
|
||
Comparison on facades dataset: | ||
|
||
| Input | Tensorflow | Torch | Target | | ||
| --- | --- | --- | --- | | ||
| <img src="docs/1-inputs.png" width="256px"> | <img src="docs/1-tensorflow.png" width="256px"> | <img src="docs/1-torch.jpg" width="256px"> | <img src="docs/1-targets.png" width="256px"> | | ||
| <img src="docs/5-inputs.png" width="256px"> | <img src="docs/5-tensorflow.png" width="256px"> | <img src="docs/5-torch.jpg" width="256px"> | <img src="docs/5-targets.png" width="256px"> | | ||
| <img src="docs/51-inputs.png" width="256px"> | <img src="docs/51-tensorflow.png" width="256px"> | <img src="docs/51-torch.jpg" width="256px"> | <img src="docs/51-targets.png" width="256px"> | | ||
| <img src="docs/95-inputs.png" width="256px"> | <img src="docs/95-tensorflow.png" width="256px"> | <img src="docs/95-torch.jpg" width="256px"> | <img src="docs/95-targets.png" width="256px"> | | ||
|
||
## Unimplemented Features | ||
|
||
The following models have not been implemented: | ||
- defineG_encoder_decoder | ||
- defineG_unet_128 | ||
- defineD_pixelGAN | ||
|
||
## Citation | ||
If you use this code for your research, please cite the paper this code is based on: <a href="https://arxiv.org/pdf/1611.07004v1.pdf">Image-to-Image Translation with Conditional Adversarial Networks</a>: | ||
|
||
``` | ||
@article{pix2pix2016, | ||
title={Image-to-Image Translation with Conditional Adversarial Networks}, | ||
author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A}, | ||
journal={arxiv}, | ||
year={2016} | ||
} | ||
``` | ||
|
||
## Acknowledgments | ||
This is a port of [pix2pix](https://github.com/phillipi/pix2pix) from Torch to Tensorflow. It also contains colorspace conversion code ported from Torch. |