Reproduce the following GAN-related methods, 100~200 lines each:
- pix2pix (Image-to-image Translation with Conditional Adversarial Networks)
- InfoGAN (InfoGAN: Interpretable Representation Learning by Information Maximizing GAN)
- Conditional GAN
- Improved Wasserstein GAN, i.e. WGAN-GP (Improved Training of Wasserstein GANs)
- DiscoGAN (Learning to Discover Cross-Domain Relations with Generative Adversarial Networks)
- BEGAN (BEGAN: Boundary Equilibrium Generative Adversarial Networks)
- CycleGAN (Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks)
Please see the docstring in each script for detailed usage and pretrained models. Multi-GPU training is supported.
Reproduce DCGAN following the setup in dcgan.torch.
- Generated samples
- Vector arithmetic: smiling woman - neutral woman + neutral man = smiling man
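The vector arithmetic above happens in the generator's latent space. A minimal numpy sketch, assuming the DCGAN noise dimension of 100; the three input vectors here are random placeholders, whereas in practice each would be the average of several z vectors whose generated faces show the attribute:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim = 100  # DCGAN's noise dimension

# Placeholder latent codes; in practice each is an average of z vectors
# whose generated samples exhibit the named attribute.
z_smiling_woman = rng.normal(size=latent_dim)
z_neutral_woman = rng.normal(size=latent_dim)
z_neutral_man = rng.normal(size=latent_dim)

# "smiling woman" - "neutral woman" + "neutral man"
z_smiling_man = z_smiling_woman - z_neutral_woman + z_neutral_man
# Feeding z_smiling_man to the trained generator should render a smiling man.
```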
Image-to-Image translation following the setup in pix2pix.
For example, with the cityscapes dataset, it learns to generate the semantic segmentation map of an urban scene:
This is a visualization from tensorboard. Left to right: original, ground truth, model output.
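The pix2pix generator objective combines a conditional adversarial term with an L1 reconstruction term weighted by 100. A rough numpy sketch of that objective (the function name and array shapes are illustrative, not the repo's actual code):

```python
import numpy as np

def pix2pix_generator_loss(d_fake_logits, fake_img, target_img, lambda_l1=100.0):
    """Generator objective from the pix2pix paper: fool the conditional
    discriminator, plus an L1 term that keeps the output near the target."""
    # Non-saturating GAN loss: -log sigmoid(D(x, G(x))), written stably.
    gan_loss = np.mean(np.log1p(np.exp(-d_fake_logits)))
    # L1 reconstruction, weighted by lambda (100 in the paper).
    l1_loss = np.mean(np.abs(fake_img - target_img))
    return gan_loss + lambda_l1 * l1_loss
```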
Reproduce the mnist experiment in InfoGAN. It assumes 10 latent variables corresponding to a categorical distribution and 2 latent variables corresponding to a uniform distribution. It then maximizes the mutual information between these latent variables and the image, and learns an interpretable latent representation.
- Left: the 10 categorical latent variables correspond to the 10 digits.
- Middle: one continuous latent variable controls the rotation.
- Right: the other continuous latent variable controls the thickness.
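The generator input is the structured code described above concatenated with unstructured noise. A sketch of the sampling step, assuming a 62-dimensional noise vector as in the InfoGAN paper's mnist setup (the function name is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_infogan_latents(batch_size, noise_dim=62):
    """Sample the latent code for the InfoGAN mnist experiment:
    a 10-way categorical variable (digit identity), two uniform continuous
    variables (rotation / thickness), plus unstructured noise."""
    cat = np.eye(10)[rng.integers(0, 10, size=batch_size)]    # one-hot categorical
    cont = rng.uniform(-1.0, 1.0, size=(batch_size, 2))       # two uniform codes
    noise = rng.uniform(-1.0, 1.0, size=(batch_size, noise_dim))
    return np.concatenate([cat, cont, noise], axis=1)
```

Only `cat` and `cont` enter the mutual-information term; `noise` is free capacity for everything the codes do not explain.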
Train a simple GAN on mnist, conditioned on the class labels.
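The standard conditional-GAN recipe is to feed the class label to both networks, e.g. by concatenating a one-hot vector to the generator's noise input. A minimal sketch (the helper name is illustrative):

```python
import numpy as np

def conditional_input(z, labels, num_classes=10):
    """Condition the generator by concatenating a one-hot class label
    to each noise vector; the discriminator is conditioned the same way."""
    onehot = np.eye(num_classes)[labels]       # (batch, num_classes)
    return np.concatenate([z, onehot], axis=1)
```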
These variants are implemented with small modifications on top of DCGAN.py. Among them, BEGAN has the best visual quality. Some BEGAN samples:
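The modification BEGAN makes is an equilibrium term k that balances the discriminator (an autoencoder) between reconstructing real images and penalizing generated ones. A sketch of its update rule, using the paper's default diversity ratio gamma=0.5 and learning rate lambda_k=1e-3 (function name illustrative):

```python
import numpy as np

def began_k_update(k, loss_real, loss_fake, gamma=0.5, lambda_k=1e-3):
    """One step of BEGAN's equilibrium control: push the ratio
    loss_fake / loss_real toward gamma, keeping k in [0, 1]."""
    k = k + lambda_k * (gamma * loss_real - loss_fake)
    return float(np.clip(k, 0.0, 1.0))
```

The discriminator then minimizes `loss_real - k * loss_fake`, so k controls how much weight generated images get.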
Reproduce CycleGAN with the original datasets, and DiscoGAN on CelebA. The two methods are essentially the same idea with different architectures. CycleGAN horse-to-zebra in tensorboard:
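The shared idea is a cycle-consistency constraint: translating an image to the other domain and back should reproduce it. A sketch of the L1 cycle term, with the CycleGAN paper's default weight of 10 (function name illustrative):

```python
import numpy as np

def cycle_consistency_loss(x, x_reconstructed, lam=10.0):
    """L1 cycle-consistency term: F(G(x)) should reproduce x.
    The same term is applied in the other direction, G(F(y)) vs. y."""
    return lam * np.mean(np.abs(x - x_reconstructed))
```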