PyTorch implementations of GAN architectures including CycleGAN, WGAN-GP, and BiGAN, as well as a simple MLP GAN and a non-saturating GAN.
Simple GAN for 1D dataset:
We'll train our generator $G$ and discriminator $D$ via the original minimax GAN objective:

$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$$

We use an MLP for both the generator and the discriminator, and train until the generated distribution resembles the target distribution.
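For concreteness, here is a minimal sketch of one training step under the minimax objective. The network widths, latent dimension, and learning rates are illustrative assumptions, not the exact configuration used in this repo.

```python
import torch
import torch.nn as nn

# Illustrative sketch: widths, latent dimension, and learning rates are assumptions.
latent_dim = 1
G = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 1))
D = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))  # outputs a logit

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x_real):
    batch = x_real.shape[0]

    # Discriminator ascends log D(x) + log(1 - D(G(z))).
    x_fake = G(torch.randn(batch, latent_dim)).detach()
    d_loss = bce(D(x_real), torch.ones(batch, 1)) + bce(D(x_fake), torch.zeros(batch, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator descends log(1 - D(G(z))) -- the saturating minimax form.
    fake_logits = D(G(torch.randn(batch, latent_dim)))
    g_loss = -bce(fake_logits, torch.zeros(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```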
Simple GAN with non-saturating GAN objective for 1D dataset:
Here, we'll use the non-saturating formulation of the GAN objective. Now, we have two separate losses:

$$L_D = -\mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] - \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$$

$$L_G = -\mathbb{E}_{z \sim p(z)}[\log D(G(z))]$$
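A sketch of the two losses, assuming a discriminator `D` that outputs logits; only the generator loss changes relative to the minimax version.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def d_loss_fn(D, x_real, x_fake):
    # L_D = -E[log D(x)] - E[log(1 - D(G(z)))]  (same as the minimax version)
    real_logits, fake_logits = D(x_real), D(x_fake)
    return bce(real_logits, torch.ones_like(real_logits)) + \
           bce(fake_logits, torch.zeros_like(fake_logits))

def g_loss_fn(D, x_fake):
    # L_G = -E[log D(G(z))]: label fakes as "real" so early-training gradients don't vanish.
    fake_logits = D(x_fake)
    return bce(fake_logits, torch.ones_like(fake_logits))
```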
WGAN-GP for CIFAR-10:
We use the CIFAR-10 architecture from the SN-GAN paper. Instead of upsampling via transposed convolutions and downsampling via pooling or striding, we use the DepthToSpace and SpaceToDepth methods, described in the repo, for changing the spatial configuration of our hidden states.
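The repo defines these modules; below is a sketch of how DepthToSpace/SpaceToDepth are commonly implemented (the rearrangement is equivalent to `torch.nn.PixelShuffle` / `PixelUnshuffle`), which may differ in detail from the repo's version.

```python
import torch
import torch.nn as nn

class DepthToSpace(nn.Module):
    """(B, C*r^2, H, W) -> (B, C, H*r, W*r): move channel blocks into space (upsampling)."""
    def __init__(self, block_size):
        super().__init__()
        self.r = block_size

    def forward(self, x):
        b, c, h, w = x.shape
        r = self.r
        x = x.view(b, c // (r * r), r, r, h, w)
        x = x.permute(0, 1, 4, 2, 5, 3).contiguous()
        return x.view(b, c // (r * r), h * r, w * r)

class SpaceToDepth(nn.Module):
    """(B, C, H*r, W*r) -> (B, C*r^2, H, W): move spatial blocks into channels (downsampling)."""
    def __init__(self, block_size):
        super().__init__()
        self.r = block_size

    def forward(self, x):
        b, c, h, w = x.shape
        r = self.r
        x = x.view(b, c, h // r, r, w // r, r)
        x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
        return x.view(b, c * r * r, h // r, w // r)
```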
We implement WGAN-GP, which regularizes the discriminator (critic) with a gradient penalty:

$$L_D = \mathbb{E}_{\tilde{x} \sim p_G}[D(\tilde{x})] - \mathbb{E}_{x \sim p_{\text{data}}}[D(x)] + \lambda \, \mathbb{E}_{\hat{x}}\big[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1)^2\big]$$

where $\hat{x}$ is sampled uniformly along lines between real and generated samples. We train with the Adam optimizer, a batch size of 256, and n_filters=128 within the ResBlocks, for approximately 25000 gradient steps, with the learning rate linearly annealed to 0 over training.
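A sketch of the gradient penalty, assuming a critic `D` that maps image batches to scalars; `lambda_gp=10` follows the WGAN-GP paper but is otherwise an assumed default here.

```python
import torch

def gradient_penalty(D, x_real, x_fake, lambda_gp=10.0):
    """lambda * E[(||grad_xhat D(xhat)||_2 - 1)^2], with xhat sampled uniformly
    on straight lines between real and generated samples."""
    eps = torch.rand(x_real.shape[0], 1, 1, 1, device=x_real.device)  # per-sample mixing weight
    x_hat = (eps * x_real + (1 - eps) * x_fake).requires_grad_(True)
    d_hat = D(x_hat)
    grads = torch.autograd.grad(outputs=d_hat, inputs=x_hat,
                                grad_outputs=torch.ones_like(d_hat),
                                create_graph=True)[0]
    grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1) ** 2).mean()

# Critic loss per step (sketch): D(x_fake).mean() - D(x_real).mean() + gradient_penalty(...)
```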
BiGAN on MNIST for representation learning:
In BiGAN, in addition to training a generator $G$ and a discriminator $D$, we train an encoder $E$ that maps from real images $x$ to latent codes $z$. The discriminator now must learn to jointly identify fake $z$, fake $x$, and paired $(x, z)$ that don't belong together. In the original BiGAN paper, they prove that the optimal $E$ learns to invert the generative mapping $G$. Our overall minimax term is now

$$\min_{G, E} \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x, E(x))] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z), z))]$$
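A sketch of the BiGAN losses, assuming a joint discriminator `D(x, z)` over image/latent pairs and deterministic `G` and `E`; training `G` and `E` with flipped labels (the non-saturating variant) is an assumption here, not necessarily what the repo does.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def bigan_losses(D, G, E, x_real, z_prior):
    """D scores (x, E(x)) as real and (G(z), z) as fake; G and E try to fool it."""
    z_enc = E(x_real)
    x_gen = G(z_prior)

    real_logits = D(x_real, z_enc.detach())
    fake_logits = D(x_gen.detach(), z_prior)
    d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
             bce(fake_logits, torch.zeros_like(fake_logits))

    # Generator/encoder loss: same pairs, flipped labels (assumed non-saturating form).
    real_logits = D(x_real, z_enc)
    fake_logits = D(x_gen, z_prior)
    ge_loss = bce(real_logits, torch.zeros_like(real_logits)) + \
              bce(fake_logits, torch.ones_like(fake_logits))
    return d_loss, ge_loss
```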
Architecture:
We will closely follow the MNIST architecture outlined in the original BiGAN paper, Appendix C.1, with one minor modification.
Hyperparameters:
We make several modifications to what is listed in the BiGAN paper: we apply weight decay to all weights and decay the learning rate linearly to 0 over the course of training. Weights are initialized in the default PyTorch manner.
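A sketch of how weight decay and the linear learning-rate decay can be wired up in PyTorch; the learning rate, weight-decay coefficient, and step count below are placeholders, not the repo's actual values.

```python
import torch
import torch.nn as nn

model = nn.Linear(50, 50)   # placeholder for the G/E/D parameters
total_steps = 100_000       # placeholder; "decay to 0 over the course of training"
opt = torch.optim.Adam(model.parameters(), lr=2e-4, weight_decay=2.5e-5)
# Scale the base learning rate by a factor that falls linearly from 1 to 0.
sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda=lambda step: 1.0 - step / total_steps)

for step in range(total_steps):
    # ... forward pass, loss.backward(), opt.step(), opt.zero_grad() ...
    sched.step()
```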
Testing the representation:
We want to see how good a linear classifier we can learn such that $y \approx \mathrm{softmax}(W\,E(x))$, where $y$ is the appropriate label. We fix $E$ and learn a weight matrix $W$, so the linear classifier is composed of passing $x$ through $E$, then multiplying by $W$, then applying a softmax nonlinearity. This is trained via gradient descent with the cross-entropy loss.
As a baseline, we randomly initialize another network with the same architecture, fix its weights, and train a linear classifier on top in the same way.
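A sketch of the linear probe: freeze the encoder and learn only the weight matrix $W$ with cross-entropy. The stand-in encoder, latent size of 50, and optimizer settings are assumptions.

```python
import torch
import torch.nn as nn

latent_dim, n_classes = 50, 10   # assumed latent size; 10 MNIST classes
encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, latent_dim))  # stand-in for the trained E
for p in encoder.parameters():
    p.requires_grad_(False)      # E is fixed; only W is learned

W = nn.Linear(latent_dim, n_classes, bias=False)  # the weight matrix W
opt = torch.optim.Adam(W.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()       # applies log-softmax + NLL internally

def probe_step(x, y):
    logits = W(encoder(x))       # softmax(W E(x)) gives the class distribution
    loss = ce(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```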
CycleGAN:
In CycleGAN, the goal is to learn mappings $G: X \rightarrow Y$ and $F: Y \rightarrow X$ that can transform images from $X$ to $Y$ and vice-versa. This is an under-constrained problem, so we additionally enforce the cycle-consistency property, where we want $F(G(x)) \approx x$ and $G(F(y)) \approx y$. This loss encourages $G$ and $F$ to approximately invert each other. In addition to this cycle-consistency loss, we also have a standard GAN loss so that $G(x)$ and $F(y)$ look like real images from the other domain.
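A sketch of the cycle-consistency term, assuming generators `G: X -> Y` and `F: Y -> X`; the L1 reconstruction and weight of 10 follow the original CycleGAN paper, though the repo's exact settings may differ.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G, F, x, y, lambda_cyc=10.0):
    """Encourages F(G(x)) ~ x and G(F(y)) ~ y, so G and F approximately invert each other."""
    return lambda_cyc * (l1(F(G(x)), x) + l1(G(F(y)), y))

# Full generator objective (sketch): GAN loss for G(x) against real Y images,
# GAN loss for F(y) against real X images, plus cycle_consistency_loss(G, F, x, y).
```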
Dataset | Samples |
---|---|
1D Dataset | ![]() |
CIFAR-10 | ![]() |
MNIST | ![]() |
Colorized MNIST | ![]() |
Model | Dataset | First epoch | Last epoch |
---|---|---|---|
Simple GAN | 1D Dataset | ![]() | ![]() |
Non-saturating GAN | 1D Dataset | ![]() | ![]() |
Model | Dataset | Generated samples |
---|---|---|
WGAN-GP | CIFAR-10 | ![]() |
WGAN-GP achieves an Inception Score of 7.28 out of 10, while the real CIFAR-10 images score 9.97 out of 10.
Model | Dataset | Generated samples | Reconstructions |
---|---|---|---|
BiGAN | MNIST | ![]() | ![]() |
Model | Dataset | Generated samples | Reconstructions |
---|---|---|---|
CycleGAN | MNIST and Colorized MNIST | ![]() | ![]() |
For CycleGAN: on the left, a set of images showing real MNIST digits, their translations into Colorized MNIST digits, and reconstructions back into the greyscale domain; on the right, a set of images showing real Colorized MNIST digits, their translations, and their reconstructions.