stylegan3-encoder

Introduction

An encoder implementation for the image inversion task of the StyleGAN3 generator (Alias-Free GAN).
The network architecture and hyper-parameter settings of the base configuration are almost the same as those of pixel2style2pixel; improved encoder architectures will be added in the future.
For fast training, PyTorch DistributedDataParallel is used.

Installation

GPU and NVIDIA driver info

GeForce RTX 3090 x 8
NVIDIA driver version: 460.91.03

Docker build

$ sh build_img.sh
$ sh build_container.sh [container-name]

Install package

$ docker start [container-name]
$ docker attach [container-name]
$ pip install -v -e .

Pretrained weights

encoder pretrained, base configuration
stylegan3, vgg, inception
dlib landmarks detector
IR-SE50

Prepare dataset

ffhq
ffhqs - 1000 images sampled from FFHQ, for test
celeba-hq
celeba-hq-samples

Train

python train.py \
    --outdir exp/[exp_name] \
    --encoder [encoder_type] \
    --data data/[dataset_name] \
    --gpus [num_gpus] \
    --batch [total_batch_size] \
    --generator [generator_pkl]

Test

python test.py \
    --testdir exp/[train_exp]/[train_exp_subdir] \
    --data data/[dataset_name] \
    --gpus [num_gpus] \
    --batch [total_batch_size]

Experiments

Base configuration

Train options

{
  "model_architecture": "base",
  "dataset_dir": "data/ffhq",
  "num_gpus": 8,
  "batch_size": 32,
  "batch_gpu": 4,
  "generator_pkl": "pretrained/stylegan3-t-ffhq-1024x1024.pkl",
  "val_dataset_dir": null,
  "training_steps": 100001,
  "val_steps": 10000,
  "print_steps": 50,
  "tensorboard_steps": 50,
  "image_snapshot_steps": 100,
  "network_snapshot_steps": 5000,
  "learning_rate": 0.001,
  "l2_lambda": 1.0,
  "lpips_lambda": 0.8,
  "id_lambda": 0.1,
  "reg_lambda": 0.0,
  "gan_lambda": 0.0,
  "edit_lambda": 0.0,
  "random_seed": 0,
  "num_workers": 3,
  "resume_pkl": null,
  "run_dir": "exp/base/00000-base-ffhq-gpus8-batch32"
}
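
As a rough illustration of how these options interact, the sketch below makes two assumptions. First, it assumes the StyleGAN3 training-loop convention: the total batch is split evenly across GPUs, and each GPU processes `batch_gpu` samples per pass, accumulating gradients when needed. Second, it assumes the three non-zero lambdas combine linearly into a single training loss, as in pSp. The function names are hypothetical and are not taken from this repository.

```python
def accumulation_rounds(batch_size: int, num_gpus: int, batch_gpu: int) -> int:
    """Forward/backward passes per GPU per optimizer step (assumed convention)."""
    if batch_size % (num_gpus * batch_gpu) != 0:
        raise ValueError("batch_size must be a multiple of num_gpus * batch_gpu")
    return batch_size // (num_gpus * batch_gpu)


def total_loss(l2, lpips, id_loss, l2_lambda=1.0, lpips_lambda=0.8, id_lambda=0.1):
    """Weighted sum of the loss terms whose lambdas are non-zero in the base config."""
    return l2_lambda * l2 + lpips_lambda * lpips + id_lambda * id_loss


# Values from the base configuration above.
rounds = accumulation_rounds(batch_size=32, num_gpus=8, batch_gpu=4)
print(rounds)  # 1: each GPU sees 4 images per step, so no accumulation is needed
```

Under these assumptions, raising `batch_size` to 64 while keeping `num_gpus` and `batch_gpu` fixed would give two accumulation rounds per step.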

Learning Curve

Trainset examples
Real image batch X | Encoded image batch G.synthesis(E(X))

Testset examples (celeba-hq)
Target image | Encoded image | Encoded image, transform x=0.2, y=0 | Encoded image, transform x=0.2, y=0.1 | Encoded image, transform x=-0.2, y=0.1 | Encoded image, transform x=-0.2, y=-0.1
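
The x and y values in the column labels are translations applied to the generator's input. Below is a minimal sketch of how such a transform could be built. It assumes this repository follows the official StyleGAN3 `gen_images.py` script, which builds a 3x3 homogeneous matrix and copies its inverse into `G.synthesis.input.transform`. Whether the test script here does exactly this is an assumption, and both helper names are hypothetical.

```python
def make_translation(x: float, y: float):
    """3x3 homogeneous matrix for a translation by (x, y), with no rotation."""
    return [[1.0, 0.0, x],
            [0.0, 1.0, y],
            [0.0, 0.0, 1.0]]


def invert_translation(m):
    """Inverse of a pure translation matrix: translate by the negated offsets."""
    return [[1.0, 0.0, -m[0][2]],
            [0.0, 1.0, -m[1][2]],
            [0.0, 0.0, 1.0]]


# The four transforms listed in the test-set column labels.
for x, y in [(0.2, 0.0), (0.2, 0.1), (-0.2, 0.1), (-0.2, -0.1)]:
    m_inv = invert_translation(make_translation(x, y))
    # With a loaded generator one would then (assumption) do something like:
    #   G.synthesis.input.transform.copy_(torch.tensor(m_inv))
    print(m_inv[0][2], m_inv[1][2])
```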

TODO

Refactoring configuration system
Implement resume checkpoint
Apply Transformer encoder instead of convs in GradualStyleBlock (Config-a) -> discarded, the CNN GradualStyleBlock performs better than the Transformer
Train delta w from the average latent w_avg (G.mapping.w_avg) -> training Config-b now, without regularization loss, same as the pSp paper
Implement scripts for test dataset
Add L2 delta-regularization loss and GAN loss (latent discriminator), as in e4e
GPU memory optimization in training loop
Colab demo
Apply hyperstyle
Train encoder for stylegan3-r generator
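
The "delta w from w_avg" item above can be sketched as follows: the encoder predicts a per-layer offset, and the generator's average latent is added back before synthesis. This is a toy illustration on plain Python lists, not the repository's code. `latent_from_delta` is a hypothetical name, and the 3-dimensional latents stand in for the real per-layer latent vectors.

```python
def latent_from_delta(delta_w, w_avg):
    """Add the average latent to every per-layer offset predicted by the encoder."""
    return [[d + a for d, a in zip(layer, w_avg)] for layer in delta_w]


w_avg = [0.5, -0.25, 0.0]                        # toy average latent (G.mapping.w_avg)
delta_w = [[0.0, 0.0, 0.0], [0.5, 0.25, -1.0]]   # toy encoder outputs for two layers
w = latent_from_delta(delta_w, w_avg)
print(w)  # a zero offset reproduces w_avg for that layer
```

One motivation for this parameterization, as in pSp, is that an encoder output of zero already maps to the average image, so the encoder only has to learn deviations from it.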

References

stylegan3
pixel2style2pixel
e4e
