stylegan3-encoder

Introduction

An encoder implementation for the image inversion task of the StyleGAN3 generator (Alias-Free GAN).
The network architecture and hyperparameter settings of the base configuration are almost the same as those of pixel2style2pixel; improved encoder architectures will be added in the future.
For fast training, PyTorch DistributedDataParallel is used.

Installation

GPU and NVIDIA driver info

  • GeForce RTX 3090 x 8
  • NVIDIA driver version: 460.91.03

Docker build

$ sh build_img.sh
$ sh build_container.sh [container-name]

Install package

$ docker start [container-name]
$ docker attach [container-name]
$ pip install -v -e .

Pretrained weights

[Figure: tree]

Prepare dataset

[Figure: tree2]

Train

python train.py \
    --outdir exp/[exp_name] \
    --encoder [encoder_type] \
    --data data/[dataset_name] \
    --gpus [num_gpus] \
    --batch [total_batch_size] \
    --generator [generator_pkl]
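
The `--batch` flag takes the total batch size across all GPUs. The snippet below is a minimal sketch of how that total might be split, assuming the StyleGAN3-style convention in which each GPU processes `batch_gpu` samples per round and any remainder is covered by gradient accumulation. The helper name `accumulation_rounds` is hypothetical and not part of this repository.

```python
def accumulation_rounds(total_batch: int, num_gpus: int, batch_gpu: int) -> int:
    """Return how many gradient-accumulation rounds one optimizer step needs.

    Assumption: total_batch must be a multiple of num_gpus * batch_gpu.
    """
    per_round = num_gpus * batch_gpu  # samples processed by all GPUs in one round
    if total_batch % per_round != 0:
        raise ValueError(
            f"total batch {total_batch} is not a multiple of "
            f"num_gpus * batch_gpu = {per_round}"
        )
    return total_batch // per_round


# Values from the base configuration: 8 GPUs, total batch 32, 4 samples per GPU.
print(accumulation_rounds(total_batch=32, num_gpus=8, batch_gpu=4))  # 1
```

With the base configuration, every optimizer step is a single forward and backward pass per GPU, with no accumulation.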

Test

python test.py \
    --testdir exp/[train_exp]/[train_exp_subdir] \
    --data data/[dataset_name] \
    --gpus [num_gpus] \
    --batch [total_batch_size]

Experiments

Base configuration

Train options

{
  "model_architecture": "base",
  "dataset_dir": "data/ffhq",
  "num_gpus": 8,
  "batch_size": 32,
  "batch_gpu": 4,
  "generator_pkl": "pretrained/stylegan3-t-ffhq-1024x1024.pkl",
  "val_dataset_dir": null,
  "training_steps": 100001,
  "val_steps": 10000,
  "print_steps": 50,
  "tensorboard_steps": 50,
  "image_snapshot_steps": 100,
  "network_snapshot_steps": 5000,
  "learning_rate": 0.001,
  "l2_lambda": 1.0,
  "lpips_lambda": 0.8,
  "id_lambda": 0.1,
  "reg_lambda": 0.0,
  "gan_lambda": 0.0,
  "edit_lambda": 0.0,
  "random_seed": 0,
  "num_workers": 3,
  "resume_pkl": null,
  "run_dir": "exp/base/00000-base-ffhq-gpus8-batch32"
}
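
The `*_lambda` entries weight the individual loss terms. The snippet below is a minimal sketch of how such weights are commonly combined into one objective, assuming a pSp-style weighted sum. The function `total_loss` and the example loss values are hypothetical, and the actual training loop may differ.

```python
# Loss weights copied from the train options above.
LAMBDAS = {
    "l2": 1.0,
    "lpips": 0.8,
    "id": 0.1,
    "reg": 0.0,
    "gan": 0.0,
    "edit": 0.0,
}


def total_loss(losses: dict, lambdas: dict) -> float:
    """Weighted sum of the loss terms; terms with zero weight are skipped."""
    total = 0.0
    for name, value in losses.items():
        weight = lambdas[name]
        if weight > 0.0:
            total += weight * value
    return total


# Hypothetical per-term values for a single batch.
example_losses = {"l2": 0.02, "lpips": 0.25, "id": 0.3, "reg": 5.0}
print(total_loss(example_losses, LAMBDAS))
```

In the base configuration only the L2, LPIPS, and ID terms contribute, because the regularization, GAN, and edit weights are all zero.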

Learning curve
[Figures: l2loss, lpipsloss, idloss, idimprove]

Trainset examples
  • Real image batch X: [images: real1, real2, real3]
  • Encoded image batch G.synthesis(E(X)): [images: encoded1, encoded2, encoded3]

Testset examples (celeba-hq)
  • Target image: [image: target]
  • Encoded image: [image: encoded]
  • Encoded image, transform x=0.2, y=0: [image: x02y00]
  • Encoded image, transform x=0.2, y=0.1: [image: x02y01]
  • Encoded image, transform x=-0.2, y=0.1: [image: x-02y01]
  • Encoded image, transform x=-0.2, y=-0.1: [image: x-02y-01]
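
The transformed examples shift the encoded image by the given x and y offsets. The snippet below is a minimal sketch of how such a translation could be expressed, assuming the mechanism of the official StyleGAN3 `gen_images.py` script, where the inverse of a 3x3 affine matrix is copied into the `G.synthesis.input.transform` buffer. Whether this repository does the same is an assumption. The helper names are hypothetical, and plain Python lists stand in for tensors.

```python
def make_translation(x: float, y: float) -> list:
    """Build a 3x3 affine matrix that translates by (x, y) with no rotation."""
    return [
        [1.0, 0.0, x],
        [0.0, 1.0, y],
        [0.0, 0.0, 1.0],
    ]


def invert_translation(m: list) -> list:
    """Invert a pure-translation matrix by negating its offsets."""
    return [
        [1.0, 0.0, -m[0][2]],
        [0.0, 1.0, -m[1][2]],
        [0.0, 0.0, 1.0],
    ]


# Matrix for the "transform x=0.2, y=0.1" example.
m = invert_translation(make_translation(0.2, 0.1))
print(m)

# With PyTorch and a loaded generator G (not shown here), the matrix
# would then be applied along these lines:
#   G.synthesis.input.transform.copy_(torch.tensor(m))
```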

TODO

  • Refactoring configuration system
  • Implement resume checkpoint
  • Apply Transformer encoder instead of convs in GradualStyleBlock (Config-a) -> the CNN GradualStyleBlock is better than the transformer // discarded
  • Training delta w from avg latent w_avg (G.mapping.w_avg) -> training Config-b now, without regularization loss, same as the pSp paper
  • Implement scripts for test dataset
  • Add L2 delta-regularization loss and GAN loss (latent discriminator), as in e4e
  • GPU memory optimization in training loop
  • Colab demo
  • Apply hyperstyle
  • Train encoder for stylegan3-r generator
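
The delta-w item in the list above means the encoder predicts an offset from the average latent instead of the latent itself. The snippet below is a minimal sketch of that idea, assuming w = w_avg + delta_w with w_avg shared across all style layers. Plain Python lists stand in for tensors, and `add_w_avg` is a hypothetical helper name.

```python
def add_w_avg(delta_w: list, w_avg: list) -> list:
    """Add the average latent to every per-layer offset.

    delta_w: list of per-layer offsets, each of length w_dim.
    w_avg:   average latent of length w_dim (G.mapping.w_avg in StyleGAN3).
    """
    return [[d + a for d, a in zip(layer, w_avg)] for layer in delta_w]


# Toy example: 2 style layers, 3-dimensional latent.
w_avg = [0.5, -1.0, 2.0]
delta_w = [
    [0.0, 0.0, 0.0],   # zero offset reproduces the average latent
    [0.1, 0.2, -0.5],
]
w_plus = add_w_avg(delta_w, w_avg)
print(w_plus[0])  # [0.5, -1.0, 2.0]
```

A zero offset maps to the average latent, so an encoder trained this way starts from the average image.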

References

  1. stylegan3
  2. pixel2style2pixel
  3. e4e