stylegan3-encoder

Introduction

An encoder implementation for the image inversion task of the StyleGAN3 generator (Alias-Free GAN).
The network architecture and hyper-parameter settings of the base configuration are almost the same as those of pixel2style2pixel; additional, improved encoder architectures will be added in the future.
For fast training, PyTorch DistributedDataParallel is used.
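Training runs one process per GPU via DistributedDataParallel. As a rough sketch of that setup (not the repository's actual code; setup_ddp_encoder, encoder, rank, and world_size are placeholder names):

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp_encoder(encoder, rank, world_size):
    # One training process per GPU; each wraps its own replica of the encoder.
    os.environ.setdefault('MASTER_ADDR', 'localhost')
    os.environ.setdefault('MASTER_PORT', '29500')
    dist.init_process_group('nccl', rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    return DDP(encoder.to(rank), device_ids=[rank])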

Installation

GPU and NVIDIA driver info

  • GeForce RTX 3090 x 8
  • NVIDIA driver version: 460.91.03

Docker build

$ sh build_img.sh
$ sh build_container.sh [container-name]

Install package

$ docker start [container-name]
$ docker attach [container-name]
$ pip install -v -e .

Pretrained weights

[Image: directory tree of the pretrained weights]

Prepare dataset

[Image: directory tree of the prepared dataset]

Train

python train.py \
    --outdir exp/[exp_name] \
    --encoder [encoder_type] \
    --data data/[dataset_name] \
    --gpus [num_gpus] \
    --batch [total_batch_size] \
    --generator [generator_pkl]
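
For example, the base configuration reported below corresponds to an invocation like:

python train.py \
    --outdir exp/base \
    --encoder base \
    --data data/ffhq \
    --gpus 8 \
    --batch 32 \
    --generator pretrained/stylegan3-t-ffhq-1024x1024.pkl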

Test

python test.py \
    --testdir exp/[train_exp]/[train_exp_subdir] \
    --data data/[dataset_name] \
    --gpus [num_gpus] \
    --batch [total_batch_size]
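
For example, testing the base run below on a CelebA-HQ folder (the dataset directory name here is an assumption) would look like:

python test.py \
    --testdir exp/base/00000-base-ffhq-gpus8-batch32 \
    --data data/celeba_hq \
    --gpus 8 \
    --batch 32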

Experiments

Base configuration

Train options

{
  "model_architecture": "base",
  "dataset_dir": "data/ffhq",
  "num_gpus": 8,
  "batch_size": 32,
  "batch_gpu": 4,
  "generator_pkl": "pretrained/stylegan3-t-ffhq-1024x1024.pkl",
  "val_dataset_dir": null,
  "training_steps": 100001,
  "val_steps": 10000,
  "print_steps": 50,
  "tensorboard_steps": 50,
  "image_snapshot_steps": 100,
  "network_snapshot_steps": 5000,
  "learning_rate": 0.001,
  "l2_lambda": 1.0,
  "lpips_lambda": 0.8,
  "id_lambda": 0.1,
  "reg_lambda": 0.0,
  "gan_lambda": 0.0,
  "edit_lambda": 0.0,
  "random_seed": 0,
  "num_workers": 3,
  "resume_pkl": null,
  "run_dir": "exp/base/00000-base-ffhq-gpus8-batch32"
}
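
The *_lambda options weight the individual loss terms. A minimal sketch of how a pSp-style weighted objective is combined (the function and argument names are illustrative, not the repository's actual API):

import torch.nn.functional as F

def encoder_loss(x, x_hat, lpips_fn, id_loss_fn,
                 l2_lambda=1.0, lpips_lambda=0.8, id_lambda=0.1):
    # Weighted sum of reconstruction losses between real images x and
    # reconstructions x_hat = G.synthesis(E(x)).
    loss = 0.0
    if l2_lambda > 0:
        loss = loss + l2_lambda * F.mse_loss(x_hat, x)           # pixel-wise L2
    if lpips_lambda > 0:
        loss = loss + lpips_lambda * lpips_fn(x_hat, x).mean()   # perceptual (LPIPS)
    if id_lambda > 0:
        loss = loss + id_lambda * id_loss_fn(x_hat, x)           # facial identity
    return loss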

Learning curves: L2 loss, LPIPS loss, ID loss, and ID improvement (plots omitted).

Trainset examples
Real image batch X and the corresponding encoded batch G.synthesis(E(X)) (image grids omitted).

Testset examples (CelebA-HQ)
Target image, encoded image, and encoded images re-synthesized with input transforms x=0.2/y=0, x=0.2/y=0.1, x=-0.2/y=0.1, and x=-0.2/y=-0.1 (images omitted).
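
The translated results above exploit the transform buffer of the StyleGAN3 synthesis input. A minimal sketch of applying such a translation before decoding a latent, following the approach in the official stylegan3 generation scripts (apply_translation is an illustrative helper, not part of this repository):

import numpy as np
import torch

def apply_translation(G, tx, ty):
    # Shift the synthesis input grid by (tx, ty) before decoding a latent.
    m = np.eye(3)
    m[0, 2], m[1, 2] = tx, ty                      # translation only, no rotation
    if hasattr(G.synthesis, 'input'):
        G.synthesis.input.transform.copy_(torch.from_numpy(np.linalg.inv(m)))

# e.g. apply_translation(G, 0.2, 0.0) before G.synthesis(E(x)) corresponds to
# the "transform x=0.2, y=0" example above.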

TODO

  • Refactor the configuration system
  • Implement checkpoint resuming
  • Apply a Transformer encoder instead of convs in GradualStyleBlock (Config-a) -> the CNN GradualStyleBlock outperformed the Transformer // discarded
  • Train delta w from the average latent w_avg (G.mapping.w_avg) -> training Config-b now, without the regularization loss, same as the pSp paper
  • Implement scripts for the test dataset
  • Add the L2 delta-regularization loss and GAN loss (latent discriminator) from e4e
  • Optimize GPU memory usage in the training loop
  • Colab demo
  • Apply HyperStyle
  • Train an encoder for the stylegan3-r generator

References

  1. stylegan3
  2. pixel2style2pixel
  3. e4e
