HiFiSketch

We provide a PyTorch implementation for our TIP 2023 paper HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation.

This project generates face sketches from photos and edits the sketches through text. Paper@IEEE Code@Github

Prerequisites

  • Linux
  • Python 3.7
  • NVIDIA GPU + CUDA + CuDNN

Getting Started

Installation

  • Clone this repo:

    git clone https://github.com/shenhaiyoualn/HiFiSketch
    cd HiFiSketch
    

The environment file is defined in environment/environment.yaml. Install all the dependencies by:

conda env create -f environment.yml
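
To quickly verify that the created environment provides a CUDA-enabled PyTorch build, you can run a minimal sanity check like the following (not part of the repo):

    # Minimal sanity check: confirms PyTorch is installed and can see the GPU.
    import torch

    print("PyTorch version:", torch.__version__)
    print("CUDA available: ", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))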

Prepare

  • Download your dataset and put it in the dataset directory, then update configs/path_config.py like:

    dataset_paths = {
      'CUHK_train_P': '/media/gpu/T7/HifiSketch/datasets/CUHK_train_Photo',
      'CUHK_train_S': '/media/gpu/T7/HifiSketch/datasets/CUHK_train_Sketch',
      'CUHK_test_P': '/media/gpu/T7/HifiSketch/datasets/CUHK_test_Photo',
      'CUHK_test_S': '/media/gpu/T7/HifiSketch/datasets/CUHK_test_Sketch',}
    
  • Update configs/data_conf.py like:

    DATASETS = {
      'CUHK': {
      	'transforms': trans_conf.EncodeTransforms,
      	'train_source_root': dataset_paths['CUHK_train_P'],
      	'train_target_root': dataset_paths['CUHK_train_S'],
      	'test_source_root': dataset_paths['CUHK_test_P'],
      	'test_target_root': dataset_paths['CUHK_test_S'],
      },}
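
After filling in both config files, a small sanity check like the following (not part of the repo; it assumes configs/path_config.py exposes the dataset_paths dict exactly as shown above) can confirm that every dataset directory exists and is non-empty before training:

    # Hypothetical helper: verify that each path in dataset_paths points to a
    # non-empty directory before starting training.
    import os
    from configs.path_config import dataset_paths  # assumes the dict shown above

    for name, path in dataset_paths.items():
        ok = os.path.isdir(path) and len(os.listdir(path)) > 0
        print(f"{name}: {path} -> {'OK' if ok else 'MISSING OR EMPTY'}")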
    

Our model relies on several pre-trained models; you can find them below:

  • FFHQ StyleGAN: pretrained StyleGAN2 model with 1024x1024 output resolution.
  • Faces W-Encoder: pretrained e4e encoder.
  • IR-SE50 Model: pretrained IR-SE50 model taken from TreB1eN.
  • ResNet-34 Model: ResNet-34 model trained on ImageNet, taken from torchvision.
  • MTCNN: weights for the MTCNN model taken from TreB1eN, used for ID similarity.

Please note that the generator we use is derived from rosinality's code.
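
Downloaded weights are typically placed under pretrained_models/. Only faces_encoder.pt is referenced explicitly in the commands below, so the other filename in this quick check is a placeholder; rename it to match the files you actually downloaded:

    # Quick, hypothetical check that the pretrained weights are where the commands
    # below expect them. Only faces_encoder.pt appears in this README; the StyleGAN2
    # filename is a placeholder and may differ in your download.
    import os

    expected = [
        "pretrained_models/faces_encoder.pt",            # Faces W-Encoder (e4e)
        "pretrained_models/stylegan2-ffhq-config-f.pt",  # placeholder FFHQ StyleGAN2 name
    ]
    for path in expected:
        print(path, "found" if os.path.isfile(path) else "NOT FOUND")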

Training and Inference

  • Train a model
CUDA_VISIBLE_DEVICES="0" python scripts/train.py \
--dataset_type=CUHK \
--encoder_type=hifinet \
--exp=experiments/CUHK \
--workers=1 \
--batch_size=4 \
--test_batch_size=2 \
--test_workers=1 \
--val_interval=5000 \
--save_interval=5000 \
--n_iters_per_batch=1 \
--max_val_batches=150 \
--output_size=1024 \
--load_w_encoder

You can modify the training parameters in the options/train_options.py file.

  • Run inference with the model
CUDA_VISIBLE_DEVICES="0" python scripts/inference.py \
--exp=experiments/CUHK \
--checkpoint_path=/model/path \
--data_path=/your/test/data/path \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=1 \
--load_w_encoder \
--w_encoder_checkpoint_path pretrained_models/faces_encoder.pt 

You can use --save_weight_deltas to save the final weight deltas and adjust the --n_iters_per_batch parameter to obtain more realistic results. You can find all the test parameters in the options/test_options.py file.

Editing

  • You can edit the generated image through text by:
python editing/edit/edit.py \
--exp /your/experiment/dir \
--weight_deltas_path /your/weight_deltas \
--neutral_text "a face" \
--target_text "a face with glasses"

BibTeX:

@article{peng2023hifisketch,
  title={HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation},
  author={Peng, Chunlei and Zhang, Congyu and Liu, Decheng and Wang, Nannan and Gao, Xinbo},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  publisher={IEEE}
}

Acknowledgments

Our code is inspired by HyperStyle, e4e, and StyleGAN2.
