HiFiSketch

We provide a PyTorch implementation of our TIP 2023 paper, HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation.

This project generates face sketches from photos and supports editing the sketches through text prompts. Paper@IEEE Code@GitHub

Prerequisites

  • Linux
  • Python 3.7
  • NVIDIA GPU + CUDA + CuDNN

Getting Started

Installation

  • Clone this repo:

    git clone https://github.com/shenhaiyoualn/HiFiSketch
    cd HiFiSketch
    

  • The environment file is defined in environment/environment.yaml. Install all the dependencies with:

    conda env create -f environment.yml

Prepare

  • Download your dataset and put it in the dataset directory, then update configs/path_config.py to point at it, for example:

    dataset_paths = {
      'CUHK_train_P': '/media/gpu/T7/HifiSketch/datasets/CUHK_train_Photo',
      'CUHK_train_S': '/media/gpu/T7/HifiSketch/datasets/CUHK_train_Sketch',
      'CUHK_test_P': '/media/gpu/T7/HifiSketch/datasets/CUHK_test_Photo',
      'CUHK_test_S': '/media/gpu/T7/HifiSketch/datasets/CUHK_test_Sketch',
    }
    
  • Then register the dataset in configs/data_conf.py, for example:

    DATASETS = {
      'CUHK': {
        'transforms': trans_conf.EncodeTransforms,
        'train_source_root': dataset_paths['CUHK_train_P'],
        'train_target_root': dataset_paths['CUHK_train_S'],
        'test_source_root': dataset_paths['CUHK_test_P'],
        'test_target_root': dataset_paths['CUHK_test_S'],
      },
    }
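Before launching training, it can help to confirm that every configured directory exists and that each photo folder lines up with its sketch folder. The helper below is a minimal sketch and not part of the repository: `check_dataset_paths` is a hypothetical name, and the pairing rule (keys ending in `_P` match keys ending in `_S`, with equal file counts) is an assumption inferred from the key names above.

```python
import os


def check_dataset_paths(dataset_paths):
    """Return a list of human-readable problems found in dataset_paths.

    Hypothetical helper (not part of HiFiSketch). Assumes keys ending in
    '_P' (photos) pair with keys ending in '_S' (sketches).
    """
    problems = []

    # Every configured directory must exist.
    for key, path in dataset_paths.items():
        if not os.path.isdir(path):
            problems.append(f"{key}: missing directory {path}")

    # Paired photo/sketch folders should hold the same number of files.
    for key, photo_dir in dataset_paths.items():
        if not key.endswith("_P"):
            continue
        sketch_key = key[:-2] + "_S"
        sketch_dir = dataset_paths.get(sketch_key)
        if sketch_dir is None:
            problems.append(f"{key}: no matching {sketch_key} entry")
            continue
        if os.path.isdir(photo_dir) and os.path.isdir(sketch_dir):
            n_photos = len(os.listdir(photo_dir))
            n_sketches = len(os.listdir(sketch_dir))
            if n_photos != n_sketches:
                problems.append(
                    f"{key}/{sketch_key}: {n_photos} photos vs {n_sketches} sketches"
                )
    return problems
```

Calling `check_dataset_paths(dataset_paths)` with the dictionary from configs/path_config.py returns an empty list when nothing looks wrong, so the result can be printed or asserted on before a long training run.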
    

Our model relies on several pretrained models, listed below:

| Path | Description |
| --- | --- |
| FFHQ StyleGAN | Pretrained StyleGAN2 model with 1024x1024 resolution. |
| Faces W-Encoder | Pretrained e4e encoder. |
| IR-SE50 Model | Pretrained IR-SE50 model taken from TreB1eN. |
| ResNet-34 Model | ResNet-34 model trained on ImageNet, taken from torchvision. |
| MTCNN | Weights for the MTCNN model taken from TreB1eN, used for ID similarity. |

Please note that the generator we use is derived from rosinality's code.

Training and Inference

  • Train a model
CUDA_VISIBLE_DEVICES="0" python scripts/train.py \
--dataset_type=CUHK \
--encoder_type=hifinet \
--exp=experiments/CUHK \
--workers=1 \
--batch_size=4 \
--test_batch_size=2 \
--test_workers=1 \
--val_interval=5000 \
--save_interval=5000 \
--n_iters_per_batch=1 \
--max_val_batches=150 \
--output_size=1024 \
--load_w_encoder

You can modify the training parameters in the options/train_options.py file.
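Options passed on the command line can also be varied per run without editing that file. The snippet below is a minimal sketch of generating such commands from Python: `build_command` is a hypothetical helper, the option names are copied from the training command above, and the `experiments/CUHK_bs…` directory names are made up for illustration.

```python
import shlex


def build_command(script, options, flags=()):
    """Build an argv list for `python <script>`.

    Hypothetical helper (not part of HiFiSketch). `options` maps option
    names to values and is rendered as --name=value; `flags` are
    value-less switches such as load_w_encoder.
    """
    argv = ["python", script]
    argv += [f"--{name}={value}" for name, value in options.items()]
    argv += [f"--{name}" for name in flags]
    return argv


# Options copied from the training command above.
base_options = {
    "dataset_type": "CUHK",
    "encoder_type": "hifinet",
    "exp": "experiments/CUHK",
    "batch_size": 4,
    "output_size": 1024,
}

# Print one command per batch size; the experiment directory names are
# illustrative so that runs do not overwrite each other.
for batch_size in (2, 4):
    options = dict(
        base_options,
        batch_size=batch_size,
        exp=f"experiments/CUHK_bs{batch_size}",
    )
    argv = build_command("scripts/train.py", options, flags=["load_w_encoder"])
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The printed lines can be pasted into a shell, or each `argv` list can be handed to `subprocess.run` directly.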

  • Run inference with a trained model
CUDA_VISIBLE_DEVICES="0" python scripts/inference.py \
--exp=experiments/CUHK \
--checkpoint_path=/model/path \
--data_path=/your/test/data/path \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=1 \
--load_w_encoder \
--w_encoder_checkpoint_path pretrained_models/faces_encoder.pt 

Use --save_weight_deltas to save the final weight deltas, and adjust --n_iters_per_batch to get more realistic results. All test parameters are listed in the options/test_options.py file.

Editing

  • You can edit the generated image through text by:
python editing/edit/edit.py \
--exp /your/experiment/dir \
--weight_deltas_path /your/weight_deltas \
--neutral_text "a face" \
--target_text "a face with glasses"
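The neutral/target prompt pair can be read as defining a direction in a text-embedding space: the edit moves away from the neutral description and toward the target one. The sketch below illustrates that idea only. It assumes a StyleCLIP-style text direction, which this README does not confirm as the method used by edit.py, and the three-dimensional vectors are made-up stand-ins for real text-encoder outputs. `normalize` and `text_direction` are hypothetical names.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def text_direction(neutral_emb, target_emb):
    """Unit vector pointing from the neutral embedding toward the target.

    Illustrative only: an assumed StyleCLIP-style direction, not the
    confirmed behaviour of editing/edit/edit.py.
    """
    neutral = normalize(neutral_emb)
    target = normalize(target_emb)
    return normalize([t - n for t, n in zip(target, neutral)])


# Toy 3-d vectors standing in for text-encoder outputs (made up).
neutral = [1.0, 0.0, 0.0]  # "a face"
target = [1.0, 1.0, 0.0]   # "a face with glasses"
print(text_direction(neutral, target))
```

Under this reading, a neutral prompt that already describes the input well matters as much as the target prompt, since the direction is the difference between the two.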

bibtex:

@article{peng2023hifisketch,
  title={HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation},
  author={Peng, Chunlei and Zhang, Congyu and Liu, Decheng and Wang, Nannan and Gao, Xinbo},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  publisher={IEEE}
}

Acknowledgments

Our code is inspired by HyperStyle, e4e, and StyleGAN2.