HiFiSketch

We provide a PyTorch implementation of our TIP 2023 paper, HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation.

This project generates face sketches from photos and edits the generated sketches through text prompts. Paper@IEEE Code@GitHub

Prerequisites

Linux
Python 3.7
NVIDIA GPU + CUDA + CuDNN

Getting Started

Installation

Clone this repo:

git clone https://github.com/shenhaiyoualn/HiFiSketch
cd HiFiSketch

The environment is defined in environment/environment.yaml. Install all the dependencies with:

conda env create -f environment/environment.yaml
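
After creating and activating the environment, a quick check along the following lines can confirm that the interpreter and GPU setup match the prerequisites. This is only a sketch: the function name `check_environment` is hypothetical, and treating any Python version at or above 3.7 as acceptable is an assumption (the prerequisites list Python 3.7 specifically).

```python
import importlib.util
import sys


def check_environment(min_python=(3, 7)):
    """Report whether the interpreter and the PyTorch/CUDA setup look usable."""
    report = {
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        # Assumption: any version >= min_python is acceptable.
        "python_ok": sys.version_info[:2] >= min_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without PyTorch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False, training and inference will most likely not run as described here, since the prerequisites call for an NVIDIA GPU with CUDA and CuDNN.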

Prepare

Download your dataset, place it in the dataset directory, and update configs/path_config.py with its paths, for example:

dataset_paths = {
    'CUHK_train_P': '/media/gpu/T7/HifiSketch/datasets/CUHK_train_Photo',
    'CUHK_train_S': '/media/gpu/T7/HifiSketch/datasets/CUHK_train_Sketch',
    'CUHK_test_P': '/media/gpu/T7/HifiSketch/datasets/CUHK_test_Photo',
    'CUHK_test_S': '/media/gpu/T7/HifiSketch/datasets/CUHK_test_Sketch',
}

Then register the dataset in configs/data_conf.py, for example:

DATASETS = {
    'CUHK': {
        'transforms': trans_conf.EncodeTransforms,
        'train_source_root': dataset_paths['CUHK_train_P'],
        'train_target_root': dataset_paths['CUHK_train_S'],
        'test_source_root': dataset_paths['CUHK_test_P'],
        'test_target_root': dataset_paths['CUHK_test_S'],
    },
}
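
Before starting a long training run, it can help to verify that every configured directory exists and that the photo and sketch folders pair up. The sketch below is not part of the repository. The helper names, the list of image suffixes, the flat directory layout, and the expectation that a `_P` folder holds as many images as its `_S` counterpart are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: images are stored flat in each folder with one of these suffixes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(root):
    """Count image files directly inside `root`; return None if it is not a directory."""
    root = Path(root)
    if not root.is_dir():
        return None
    return sum(1 for p in root.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def check_dataset_paths(dataset_paths):
    """Return a list of human-readable problems found in a key -> path mapping."""
    problems = []
    counts = {key: count_images(path) for key, path in dataset_paths.items()}
    for key, n in counts.items():
        if n is None:
            problems.append(f"{key}: directory not found")
        elif n == 0:
            problems.append(f"{key}: no image files found")
    # Assumption: photo ('_P') and sketch ('_S') folders of a split are paired 1:1.
    for key, n in counts.items():
        if key.endswith("_P"):
            partner = key[:-2] + "_S"
            m = counts.get(partner)
            if n is not None and m is not None and n != m:
                problems.append(f"{key} has {n} images but {partner} has {m}")
    return problems
```

Run from the repository root, something like `check_dataset_paths(dataset_paths)` with `dataset_paths` imported from configs/path_config.py should return an empty list when everything is in place. The import path is inferred from the file name above and may need adjusting.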

Our model relies on several pretrained models, listed below:

Model	Description
FFHQ StyleGAN	Pretrained StyleGAN2 model with 1024x1024 resolution.
Faces W-Encoder	Pretrained e4e encoder.
IR-SE50 Model	Pretrained IR-SE50 model taken from TreB1eN.
ResNet-34 Model	ResNet-34 model trained on ImageNet taken from torchvision.
MTCNN	Weights for MTCNN model taken from TreB1eN for use in ID similarity.

Please note that the generator we use is derived from rosinality's code.

Training and Inference

Train a model

CUDA_VISIBLE_DEVICES="0" python scripts/train.py \
--dataset_type=CUHK \
--encoder_type=hifinet \
--exp=experiments/CUHK \
--workers=1 \
--batch_size=4 \
--test_batch_size=2 \
--test_workers=1 \
--val_interval=5000 \
--save_interval=5000 \
--n_iters_per_batch=1 \
--max_val_batches=150 \
--output_size=1024 \
--load_w_encoder

You can modify the training parameters in the options/train_options.py file.
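
For repeated experiments, the training command can also be assembled from a dictionary of options instead of being edited by hand. The following is a minimal sketch, not part of the repository: `build_command` and `run` are hypothetical helpers, and the option names and values are simply copied from the training command above.

```python
import os
import shlex
import subprocess
import sys


def build_command(script, options):
    """Turn an options dict into an argv list: [python, script, '--key=value', ...]."""
    argv = [sys.executable, script]
    for name, value in options.items():
        if value is True:
            argv.append(f"--{name}")  # boolean switch, e.g. --load_w_encoder
        elif value is False or value is None:
            continue  # disabled switches are omitted
        else:
            argv.append(f"--{name}={value}")
    return argv


def run(argv, gpu="0", dry_run=True):
    """Return the shell-quoted command; execute it on the given GPU unless dry_run."""
    printable = " ".join(shlex.quote(a) for a in argv)
    if not dry_run:
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
        subprocess.run(argv, check=True, env=env)
    return printable


# Values copied from the training command in this README.
train_options = {
    "dataset_type": "CUHK",
    "encoder_type": "hifinet",
    "exp": "experiments/CUHK",
    "workers": 1,
    "batch_size": 4,
    "test_batch_size": 2,
    "test_workers": 1,
    "val_interval": 5000,
    "save_interval": 5000,
    "n_iters_per_batch": 1,
    "max_val_batches": 150,
    "output_size": 1024,
    "load_w_encoder": True,
}

if __name__ == "__main__":
    print(run(build_command("scripts/train.py", train_options)))
```

With `dry_run=True` (the default) the command is only printed, which makes it easy to inspect before launching. Passing `dry_run=False` from the repository root starts training with `CUDA_VISIBLE_DEVICES` set to the chosen GPU.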

Run inference

CUDA_VISIBLE_DEVICES="0" python scripts/inference.py \
--exp=experiments/CUHK \
--checkpoint_path=/model/path \
--data_path=/your/test/data/path \
--test_batch_size=4 \
--test_workers=4 \
--n_iters_per_batch=1 \
--load_w_encoder \
--w_encoder_checkpoint_path pretrained_models/faces_encoder.pt

Use --save_weight_deltas to save the final weight deltas, and adjust --n_iters_per_batch to obtain more realistic results. All test parameters are listed in options/test_options.py.

Editing

You can edit the generated image through text with:

python editing/edit/edit.py \
--exp /your/experiment/dir \
--weight_deltas_path /your/weight_deltas \
--neutral_text "a face" \
--target_text "a face with glasses"

BibTeX:

@article{peng2023hifisketch,
  title={HiFiSketch: High Fidelity Face Photo-Sketch Synthesis and Manipulation},
  author={Peng, Chunlei and Zhang, Congyu and Liu, Decheng and Wang, Nannan and Gao, Xinbo},
  journal={IEEE Transactions on Image Processing},
  year={2023},
  publisher={IEEE}
}

Acknowledgments

Our code is inspired by HyperStyle, e4e, and StyleGAN2.
