Image Super-Resolution via Iterative Refinement

Brief

This is an unofficial PyTorch implementation of Image Super-Resolution via Iterative Refinement (SR3).

Some implementation details follow our reading of the paper and may differ from the actual SR3 structure, because the paper leaves some details out.

We use ResNet blocks and channel concatenation, as in vanilla DDPM.
We apply the attention mechanism to the low-resolution features (16×16), as in vanilla DDPM.
We encode $\gamma$ with the FiLM structure used in WaveGrad, and embed it without an affine transformation.
We define the posterior variance as $ \dfrac{1-\gamma_{t-1}}{1-\gamma_{t}} \beta_t $ rather than $\beta_t$; the two choices give similar results, as in the vanilla DDPM paper.
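The posterior variance above can be computed directly from a noise schedule. The sketch below is illustrative only and is not taken from this repository: it assumes $\gamma_t = \prod_{s \le t} (1-\beta_s)$ with $\gamma_0 = 1$, and the linear schedule with its start and end values is a hypothetical choice.

```python
def linear_betas(n_steps, beta_start=1e-4, beta_end=2e-2):
    """Linearly spaced noise schedule (start/end values are assumed, not from the repo)."""
    if n_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (n_steps - 1)
    return [beta_start + i * step for i in range(n_steps)]


def posterior_variance(betas):
    """Return (1 - gamma_{t-1}) / (1 - gamma_t) * beta_t for every step t."""
    variances = []
    gamma_prev = 1.0  # gamma_0 = 1: no noise has been added yet
    for beta in betas:
        gamma = gamma_prev * (1.0 - beta)  # gamma_t = gamma_{t-1} * (1 - beta_t)
        variances.append((1.0 - gamma_prev) / (1.0 - gamma) * beta)
        gamma_prev = gamma
    return variances


betas = linear_betas(2000)
variances = posterior_variance(betas)
```

Under these assumptions the variance is zero at the first step and never exceeds $\beta_t$, because the ratio $(1-\gamma_{t-1})/(1-\gamma_t)$ is at most 1.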

If you just want to upscale 64×64 px -> 512×512 px images using the pre-trained model, check out this Google Colab script.

Status

★★★ NEW: Its follow-up Palette-Image-to-Image-Diffusion-Models is now available; See the details here ★★★

Conditional generation(super resolution)

16×16 -> 128×128 on FFHQ-CelebaHQ
64×64 -> 512×512 on FFHQ-CelebaHQ

Unconditional generation

128×128 face generation on FFHQ
~~1024×1024 face generation by a cascade of 3 models~~

Training Step

Results

Note: The maximum reverse-step budget is currently set to 2000. Because the number of model parameters is limited by what fits on an Nvidia 1080Ti, image noise and hue deviation occasionally appear in high-resolution images, which lowers the scores. There is a lot of room for optimization. Contributions with more extensive experiments and code enhancements are welcome.

Tasks/Metrics	SSIM(+)	PSNR(+)	FID(-)	IS(+)
16×16 -> 128×128	0.675	23.26	-	-
64×64 -> 512×512	0.445	19.87	-	-
128×128	-	-
1024×1024	-	-

16×16 -> 128×128 on FFHQ-CelebaHQ [More Results]

64×64 -> 512×512 on FFHQ-CelebaHQ [More Results]

128×128 face generation on FFHQ [More Results]

Usage

Environment

pip install -r requirement.txt

Pretrained Model

This paper is based on "Denoising Diffusion Probabilistic Models", and we build both the DDPM and SR3 network structures, which use timesteps and gamma, respectively, as the model embedding input. In our experiments, the SR3 model achieves better visual results with the same number of reverse steps and the same learning rate. Select the JSON file whose name carries the matching suffix to train each model.
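Both kinds of embedding input are scalars (an integer timestep or a continuous gamma), so they can be turned into a vector in the same way. The sketch below uses a sinusoidal encoding as one common option; it is an assumption for illustration, not a description of this repository's layers, and the function name, `dim`, and `max_period` are hypothetical.

```python
import math


def sinusoidal_embedding(level, dim=8, max_period=10000.0):
    """Encode a scalar (timestep t or noise level gamma) as a vector of length dim."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometric ladder of frequencies, starting at 1 and decreasing toward 1 / max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    # First half holds the sines, second half the cosines.
    return [math.sin(level * f) for f in freqs] + [math.cos(level * f) for f in freqs]


timestep_embedding = sinusoidal_embedding(500)  # DDPM-style integer timestep
gamma_embedding = sinusoidal_embedding(0.37)    # SR3-style continuous noise level
```

The same function accepts either input, which is why the two model variants can share most of the network and differ only in what is fed to the embedding.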

Tasks	Platform (Code: qwer)
16×16 -> 128×128 on FFHQ-CelebaHQ	Google Drive | Baidu Yun
64×64 -> 512×512 on FFHQ-CelebaHQ	Google Drive | Baidu Yun
128×128 face generation on FFHQ	Google Drive | Baidu Yun

# Download the pretrained model and set "resume_state" in [sr|sample]_[ddpm|sr3]_[resolution option].json:
"resume_state": [your pretrain model path]

Data Prepare

New Start

If you don't have the data yet, prepare it with the following steps:

Download the dataset and convert it to LMDB or PNG format using the script.

# Resize to get 16×16 LR_IMGS and 128×128 HR_IMGS, then prepare 128×128 Fake SR_IMGS by bicubic interpolation
python data/prepare_data.py  --path [dataset root]  --out [output root] --size 16,128 -l

Then update the datasets config with your data path and image resolution:

"datasets": {
    "train": {
        "dataroot": "dataset/ffhq_16_128", // [output root] from the prepare_data.py script
        "l_resolution": 16, // low resolution to be super-resolved
        "r_resolution": 128, // high resolution
        "datatype": "lmdb", // lmdb or img (path of image files)
    },
    "val": {
        "dataroot": "dataset/celebahq_16_128", // [output root] from the prepare_data.py script
    }
},
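The config files carry `//` comments, which the standard `json` module does not accept. A minimal sketch of reading such a file is shown below; it is not the repository's loader, and `load_commented_json` is a hypothetical helper. The example text omits trailing commas, because `json.loads` rejects them.

```python
import json


def load_commented_json(text):
    """Parse JSON text after dropping everything that follows '//' on each line."""
    # Naive stripping: a '//' inside a string value (for example a URL) would also be cut.
    cleaned = "\n".join(line.split("//", 1)[0] for line in text.splitlines())
    return json.loads(cleaned)


EXAMPLE = """
{
    "datasets": {
        "train": {
            "dataroot": "dataset/ffhq_16_128", // output root of the prepare script
            "l_resolution": 16, // low resolution
            "r_resolution": 128, // high resolution
            "datatype": "lmdb" // lmdb or img
        }
    }
}
"""

config = load_commented_json(EXAMPLE)
train = config["datasets"]["train"]
scale = train["r_resolution"] // train["l_resolution"]  # upscaling factor
```

Reading the config this way makes it easy to sanity-check that `l_resolution` and `r_resolution` match the sizes passed to the preparation script.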

Own Data

You can also use your own image data by following the steps below; there are some examples in the dataset folder.

First, organize the images in the layout below. data/prepare_data.py can do this step automatically:

# paths for the high-resolution, low-resolution, and bicubic-interpolated images
dataset/celebahq_16_128/
├── hr_128 # same as the sr_16_128 directory if you don't have ground-truth images
├── lr_16 # vanilla low-resolution images
└── sr_16_128 # images ready for super-resolution

# super resolution from 16 to 128
python data/prepare_data.py  --path [dataset root]  --out celebahq --size 16,128 -l

Note: The script above can be used whether or not you have the vanilla high-resolution images.
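Before training or inference, it can help to confirm that a dataset folder follows the layout shown above. The sketch below derives the three sub-directory names from the two resolutions and reports which ones are missing; `expected_subdirs` and `missing_subdirs` are hypothetical helpers, not part of this repository.

```python
from pathlib import Path


def expected_subdirs(l_resolution, r_resolution):
    """Sub-directory names following the hr_/lr_/sr_ naming pattern of the layout."""
    return [
        f"hr_{r_resolution}",
        f"lr_{l_resolution}",
        f"sr_{l_resolution}_{r_resolution}",
    ]


def missing_subdirs(root, l_resolution, r_resolution):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [
        name
        for name in expected_subdirs(l_resolution, r_resolution)
        if not (root / name).is_dir()
    ]
```

For example, `missing_subdirs("dataset/celebahq_16_128", 16, 128)` returns an empty list when all three folders are present.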

Then update the dataset config with your data path and image resolution:

"datasets": {
    "train|val": { // train and validation part
        "dataroot": "dataset/celebahq_16_128",
        "l_resolution": 16, // low resolution to be super-resolved
        "r_resolution": 128, // high resolution
        "datatype": "img", // lmdb or img (path of image files)
    }
},

Training/Resume Training

# Use sr.py and sample.py to train the super resolution task and unconditional generation task, respectively.
# Edit json files to adjust network structure and hyperparameters
python sr.py -p train -c config/sr_sr3.json

Test/Evaluation

# Edit the JSON file to add the pretrained model path, then run the evaluation
python sr.py -p val -c config/sr_sr3.json

# Quantitative evaluation alone using SSIM/PSNR metrics on given result root
python eval.py -p [result root]
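For readers unfamiliar with the PSNR metric reported in the results table, the sketch below shows the standard definition on flat sequences of pixel values. It is an illustration of the formula only and is not the code in eval.py; the 8-bit peak value of 255 is an assumption.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


score = psnr([100, 100, 100, 100], [110, 110, 110, 110])
```

Higher PSNR means the super-resolved image is closer to the ground truth, which is why the table marks it with (+).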

Inference Alone

Set the image path as described in the Own Data steps, then run the script:

# run the script
python infer.py -c [config file]

Weights and Biases 🎉

The library now supports experiment tracking, model checkpointing, and model prediction visualization with Weights and Biases. You will need to install W&B and log in with your access token.

pip install wandb

# get your access token from wandb.ai/authorize
wandb login

W&B logging functionality is added to sr.py, sample.py and infer.py files. You can pass -enable_wandb to start logging.

-log_wandb_ckpt: Pass this argument along with -enable_wandb to save model checkpoints as W&B Artifacts. Both sr.py and sample.py support model checkpointing.
-log_eval: Pass this argument along with -enable_wandb to save the evaluation result as interactive W&B Tables. Note that only sr.py supports this feature. If you run sample.py in eval mode, the generated images are automatically logged as an image media panel.
-log_infer: While running infer.py pass this argument along with -enable_wandb to log the inference results as interactive W&B Tables.
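The three logging flags above only take effect together with -enable_wandb. The sketch below illustrates that dependency with a stand-alone `argparse` parser; it reuses the flag names from this README, but the parser itself, the `dest` names, and `flags_missing_wandb` are assumptions for illustration rather than the scripts' actual argument handling.

```python
import argparse


def build_parser():
    """Illustrative parser using the flag names listed in this README."""
    parser = argparse.ArgumentParser(description="Illustrative flag parser")
    parser.add_argument("-p", dest="phase", choices=["train", "val"], default="train")
    parser.add_argument("-c", dest="config", default="config/sr_sr3.json")
    parser.add_argument("-enable_wandb", action="store_true")
    parser.add_argument("-log_wandb_ckpt", action="store_true")
    parser.add_argument("-log_eval", action="store_true")
    parser.add_argument("-log_infer", action="store_true")
    return parser


def flags_missing_wandb(args):
    """Return the logging flags that were passed without -enable_wandb."""
    if args.enable_wandb:
        return []
    dependent = ["log_wandb_ckpt", "log_eval", "log_infer"]
    return ["-" + name for name in dependent if getattr(args, name)]


args = build_parser().parse_args(["-p", "val", "-c", "config/sr_sr3.json", "-log_eval"])
problems = flags_missing_wandb(args)
```

Here `problems` lists -log_eval, since it was passed without -enable_wandb; adding -enable_wandb to the argument list makes the list empty.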

You can find more on using these features here. 🚀

Acknowledge

Our work is based on the following theoretical works:

and we are benefiting a lot from the following projects:
