
Affordance Diffusion: Synthesizing Hand-Object Interactions

Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu

in CVPR 2023

TL;DR: Given a single RGB image of an object, hallucinate plausible ways for a human to interact with it.

[Project Page] [Video] [Arxiv] [Data Generation]

Installation

See install.md

Inference

HOI synthesis

python inference.py data.data_dir='docs/demo/*.*g' test_num=3

The inference script first synthesizes $test_num HOI images in batch and then extracts the 3D hand pose from each.

(Figure: Input → Synthesized HOI images → Extracted 3D hand pose)
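For reference, the control flow of inference.py is roughly the following. This is a minimal sketch only: synthesize_hoi and extract_hand_pose are hypothetical placeholders for the repo's generation and pose-estimation modules, not its actual API.

```python
from glob import glob
from PIL import Image

def run_inference(synthesize_hoi, extract_hand_pose,
                  data_dir="docs/demo/*.*g", test_num=3):
    """Sketch of inference.py's control flow.

    `synthesize_hoi` and `extract_hand_pose` are placeholders for the
    repo's actual generation and pose-estimation modules.
    """
    for path in sorted(glob(data_dir)):
        obj_image = Image.open(path).convert("RGB")
        # Synthesize `test_num` plausible hand-object interactions in batch.
        hoi_images = [synthesize_hoi(obj_image) for _ in range(test_num)]
        # Then fit a 3D hand pose to each synthesized image.
        poses = [extract_hand_pose(img) for img in hoi_images]
        yield path, hoi_images, poses
```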

Interpolation

The script takes the layout parameters of the $index-th example predicted by inference.py and smoothly interpolates the HOI synthesis toward the horizontally flipped parameters. To run the demo:

python -m scripts.interpolate dir=docs/demo_inter

This should give results similar to:

(Figure: Input → Interpolated layouts → Output)
Additional parameters:

```
python -m scripts.interpolate dir=\${output}/release/layout/cascade index=0000_00_s0
```

  • interpolation.len: length of an interpolation sequence
  • interpolation.num: number of interpolation sequences
  • interpolation.test_name: subfolder to save the output
  • interpolation.orient: whether to horizontally flip the approaching direction (see the sketch after this list)
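As a rough illustration of what interpolation.len and interpolation.orient control, the sketch below linearly blends a layout toward its horizontal mirror. The (x, y, size, angle) parameterization is an assumption for illustration; the repo's actual layout encoding may differ.

```python
import numpy as np

def interpolate_layouts(layout, num_steps=16, flip_orient=True):
    """Linearly interpolate a layout toward its horizontally flipped twin.

    `layout` is assumed to be (x, y, size, angle) in normalized image
    coordinates -- an illustrative parameterization, not the repo's exact one.
    """
    x, y, size, angle = layout
    # Horizontal flip: mirror x about the image center and, optionally,
    # mirror the approaching direction.
    target = np.array([1.0 - x, y, size,
                       np.pi - angle if flip_orient else angle])
    src = np.array([x, y, size, angle])
    for t in np.linspace(0.0, 1.0, num_steps):
        yield (1.0 - t) * src + t * target
```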

Heatmap Guidance

The following command runs guided generation with the keypoints in docs/demo_kpts:

python inference.py  mode=hijack data.data_dir='docs/demo_kpts/*.png' test_name=hijack

This should give results similar to:

(Figure: Input 1 → Output 1; Input 2 → Output 2)
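For intuition, keypoint guidance of this kind is commonly implemented by rasterizing each 2D keypoint into a Gaussian heatmap that the sampler is steered toward. The sketch below is a hedged example of building such a heatmap; the resolution and sigma values are illustrative assumptions, not the repo's settings.

```python
import numpy as np

def keypoints_to_heatmap(keypoints, size=64, sigma=2.0):
    """Render 2D keypoints as a single Gaussian heatmap.

    `keypoints` is an iterable of (x, y) pixel coordinates; `size` and
    `sigma` are illustrative defaults, not the repo's configuration.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    heatmap = np.zeros((size, size), dtype=np.float32)
    for kx, ky in keypoints:
        blob = np.exp(-((xs - kx) ** 2 + (ys - ky) ** 2) / (2 * sigma ** 2))
        heatmap = np.maximum(heatmap, blob)  # keep the strongest response
    return heatmap
```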

Training

Data Preprocessing

We provide the script to generate the HO3Pair dataset. Please see preprocess/.

Train your own models

  • LayoutNet:

python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=layout

  • ContentNet-GLIDE:

python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=content_glide

  • ContentNet-LDM: first download the off-the-shelf pretrained model from here and place it at ${environment.pretrain}/stable/inpaint.ckpt, the path specified by resume_ckpt in configs/model/content_ldm.yaml. Then run:

python -m models.base -m --config-name=train \
  expname=reproduce/\${model.module} \
  model=content_ldm
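Before launching the ContentNet-LDM run, it can help to confirm the checkpoint actually sits where resume_ckpt expects. Below is a small hedged check; PRETRAIN_DIR is a stand-in for your ${environment.pretrain} value, not a variable the repo defines.

```python
import os
import torch

# Verify the downloaded inpainting checkpoint is where the config points.
pretrain_dir = os.environ.get("PRETRAIN_DIR", "pretrain")
ckpt_path = os.path.join(pretrain_dir, "stable", "inpaint.ckpt")
assert os.path.isfile(ckpt_path), f"missing checkpoint: {ckpt_path}"

# Load on CPU just to confirm the file is a readable state dict.
state = torch.load(ckpt_path, map_location="cpu")
print(f"loaded {len(state.get('state_dict', state))} entries from {ckpt_path}")
```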

Split and test images

Per-category HOI4D instance splits (not used in the paper) and test images on HOI4D and EPIC-KITCHENS (VISOR) can be downloaded here.

License

This project is licensed under CC-BY-NC-SA-4.0. Redistribution and use should follow this license.

Acknowledgement

Affordance Diffusion builds on many amazing open-source projects shared by the research community:

Citation

If you find this work helpful, please consider citing:

@inproceedings{ye2023affordance,
  title={Affordance Diffusion: Synthesizing Hand-Object Interactions},
  author={Yufei Ye and Xueting Li and Abhinav Gupta and Shalini De Mello
          and Stan Birchfield and Jiaming Song and Shubham Tulsiani and Sifei Liu},
  booktitle={CVPR},
  year={2023},
}