Project Page | ProciGen Dataset | Model trained on ProciGen | Paper
ProciGen is a synthetic human-object interaction dataset with 1M+ images of humans interacting with 21k+ different objects. We achieve this by procedurally combining human, object, and interaction datasets. This repo provides code to access and generate the ProciGen dataset.
- Dataset download links.
- Dataset structure.
- Contact transfer and optimization.
- Render new human object interaction.
- Autoencoder to compute dense correspondence.
- Re-render ProciGen interaction images.
- Dec 31, 2024. Code to synthesize and render interactions released!
- July 08, 2024. Code to align objects and re-render the ProciGen dataset released!
This code is tested with Python 3.10 on Ubuntu 22.04. See below for details on installing the runtime environment.
If you only want to re-render the ProciGen sequences, the following steps are sufficient:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install bpy==3.5.0
pip install git+https://github.com/mattloper/chumpy # compatible with newer numpy version
pip install opencv-python trimesh tqdm open3d objaverse imageio-ffmpeg
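As a quick sanity check of this minimal setup, an import test along the lines of the sketch below (not part of the repo) should run without errors:

```python
# Quick sanity check of the re-rendering environment (sketch, not part of the repo).
import torch
import bpy          # Blender as a Python module, from pip install bpy==3.5.0
import cv2
import trimesh
import open3d as o3d
import objaverse

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("blender (bpy):", bpy.app.version_string)
print("open3d:", o3d.__version__)
```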
Additional dependencies for synthesizing new interactions:
conda install -c conda-forge igl
# Install pytorch3d
pip install iopath
pip install --no-index --no-cache-dir pytorch3d -f 'https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py310_cu121_pyt251/download.html'
Install the mesh intersection library (this works for CUDA 10, 11, and 12):
export CHORE_PATH=${PWD}
git clone https://github.com/NVIDIA/cuda-samples.git external/cuda-samples
export CUDA_SAMPLES_INC=${CHORE_PATH}/external/cuda-samples/Common/
git clone https://github.com/vchoutas/torch-mesh-isect external/torch-mesh-isect
cp external/torch-mesh-isect/include/double_vec_ops.h external/torch-mesh-isect/src/
Add these lines to external/torch-mesh-isect/src/bvh.cpp before AT_CHECK is first used, i.e. before this line (reference):
#ifndef AT_CHECK
#define AT_CHECK TORCH_CHECK
#endif
Finally, run pip install external/torch-mesh-isect/.
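Once the build finishes, the synthesis-specific dependencies can be checked with a short import test. A sketch is given below; the assumption is that torch-mesh-isect installs its Python package as mesh_intersection:

```python
# Sanity check for the synthesis dependencies (sketch; the mesh_intersection module name is an assumption).
import igl                       # libigl bindings from conda-forge
import pytorch3d
from mesh_intersection.bvh_search_tree import BVH   # from torch-mesh-isect

print("pytorch3d:", pytorch3d.__version__)
print("mesh intersection BVH op available:", BVH is not None)
```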
Download links: the dataset can be downloaded from Edmond. By downloading the dataset, you agree to the LICENSE.
We follow a similar structure to the BEHAVE dataset. Namely, each sequence is organized as follows:
|--ROOT
|----sequence_name
|------info.json # a file storing the calibration information for the sequence
|------<frame_name> # one frame folder
|--------k*.color.jpg # color images of the frame
|--------k*.person_mask.png # human masks
|--------k*.obj_rend_mask.png # object masks
|--------k*.obj_rend_full.png # object masks without occlusion
|--------k*.depth.png # [optional] depth images
|--------k*.grid_df_res128_b0.5.npz # [optional] precomputed occupancy for human object segmentation
|--------person
|----------fit01 # registered SMPL-H mesh and parameters
|--------<object_name>
|----------fit01 # object registrations
We store the SMPL-H parameters and corresponding mesh inside each person/fit01 folder. We use the MANO_V1.2 pkl model with 10 shape parameters and no PCA compression for the hand poses. If you would like to use other body models, e.g. SMPL or SMPL-X, please refer to this repo for conversions between different body models.
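To see which parameters a registration actually contains, you can inspect a fit01 folder directly. The sketch below only lists the pickle keys and mesh sizes rather than assuming a particular field layout; the frame folder name is hypothetical.

```python
# Sketch: inspect one person registration without assuming its field names.
import glob
import os
import pickle

import trimesh

frame_dir = "example/ProciGen/Date04_Subxx_stool_batch01/t0003.000"  # hypothetical frame folder
fit_dir = os.path.join(frame_dir, "person", "fit01")

for pkl_file in sorted(glob.glob(os.path.join(fit_dir, "*.pkl"))):
    with open(pkl_file, "rb") as f:
        params = pickle.load(f)
    print(pkl_file, "->", sorted(params.keys()))      # SMPL-H parameter fields

for mesh_file in sorted(glob.glob(os.path.join(fit_dir, "*.ply"))):
    mesh = trimesh.load(mesh_file, process=False)
    print(mesh_file, "->", mesh.vertices.shape[0], "vertices")
```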
We save the simplified object mesh in the file <object_name>/fit01/<object_name>_fit.ply. The corresponding object pose and other information are saved in <object_name>/fit01/<object_name>_fit.pkl.
Inside the pickle file, you can find the fields synset_id and ins_name, which can be used to identify the original textured object model.
For the ProciGen dataset, we used object models from the ShapeNet, Objaverse, and ABO datasets. You can identify the original model from these datasets as follows:
- ABO shapes: sequences using this dataset can be identified by names whose suffix contains abo. The ins_name is the 3dmodel_id from the ABO dataset, which is a unique id for the object model.
- Objaverse shapes: the following categories are from Objaverse: backpack, basketball, all boxes, stool, suitcase, yoga ball. Similar to ABO, the ins_name is the uid in Objaverse, which can be used directly to download the object model.
- ShapeNet shapes: besides ABO and Objaverse, all other sequences use shapes from ShapeNet. The ShapeNet model can be found at <ShapeNet_root>/<synset_id>/<ins_name>/models.
We implement a function to get the shape dataset name from a sequence name; see get_shape_datasetname in render/utils.py.
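Based only on the rules above (names containing abo -> ABO, the listed object categories -> Objaverse, everything else -> ShapeNet), a minimal re-implementation could look like the sketch below; the actual get_shape_datasetname in render/utils.py may differ in details, and the exact category spellings are assumptions:

```python
# Sketch: guess the source shape dataset from a ProciGen sequence/object name.
# Follows the rules described above; the repo's get_shape_datasetname may differ.
OBJAVERSE_CATEGORIES = {
    "backpack", "basketball", "stool", "suitcase", "yogaball",
    "boxlarge", "boxlong", "boxmedium", "boxsmall", "boxtiny",  # "all boxes" (names assumed)
}

def guess_shape_dataset(seq_name: str, obj_name: str) -> str:
    if "abo" in seq_name.lower():
        return "abo"
    if obj_name.lower() in OBJAVERSE_CATEGORIES:
        return "objaverse"
    return "shapenet"

print(guess_shape_dataset("Date04_Subxx_stool_batch01", "stool"))  # -> objaverse
```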
Using the object pose: given an object point p in the canonical shape space, it can be transformed to the current interaction space simply by p' = Rp + t, where R and t come from the entries rot and trans stored in the <object_name>_fit.pkl file.
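As a concrete illustration, a minimal sketch applying this transform is shown below. It assumes rot is stored as a 3x3 rotation matrix (if it is an axis-angle vector, convert it first, e.g. with cv2.Rodrigues) and uses a hypothetical sequence path; only the rot and trans field names come from the description above.

```python
# Sketch: apply the stored object pose p' = R p + t to canonical-space points.
import pickle

import numpy as np

fit_pkl = "example/ProciGen/Date04_Subxx_stool_batch01/t0003.000/stool/fit01/stool_fit.pkl"  # hypothetical
with open(fit_pkl, "rb") as f:
    params = pickle.load(f)

R = np.asarray(params["rot"]).reshape(3, 3)    # rotation (assumed to be a 3x3 matrix)
t = np.asarray(params["trans"]).reshape(3)     # translation

points_canonical = np.random.rand(100, 3)      # placeholder for object points in canonical space
points_interaction = points_canonical @ R.T + t   # row-wise p' = R p + t
print(points_interaction.shape)
```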
Example: you can find examples of how to align shapes from the original datasets to our ProciGen data in render/align_shapes.py. Note that Blender needs to be installed on your system for the Objaverse and ABO datasets. The code was tested with Blender 2.91 and 3.5. See re-render ProciGen for details about object dataset preparation.
We use the camera parameters from BEHAVE and InterCap to render the synthesized interactions. The camera intrinsics and extrinsics are saved in the <sequence>/info.json file.
For more details about reading and loading, please check load intrinsics, extrinsics.
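If you prefer to inspect the calibration yourself before using the provided loaders, the sketch below only prints the nested key structure of info.json rather than assuming specific field names:

```python
# Sketch: print the key structure of a sequence's info.json (field names not assumed).
import json

info_path = "example/ProciGen/Date04_Subxx_stool_batch01/info.json"  # demo sequence path
with open(info_path) as f:
    info = json.load(f)

def describe(node, indent=0):
    """Recursively print dict keys and list lengths without dumping full arrays."""
    if isinstance(node, dict):
        for key, value in node.items():
            print(" " * indent + str(key))
            describe(value, indent + 2)
    elif isinstance(node, list):
        print(" " * indent + f"list of length {len(node)}")

describe(info)
```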
We provide a quick start example below; for more advanced usage, please refer to synthesize ProciGen.
We provide some mini examples for a quick start, which allow you to test re-rendering the ProciGen dataset and synthesizing and rendering interactions with new humans and objects. Download them with:
bash scripts/download_demo.sh
SMPLH body model: we use SMPL-H (mano_v1.2) from this website. Download and unzip it to a local path, then set SMPL_MODEL_ROOT in paths.py accordingly.
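For example, after unzipping mano_v1.2, the entry in paths.py might look like the line below; the exact sub-folder expected by the code is an assumption, so check the comments in paths.py.

```python
# In paths.py (sketch): point SMPL_MODEL_ROOT to your local mano_v1.2 models.
SMPL_MODEL_ROOT = "/path/to/mano_v1.2/models"  # hypothetical path; adjust to your setup
```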
To render the sequence downloaded with the demo data, run:
python render/blender_rerender.py -s example/ProciGen/Date04_Subxx_stool_batch01 -o outputs/rerender
This will save the re-rendered images to outputs/rerender. Use -s <ProciGen sequence path> to re-render any other sequence.
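To re-render several downloaded sequences in one go, the same command can be wrapped in a small loop, e.g. the sketch below (the sequence root is whatever folder you downloaded ProciGen into):

```python
# Sketch: batch re-render all sequences under a ProciGen root folder.
import glob
import subprocess

procigen_root = "example/ProciGen"   # or the folder where the full dataset was downloaded
out_root = "outputs/rerender"

for seq_path in sorted(glob.glob(f"{procigen_root}/*")):
    subprocess.run(
        ["python", "render/blender_rerender.py", "-s", seq_path, "-o", out_root],
        check=True,
    )
```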
With the demo data downloaded, run:
python synz/synz_batch.py --mode demo
This will sample 16 interactions from the BEHAVE stool sequence and synthesize interactions with new stool shapes. Results will be saved to outputs/params.
To render new interactions, run:
python render/render_hoi.py -p outputs/params/test-stool --source objaverse --obj_name stool -o outputs/render
This will render the synthesized interactions to outputs/render.
For synthesizing or re-rendering more objects from other datasets, please refer to synthesize ProciGen.
If you encounter a segmentation fault when running synz/synz_batch.py, please check whether the installed numpy version is lower than 2.0. Open3D has problems with newer numpy versions; downgrading to numpy 1.26.4 could solve this problem.
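A quick way to check the installed version (sketch):

```python
# Check the installed numpy version; versions >= 2.0 are known to cause issues with open3d here.
import numpy as np

print("numpy:", np.__version__)
# If this prints 2.x, downgrade with:  pip install numpy==1.26.4
```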
Please see LICENSE.
If you use the data or code, please cite:
@inproceedings{xie2023template_free,
title = {Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation},
author = {Xie, Xianghui and Bhatnagar, Bharat Lal and Lenssen, Jan Eric and Pons-Moll, Gerard},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
}
Our dataset leverages these datasets; please also consider citing them:
@inproceedings{bhatnagar22behave,
title = {BEHAVE: Dataset and Method for Tracking Human Object Interactions},
author = {Bhatnagar, Bharat Lal and Xie, Xianghui and Petrov, Ilya and Sminchisescu, Cristian and Theobalt, Christian and Pons-Moll, Gerard},
booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2022},
}
@inproceedings{huang2022intercap,
title = {{InterCap}: {J}oint Markerless {3D} Tracking of Humans and Objects in Interaction},
author = {Huang, Yinghao and Taheri, Omid and Black, Michael J. and Tzionas, Dimitrios},
booktitle = {{German Conference on Pattern Recognition (GCPR)}},
volume = {13485},
pages = {281--299},
year = {2022},
organization = {Springer},
series = {Lecture Notes in Computer Science}
}
@techreport{shapenet2015,
title = {{ShapeNet: An Information-Rich 3D Model Repository}},
author = {Chang, Angel X. and Funkhouser, Thomas and Guibas, Leonidas and Hanrahan, Pat and Huang, Qixing and Li, Zimo and Savarese, Silvio and Savva, Manolis and Song, Shuran and Su, Hao and Xiao, Jianxiong and Yi, Li and Yu, Fisher},
number = {arXiv:1512.03012 [cs.GR]},
institution = {Stanford University --- Princeton University --- Toyota Technological Institute at Chicago},
year = {2015}
}
@article{collins2022abo,
title={ABO: Dataset and Benchmarks for Real-World 3D Object Understanding},
author={Collins, Jasmine and Goel, Shubham and Deng, Kenan and Luthra, Achleshwar and
Xu, Leon and Gundogdu, Erhan and Zhang, Xi and Yago Vicente, Tomas F and
Dideriksen, Thomas and Arora, Himanshu and Guillaumin, Matthieu and
Malik, Jitendra},
journal={CVPR},
year={2022}
}
@article{objaverse,
title={Objaverse: A Universe of Annotated 3D Objects},
author={Matt Deitke and Dustin Schwenk and Jordi Salvador and Luca Weihs and
Oscar Michel and Eli VanderBilt and Ludwig Schmidt and
Kiana Ehsani and Aniruddha Kembhavi and Ali Farhadi},
journal={arXiv preprint arXiv:2212.08051},
year={2022}
}
@inproceedings{bhatnagar2019mgn,
title = {Multi-Garment Net: Learning to Dress 3D People from Images},
author = {Bhatnagar, Bharat Lal and Tiwari, Garvita and Theobalt, Christian and Pons-Moll, Gerard},
booktitle = {{IEEE} International Conference on Computer Vision ({ICCV})},
month = {oct},
organization = {{IEEE}},
year = {2019},
}
This project leverages the following excellent works; we thank the authors for open-sourcing their code and data: