Commit 4646b0c (0 parents): showing 99 changed files with 15,429 additions and 0 deletions.
@@ -0,0 +1,7 @@
__pycache__/
GAPartNet_All
perception/
wandb/
ckpt/
image_kuafu
output/GAPartNet_result/
@@ -0,0 +1,100 @@
<h2 align="center">
  <b>GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts</b>

  <b><i>CVPR 2023 Highlight</i></b>

  <div align="center">
    <a href="https://cvpr.thecvf.com/virtual/2023/poster/22552" target="_blank">
      <img src="https://img.shields.io/badge/CVPR 2023-Highlight-red"></a>
    <a href="https://arxiv.org/abs/2211.05272" target="_blank">
      <img src="https://img.shields.io/badge/Paper-arXiv-green" alt="Paper arXiv"></a>
    <a href="https://pku-epic.github.io/GAPartNet/" target="_blank">
      <img src="https://img.shields.io/badge/Page-GAPartNet-blue" alt="Project Page"/></a>
  </div>
</h2>

This is the official repository of [**GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts**](https://arxiv.org/abs/2211.05272).

For more information, please visit our [**project page**](https://pku-epic.github.io/GAPartNet/).

## 💡 News
- `2023/6/28` We polished our model with the user-friendly Lightning framework and released detailed training code! Check the `gapartnet` folder for more details!
- `2023/5/21` The GAPartNet dataset has been released, including object & part assets and annotations, rendered point cloud data, and our pre-trained checkpoint.

## GAPartNet Dataset

(New!) The GAPartNet dataset has been released, including object & part assets and annotations, rendered point cloud data, and our pre-trained checkpoint.

To obtain our dataset, please fill out [**this form**](https://forms.gle/3qzv8z5vP2BT5ARN7) and check the [**Terms&Conditions**](https://docs.google.com/document/d/1kjFCTcDLtaycZiJVmSVhT9Yw8oCAHl-3XKdJapvRdW0/edit?usp=sharing). Please cite our paper if you use our dataset.

Download our pre-trained checkpoint [**here**](https://drive.google.com/file/d/1D1PwfXPYPtxadthKAJdehhIBbPEyBB6X/view?usp=sharing)! (Note that the checkpoint included in the dataset has expired; please use this one instead.)

## GAPartNet Network and Inference

We release our network and checkpoint; check the `gapartnet` folder for more details. You can segment parts and estimate their poses. We also provide visualization code. Here is a visualization example:

![example](gapartnet/output/example.png)
![example2](gapartnet/output/example2.png)

## How to Use Our Code and Model

### 1. Install dependencies
- Python 3.8
- PyTorch >= 1.11.0
- CUDA >= 11.3
- Open3D with extension (see the install guide below)
- epic_ops (see the install guide below)
- pointnet2_ops (see the install guide below)
- Other pip packages

### 2. Install Open3D, epic_ops, and pointnet2_ops
See [GAPartNet_env](https://github.com/geng-haoran/GAPartNet_env) for more details: this repo includes Open3D, [epic_ops](https://github.com/geng-haoran/epic_ops), and pointnet2_ops. You can install them by following the instructions there.

### 3. Download our model and data
See the `gapartnet` folder for more details.

### 4. Inference and visualization
```shell
cd gapartnet
CUDA_VISIBLE_DEVICES=0 \
python train.py test -c gapartnet.yaml \
    --model.init_args.ckpt ckpt/new.ckpt
```

### 5. Training
You can run the following command to train the model:
```shell
cd gapartnet
CUDA_VISIBLE_DEVICES=0 \
python train.py fit -c gapartnet.yaml
```

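The `fit`/`test` subcommands and the nested `--model.init_args.ckpt` override above follow the PyTorch Lightning CLI convention. As a rough, hypothetical sketch of how such an entry point is typically wired (the repository's actual `train.py` defines its own model, data module, and options), a minimal LightningCLI script looks like this:

```python
# Hypothetical sketch of a LightningCLI entry point; not the repository's
# actual train.py. With subclass_mode_* enabled, the YAML config selects
# concrete classes via `class_path`/`init_args`, which is why the checkpoint
# can be overridden on the command line as `--model.init_args.ckpt ...`.
import pytorch_lightning as pl
from pytorch_lightning.cli import LightningCLI


def main():
    LightningCLI(
        pl.LightningModule,
        pl.LightningDataModule,
        subclass_mode_model=True,
        subclass_mode_data=True,
    )


if __name__ == "__main__":
    main()
```
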
## Citation
If you find our work useful in your research, please consider citing:

```bibtex
@article{geng2022gapartnet,
  title={GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts},
  author={Geng, Haoran and Xu, Helin and Zhao, Chengyang and Xu, Chao and Yi, Li and Huang, Siyuan and Wang, He},
  journal={arXiv preprint arXiv:2211.05272},
  year={2022}
}
```

## Contact
If you have any questions, please open a GitHub issue or contact us:

Haoran Geng: [email protected]

Helin Xu: [email protected]

Chengyang Zhao: [email protected]

He Wang: [email protected]
@@ -0,0 +1,14 @@
.DS_Store
*/.DS_Store
*.pyc
*.png
log_*.txt

__pycache__/
example_data/
example_rendered/
sampled_data/
visu/

dataset_*/
log_*/
@@ -0,0 +1,100 @@
# GAPartNet Dataset

## Data Format

The GAPartNet dataset is built on two existing datasets, PartNet-Mobility and AKB-48, from which the 3D object shapes are collected, cleaned, and equipped with new, uniform GAPart-based semantics and pose annotations. The model IDs we use are provided in `render_tools/meta/{partnet_all_id_list.txt, akb48_all_id_list.txt}`.

Four additional files accompany each object shape from PartNet-Mobility, providing annotations in the following formats:

- `semantics_gapartnet.txt`: This file contains link semantics. Each line corresponds to a link in the kinematic chain, as indicated in `mobility_annotation_gapartnet.urdf`, formatted as "[link_name] [joint_type] [semantics]".
- `mobility_annotation_gapartnet.urdf`: This file describes the kinematic chain, including our newly re-merged links and modified meshes. Each GAPart in the object shape corresponds to an individual link. We recommend using this file for annotation (semantics, poses) rendering and part-property queries.
- `mobility_texture_gapartnet.urdf`: This file also describes the kinematic chain but uses the original meshes, so each GAPart is not guaranteed to be an individual link. As mentioned in our paper, the GAPart semantics are newly defined, so the meshes and annotations in the original assets may be inconsistent with our definition, which requires a finer level of detail. For example, in the original mesh for "Oven" or "Dishwasher", a line_fixed_handle and a hinge_door could be merged into a single .obj mesh file. To address this, we modified the meshes to separate the GAParts. However, these mesh modifications may have broken the textures, resulting in poor rendering quality. As a temporary solution, we provide this file and use the original meshes for texture rendering. Example code for the joint correspondence between the kinematic chains in `mobility_annotation_gapartnet.urdf` and `mobility_texture_gapartnet.urdf` can be found in our rendering toolkit.
- `link_annotation_gapartnet.json`: This JSON file contains the GAPart semantics and pose of each link in the kinematic chain in `mobility_annotation_gapartnet.urdf`. Specifically, for each link, "link_name", "is_gapart", "category", and "bbox" are provided, where "bbox" gives the positions of the eight vertices of the part's 3D bounding box in the rest state, i.e., with all joint states set to zero. The order of the eight vertices is: [(-x,+y,+z), (+x,+y,+z), (+x,-y,+z), (-x,-y,+z), (-x,+y,-z), (+x,+y,-z), (+x,-y,-z), (-x,-y,-z)].

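To make the formats above concrete, here is a minimal, hypothetical sketch (not part of the official toolkit) that parses `semantics_gapartnet.txt` and `link_annotation_gapartnet.json` and turns the documented vertex order into a part center and axis directions. The path and the assumption that the JSON stores one record per link are illustrative only.

```python
import json

import numpy as np

obj_dir = "path/to/one_object"  # hypothetical directory for a single object shape

# semantics_gapartnet.txt: one "[link_name] [joint_type] [semantics]" line per link.
link_semantics = {}
with open(f"{obj_dir}/semantics_gapartnet.txt") as f:
    for line in f:
        if line.strip():
            link_name, joint_type, semantics = line.split()
            link_semantics[link_name] = (joint_type, semantics)

# link_annotation_gapartnet.json: assumed to hold one record per link with the
# fields "link_name", "is_gapart", "category", and "bbox" described above.
with open(f"{obj_dir}/link_annotation_gapartnet.json") as f:
    links = json.load(f)

for link in links:
    if not link["is_gapart"]:
        continue
    bbox = np.asarray(link["bbox"], dtype=np.float64)  # (8, 3) rest-state vertices
    center = bbox.mean(axis=0)
    # With the documented ordering, vertex 0 is (-x,+y,+z), vertex 1 is (+x,+y,+z),
    # vertex 3 is (-x,-y,+z), and vertex 4 is (-x,+y,-z), so the differences below
    # point along the part's +x, +y, and +z axes respectively.
    x_axis = bbox[1] - bbox[0]
    y_axis = bbox[0] - bbox[3]
    z_axis = bbox[0] - bbox[4]
    x_axis, y_axis, z_axis = (v / np.linalg.norm(v) for v in (x_axis, y_axis, z_axis))
    print(link["link_name"], link["category"], center, x_axis, y_axis, z_axis)
```
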
## Data Split

The data splits used in our paper can be found in `render_tools/meta/{partnet_all_split.json, akb48_all_split.json}`. We split all 27 object categories into 17 seen and 10 unseen categories. Each seen category was further split into seen and unseen instances. This two-level split ensures that all GAPart classes exist in both seen and unseen object categories, which helps evaluate intra- and inter-category generalizability.

## Rendering Toolkit

We provide an example toolkit for rendering and visualizing our GAPartNet dataset, located in `render_tools/`. This toolkit relies on [SAPIEN](https://github.com/haosulab/SAPIEN). To use it, please check the requirements in `render_tools/requirements.txt` and install the required packages.

To render a single view of an object shape, use the `render_tools/render.py` script with the following command:

```shell
python render.py --model_id {MODEL_ID} \
                 --camera_idx {CAMERA_INDEX} \
                 --render_idx {RENDER_INDEX} \
                 --height {HEIGHT} \
                 --width {WIDTH} \
                 --ray_tracing {USE_RAY_TRACING} \
                 --replace_texture {REPLACE_TEXTURE}
```

The parameters are as follows:

- `MODEL_ID`: The ID of the object shape you want to render.
- `CAMERA_INDEX`: The index of the selected camera position range. This index is pre-defined in `render_tools/config_utils.py`.
- `RENDER_INDEX`: The index of the specific rendered view.
- `HEIGHT`: The height of the rendered image.
- `WIDTH`: The width of the rendered image.
- `USE_RAY_TRACING`: A boolean value specifying whether to use ray tracing for rendering. Use 'true' to enable and 'false' to disable.
- `REPLACE_TEXTURE`: A boolean value that determines whether to use the original or the modified texture for rendering. Set it to 'true' to use the original texture (better quality) and 'false' to use the modified one.

To render the entire dataset, use the `render_tools/render_all_partnet.py` script with the following command:

```shell
python render_all_partnet.py --ray_tracing {USE_RAY_TRACING} \
                             --replace_texture {REPLACE_TEXTURE} \
                             --start_idx {START_INDEX} \
                             --num_render {NUM_RENDER} \
                             --log_dir {LOG_DIR}
```

The parameters are defined as follows:

- `USE_RAY_TRACING` and `REPLACE_TEXTURE`: These parameters are identical to those described earlier.
- `START_INDEX`: Specifies the starting render index, which is the same as the `RENDER_INDEX` mentioned previously.
- `NUM_RENDER`: Specifies the number of views to render for each object shape and camera range.
- `LOG_DIR`: The directory where the log files will be saved.

To visualize the rendering results, use the `render_tools/visualize.py` script with this command:

```shell
python visualize.py --model_id {MODEL_ID} \
                    --category {CATEGORY} \
                    --camera_position_index {CAMERA_INDEX} \
                    --render_index {RENDER_INDEX}
```

The parameters are as follows:

- `MODEL_ID`: The ID of the object shape to visualize.
- `CATEGORY`: The category of the object.
- `CAMERA_INDEX`: The index of the selected range for the camera position, pre-defined in `render_tools/config_utils.py`.
- `RENDER_INDEX`: The index of the view that you wish to visualize.

## Pre-processing Toolkit

In addition to the rendering toolkit, we also provide a pre-processing toolkit to convert the rendered results into our model's input data format. This toolkit loads the rendered results, generates a partial point cloud via back-projection, and uses Farthest-Point-Sampling (FPS) to sample points from the dense point cloud.

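As a rough, self-contained illustration of these two steps (not the toolkit's actual implementation), the sketch below back-projects a depth map into a partial point cloud using a pinhole camera model and then downsamples it with a simple CPU farthest-point-sampling loop. The intrinsics and the random depth map are placeholders; the real toolkit reads rendered outputs and uses the GPU FPS library installed below.

```python
import numpy as np


def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map into an (N, 3) point cloud in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)


def farthest_point_sample(points, num_samples):
    """Greedy FPS on CPU; the toolkit uses a CUDA implementation instead."""
    n = points.shape[0]
    selected = np.zeros(num_samples, dtype=np.int64)
    distances = np.full(n, np.inf)
    farthest = 0
    for i in range(num_samples):
        selected[i] = farthest
        diff = points - points[farthest]
        # Keep, for every point, its squared distance to the nearest selected point.
        distances = np.minimum(distances, np.einsum("ij,ij->i", diff, diff))
        farthest = int(np.argmax(distances))
    return points[selected]


if __name__ == "__main__":
    # Toy-sized placeholder depth map and intrinsics, for illustration only.
    depth = np.random.uniform(0.5, 2.0, size=(120, 160)).astype(np.float32)
    cloud = backproject_depth(depth, fx=200.0, fy=200.0, cx=80.0, cy=60.0)
    sampled = farthest_point_sample(cloud, num_samples=2048)
    print(cloud.shape, sampled.shape)
```
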
To use the toolkit, first install the PointNet++ library in `process_tools/utils/pointnet_lib` with the following command: `python setup.py install`. This installation enables GPU-accelerated FPS. The library is sourced from [HalfSummer11/CAPTRA](https://github.com/HalfSummer11/CAPTRA), which is based on [sshaoshuai/Pointnet2.PyTorch](https://github.com/sshaoshuai/Pointnet2.PyTorch) and [yanx27/Pointnet_Pointnet2_pytorch](https://github.com/yanx27/Pointnet_Pointnet2_pytorch).

To pre-process the rendered results, use the `process_tools/convert_rendered_into_input.py` script with the following command:

```shell
python convert_rendered_into_input.py --data_path {DATA_PATH} \
                                      --save_path {SAVE_PATH} \
                                      --num_points {NUM_POINTS} \
                                      --visualize {VISUALIZE}
```

The parameters are as follows:

- `DATA_PATH`: Path to the directory containing the rendered results.
- `SAVE_PATH`: Path to the directory where the pre-processed results will be stored.
- `NUM_POINTS`: The number of points to sample from the partial point cloud.
- `VISUALIZE`: A boolean value indicating whether to visualize the pre-processed results. Use 'true' to enable and 'false' to disable.