Im3D - Single View Image to 3D Point Cloud Reconstruction

Introduction

This project focuses on reconstructing 3D point clouds from single-view RGB images, specifically within the chair class of the ShapeNet rendered dataset. The objective was to generate accurate 3D point clouds from 2D images using two different architectures, each with its own strengths in terms of latent space manipulation and control over the output.

PCA + AutoEncoder: Manipulates the components of a learned latent vector to generate 3D point clouds, but does not allow conditioned generation.
Conditional Variational Auto Encoder: Produces conditioned 3D point clouds by adding a one-hot encoded condition vector to the latent space.

System Architecture

This section describes the two architectures implemented for point cloud generation.

1. Principal Component Analysis (PCA) + Auto-Encoder

This approach begins by extracting features from the 2D input image with a pre-trained ResNet-18. PCA then reduces the resulting feature vector to 1x4, on the assumption that a chair has 4 important parts. Manipulating the components of this vector changes the generated 3D point cloud. However, because the vector is learned, the mapping from each component to a part of the chair has to be estimated, which restricts user-controlled manipulation of a single attribute.
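The PCA step described above can be sketched with NumPy alone. This is a minimal illustration, not the project's actual code: the random matrix stands in for ResNet-18 features (512 values per image is the size of ResNet-18's pooled output), and the helper names `fit_pca`, `project`, and `vary_component` are hypothetical.

```python
import numpy as np

def fit_pca(features, n_components=4):
    """Fit PCA with an SVD and return (mean, principal directions)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(features, mean, components):
    """Map feature vectors to their low-dimensional PCA codes."""
    return (features - mean) @ components.T

def vary_component(code, index, delta):
    """Return a copy of a PCA code with one component shifted by delta."""
    out = code.copy()
    out[index] += delta
    return out

rng = np.random.default_rng(0)
# Assumption: random stand-in for 100 ResNet-18 feature vectors.
features = rng.normal(size=(100, 512))

mean, components = fit_pca(features, n_components=4)
codes = project(features, mean, components)            # shape (100, 4)
edited = vary_component(codes[0], index=2, delta=1.5)  # change one component only
```

In a full pipeline, a code such as `edited` would be passed to the trained decoder to produce the modified point cloud; which component corresponds to which part of the chair still has to be found by inspection.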

2. Conditional Variational Auto Encoder (C-VAE)

The need for user-controlled generation, which the previous architecture could not provide, led to the development of this second model. In this approach, a one-hot encoded vector is incorporated into the latent space, allowing the generation of 3D point clouds conditioned on that vector. During inference, a random latent vector is sampled, and the desired output is generated by supplying a user-defined condition through the conditional vector. This gives more control and flexibility in producing the 3D point cloud the user wants. (The one-hot encoded vector is 17-dimensional; each entry corresponds to a specific semantic attribute of the chair.)
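The inference procedure described above can be sketched as follows. Only the 17-dimensional condition vector comes from this README. The latent size, the number of output points, the condition index, and the single-layer `toy_decoder` with random weights are all assumptions that stand in for the trained C-VAE decoder.

```python
import numpy as np

LATENT_DIM = 32      # assumption: latent size is not stated in this README
NUM_CONDITIONS = 17  # from the README: 17-dimensional one-hot vector
NUM_POINTS = 1024    # assumption: points per generated cloud

def one_hot(index, size=NUM_CONDITIONS):
    """Build a one-hot condition vector with a single active entry."""
    vec = np.zeros(size, dtype=np.float32)
    vec[index] = 1.0
    return vec

def build_decoder_input(condition_index, rng):
    """Sample a random latent vector and append the condition to it."""
    z = rng.standard_normal(LATENT_DIM).astype(np.float32)
    return np.concatenate([z, one_hot(condition_index)])

def toy_decoder(decoder_input, weights):
    """Placeholder for the trained decoder: one linear layer, reshaped to (N, 3)."""
    return (decoder_input @ weights).reshape(NUM_POINTS, 3)

rng = np.random.default_rng(0)
# Random weights; a real model would load trained decoder parameters instead.
weights = rng.standard_normal(
    (LATENT_DIM + NUM_CONDITIONS, NUM_POINTS * 3)
).astype(np.float32)

# Hypothetical index: the position of each attribute in the vector is not documented here.
decoder_input = build_decoder_input(condition_index=5, rng=rng)
points = toy_decoder(decoder_input, weights)  # shape (1024, 3)
```

Changing `condition_index` while keeping the same latent sample is how the condition, rather than the noise, would select the attribute in the generated chair.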

Results from PCA + Auto Encoder (Architecture 1)

Results with the PCA components varied: the first row shows chair type 1 and the second row chair type 2, with each of the 4 components varied individually.

Results from C-VAE (Architecture 2)


Here the conditional vector was set to generate "Swivel legs" during inference from a random noise vector.
Here the conditional vector was set to generate "Cantilever legs" during inference from a random noise vector.
