3D computer vision enables us to understand the spatial arrangement, orientation, shape, and volumetric characteristics of objects in the 3D world, leading to high-level semantic insights. This repository is dedicated to tutorials on 3D computer vision, focusing solely on learning-based methodologies, particularly with neural networks.
This notebook introduces common formats for representing 3D objects, including meshes, point clouds, and voxels. It demonstrates how to use PyTorch3D for rendering these representations, as illustrated below:
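If you want a concrete feel for these formats before opening the notebook, here is a minimal sketch (not the notebook's actual code) that builds a mesh and a point cloud with PyTorch3D and turns a voxel occupancy grid into a mesh via `pytorch3d.ops.cubify`; the tetrahedron and the occupancy grid are made up purely for illustration:

```python
import torch
from pytorch3d.structures import Meshes, Pointclouds
from pytorch3d.ops import sample_points_from_meshes, cubify

# Mesh: a single tetrahedron (4 vertices, 4 triangular faces).
verts = torch.tensor(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
)
faces = torch.tensor([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
mesh = Meshes(verts=[verts], faces=[faces])

# Point cloud: sample points from the mesh surface.
points = sample_points_from_meshes(mesh, num_samples=500)  # (1, 500, 3)
point_cloud = Pointclouds(points=points)

# Voxels: a dense occupancy grid, converted back to a mesh with cubify.
voxels = torch.zeros(1, 16, 16, 16)
voxels[:, 4:12, 4:12, 4:12] = 1.0          # a solid 8x8x8 block
voxel_mesh = cubify(voxels, thresh=0.5)

print(mesh.verts_packed().shape,
      point_cloud.points_packed().shape,
      voxel_mesh.verts_packed().shape)
```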
3D reconstruction from a single view is very similar to the process through which we recognize objects in the real world. When we look at a chair from one angle, we know it is a chair and can intuitively imagine what it would look like from other angles. A chair seen from one angle does not turn into an airplane when seen from another. That being said, if you were determined to design an airplane that looks like a chair from one specific viewpoint, then everything in this post is inapplicable. 🤣
Neural Radiance Fields (NeRF) is a revolutionary approach in computer graphics and vision for synthesizing highly realistic images from a sparse set of input images. At its core, NeRF models a continuous volumetric scene function with a multi-layer perceptron (MLP) that maps spatial coordinates and viewing directions to color and density. In this tutorial, I aim to demystify NeRF, explaining it in detail and implementing it from scratch in PyTorch.
NOTE: More work is required to make inverse sampling and fine sampling work.
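To make the idea concrete before diving in, below is a minimal, self-contained sketch of the two core ingredients: positional encoding plus an MLP that maps encoded positions and view directions to color and density, followed by the volume-rendering step that composites samples along each ray. It is not the tutorial's implementation; names such as `TinyNeRF` and `volume_render` are illustrative, and it uses coarse uniform sampling only (no hierarchical fine sampling).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def positional_encoding(x, num_freqs):
    """gamma(x): sin/cos of the input at frequencies 2^0 * pi ... 2^(L-1) * pi."""
    freqs = (2.0 ** torch.arange(num_freqs, device=x.device)) * torch.pi
    angles = x[..., None] * freqs                      # (..., 3, L)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1).flatten(-2)

class TinyNeRF(nn.Module):
    """Minimal MLP: encoded (x, y, z) and view direction -> (RGB, density)."""

    def __init__(self, pos_freqs=10, dir_freqs=4, hidden=256):
        super().__init__()
        self.pos_freqs, self.dir_freqs = pos_freqs, dir_freqs
        self.trunk = nn.Sequential(
            nn.Linear(3 * 2 * pos_freqs, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)
        self.color_head = nn.Sequential(
            nn.Linear(hidden + 3 * 2 * dir_freqs, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        h = self.trunk(positional_encoding(xyz, self.pos_freqs))
        sigma = F.relu(self.density_head(h))                       # non-negative density
        d = positional_encoding(view_dir, self.dir_freqs)
        rgb = self.color_head(torch.cat([h, d], dim=-1))           # view-dependent colour
        return rgb, sigma

def volume_render(rgb, sigma, deltas):
    """Alpha-composite per-sample colours along each ray."""
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * deltas)           # (rays, samples)
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha + 1e-10], dim=-1), dim=-1
    )[:, :-1]                                                      # accumulated transmittance T_i
    weights = alpha * trans
    return (weights[..., None] * rgb).sum(dim=-2)                  # (rays, 3)

# 8 rays, 128 uniformly spaced samples per ray.
model = TinyNeRF()
rays, samples = 8, 128
xyz = torch.rand(rays, samples, 3)
dirs = F.normalize(torch.randn(rays, 1, 3), dim=-1).expand(rays, samples, 3)
rgb, sigma = model(xyz, dirs)                                      # (8, 128, 3), (8, 128, 1)
pixels = volume_render(rgb, sigma, torch.full((rays, samples), 0.01))
print(pixels.shape)                                                # torch.Size([8, 3])
```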
3D Gaussian Splatting (3DGS) is a powerful technique for generating novel views from a set of images and their poses. In this section, I will cover the basics of 3DGS.
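As a rough preview of what 3DGS optimizes, the sketch below collects the per-Gaussian parameters (means, scales, rotations, opacities, colors) as learnable tensors. It is only a conceptual parameter container, not the repository's implementation; `GaussianCloud` is a made-up name, and a full pipeline would add spherical-harmonic colors, a differentiable rasterizer, and densification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GaussianCloud(nn.Module):
    """Learnable per-Gaussian parameters, initialised from a seed point cloud."""

    def __init__(self, seed_points: torch.Tensor):
        super().__init__()
        n = seed_points.shape[0]
        self.means = nn.Parameter(seed_points.clone())            # (n, 3) centres
        self.log_scales = nn.Parameter(torch.full((n, 3), -3.0))  # exp() keeps scales > 0
        quats = torch.zeros(n, 4)
        quats[:, 0] = 1.0                                         # identity rotation (w, x, y, z)
        self.quaternions = nn.Parameter(quats)
        self.opacity_logits = nn.Parameter(torch.zeros(n, 1))     # sigmoid() keeps opacity in (0, 1)
        self.colors = nn.Parameter(torch.rand(n, 3))              # view-independent RGB

    def activated(self) -> dict:
        """Return the parameters after their activation functions."""
        return {
            "means": self.means,
            "scales": torch.exp(self.log_scales),
            "rotations": F.normalize(self.quaternions, dim=-1),
            "opacities": torch.sigmoid(self.opacity_logits),
            "colors": self.colors,
        }

# 10k Gaussians seeded from random points (an SfM point cloud in practice).
gaussians = GaussianCloud(torch.randn(10_000, 3))
print({name: tuple(p.shape) for name, p in gaussians.activated().items()})
```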
NOTE: Advanced techniques in 3DGS, such as splitting and deleting 3D Gaussians, are not yet implemented. The code in `Deep_Dive_into_3D_Gaussian_Splatting/` has been tested only on an NVIDIA L4 GPU with 24 GB of memory.

All the results can be reproduced by following the instructions below.
- NVIDIA driver, CUDA Toolkit, and cuDNN libraries: The system must have a recent NVIDIA driver, CUDA Toolkit, and cuDNN installed; they are prerequisites for PyTorch and PyTorch3D. The following versions have been tested:
  - NVIDIA driver: 530.30.02
  - CUDA Toolkit: 12.1
  - cuDNN: 8.9.7

  If you encounter any problems while installing or updating them, you can consult this guide.
- Python 3: Python 3.10 is used throughout for all the problems.
- Create and activate a Python virtual environment named `venv_3d_cv`, and update `pip`, `setuptools`, and `wheel`:

  ```bash
  python3.10 -m venv venv_3d_cv \
    && source venv_3d_cv/bin/activate \
    && python3 -m pip install --upgrade pip setuptools wheel
  ```
- Install the required general Python packages:

  ```bash
  python3 -m pip install -r requirements.txt
  ```
- Install the required NVIDIA Python packages:

  ```bash
  python3 -m pip install nvidia-pyindex \
    && python3 -m pip install -r nvidia_requirements.txt
  ```
- Install `python3-dev`:

  ```bash
  sudo apt install python3-dev
  ```

- The `~/.bashrc` has the following lines:

  ```bash
  # To Export or Not To Export LD_LIBRARY_PATH. Make Python find in venv*
  export PATH=/usr/local/cuda-12.1/bin:$PATH
  # export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64:$LD_LIBRARY_PATH
  ```
- Please visit the PyTorch official website to find the command to use for your system (CUDA 12.1):

  ```bash
  python3 -m pip install torch torchvision torchaudio
  ```
- The installation guide of PyTorch3D can be found here (a quick sanity check of the full setup is shown right after this list):

  ```bash
  python3 -m pip install fvcore iopath \
    && python3 -m pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"
  ```
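Once the steps above have finished, a short check like the following (a sketch; the printed version strings will differ on your system) confirms that the CUDA-enabled PyTorch build and PyTorch3D import correctly:

```python
import torch
import pytorch3d

print("torch:", torch.__version__, "| built with CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
print("pytorch3d:", pytorch3d.__version__)
```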
Now you are ready to go to each folder and run the Python scripts and Jupyter notebooks. Please remember to select the Jupyter kernel created from your virtual environment `venv_3d_cv`.
This repository is a compilation of materials gathered from various online sources, each cited to acknowledge its origin.