Issues #122, #14, #69, and #121 report that the load_and_transform_depth function is not implemented.
I am raising this issue to implement the following data preprocessing steps in a PR, as they yield the reported 35% zero-shot classification accuracy for SUN-RGBD depth-only.
Important details for the scene classification task for SUNRGBD:
Scene subset:
The classification task only considers the following classes: SCENES = ['bathroom', 'bedroom', 'classroom', 'computer_room', 'conference_room', 'corridor', 'dining_area', 'dining_room', 'discussion_area', 'furniture_store', 'home_office', 'kitchen', 'lab', 'lecture_theatre', 'library', 'living_room', 'office', 'rest_space', 'study_space']
To reproduce the SUNRGBD results, one has to convert the raw depth data to standardized disparity using the following steps:
Convert raw depth (uint16) to meters following read3dPoints.m from the official SUN RGBD Toolbox.
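For reference, a NumPy sketch of that decoding step, based on my reading of read3dPoints.m (the function name is mine; the toolbox undoes a 3-bit circular shift, divides by 1000, and clips depths beyond 8 m):

```python
import numpy as np

def sunrgbd_raw_to_meters(raw: np.ndarray) -> np.ndarray:
    """Decode a SUN RGB-D uint16 depth map to meters.

    Mirrors read3dPoints.m:
    bitshift(depthVis, -3) | bitshift(depthVis, 16 - 3), then /1000,
    with depths beyond 8 m clipped to 8 m.
    """
    assert raw.dtype == np.uint16
    # uint16 arithmetic wraps, so this is a circular rotate right by 3 bits
    shifted = (raw >> np.uint16(3)) | (raw << np.uint16(13))
    depth = shifted.astype(np.float32) / 1000.0
    depth[depth > 8.0] = 8.0  # far / invalid readings are clipped in the toolbox
    return depth
```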
Convert depth to disparity using the correct camera intrinsics, following the response of @imisra, with a different baseline for each camera. The focal length for each sample can be obtained from the intrinsics.txt file.
from pathlib import Path  # optional, I just used pathlib

focal_path = Path(depth_file).parents[1] / "intrinsics.txt"
focal_length = float(focal_path.read_text().strip().split()[0])
baseline = get_baseline(depth_file)
disparity = baseline * focal_length / depth
def get_baseline(path: str) -> float:
    if "kv1" in path:
        return 0.075
    elif "kv2" in path:
        return 0.075
    elif "realsense" in path:
        return 0.095
    elif "xtion" in path:
        return 0.095  # guessed based on a length of 18 cm for the ASUS Xtion v1
    else:
        raise Exception(f"No baseline found for path: {path}")
Depth standardization: compute the mean and std of the disparity values across the training split. I found these values with the compute_depth_mean_std implementation from RGBD-Seg's dataset_base.py.
This yields the following values: mean 24.82968, std 14.40078.
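For reference, a simplified NumPy version of that statistic (my own sketch; like dataset_base.py it ignores zero/invalid pixels when accumulating):

```python
import numpy as np

def compute_disparity_mean_std(disparity_maps):
    """Two-pass mean/std over all valid (nonzero) pixels of the training split.

    Simplified, hypothetical stand-in for RGBD-Seg's compute_depth_mean_std.
    """
    pixel_sum = 0.0
    pixel_count = 0
    for d in disparity_maps:
        valid = d[d > 0]
        pixel_sum += float(valid.sum())
        pixel_count += valid.size
    mean = pixel_sum / pixel_count

    sq_sum = 0.0
    for d in disparity_maps:
        valid = d[d > 0]
        sq_sum += float(((valid - mean) ** 2).sum())
    std = float(np.sqrt(sq_sum / pixel_count))
    return mean, std
```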
These can be used to normalize (depending on the raw or refined mode) as follows (based on preprocessing.py Normalize):
if self._depth_mode == 'raw':
    depth_0 = depth == 0
    depth = torchvision.transforms.Normalize(
        mean=24.82968, std=14.40078)(depth)
    # set invalid values back to zero again
    depth[depth_0] = 0
else:
    depth = torchvision.transforms.Normalize(
        mean=24.82968, std=14.40078)(depth)
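Putting steps 2 and 3 together, a NumPy-only sketch of the disparity-plus-normalization transform (the helper name and signature are my own, and plain (x - mean) / std stands in for torchvision's Normalize):

```python
import numpy as np

def depth_to_normalized_disparity(depth_m, focal_length, baseline,
                                  mean=24.82968, std=14.40078,
                                  depth_mode="raw"):
    """Convert metric depth to standardized disparity.

    Hypothetical helper: zeros in depth_m mark invalid pixels; in 'raw'
    mode they are restored to zero after normalization, as above.
    """
    invalid = depth_m == 0
    disparity = np.zeros_like(depth_m, dtype=np.float32)
    disparity[~invalid] = baseline * focal_length / depth_m[~invalid]
    disparity = (disparity - mean) / std
    if depth_mode == "raw":
        disparity[invalid] = 0.0
    return disparity
```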
Evaluating over the test split using the above approach yields 35.2% depth-only accuracy.
TODO: Create a PR
OlafBraakman changed the title from "Implement load_and_transform_depth" to "Implement load_and_transform_depth_data" on Feb 4, 2025.