Bounding boxes as class labels #8002

nowtryz · 2024-08-07T21:11:33Z

Hi,

For my current project, I work with bounding boxes a lot and will keep doing so. I have boxes per class, and the boxes within a class may overlap.

I am still new to MONAI and medical imaging in general, so please feel free to correct me if some of my questions or ideas come from a misunderstanding of either the libraries or the image formats!

Context

To prepare data for training, it has to pass through a collection of transforms combined in a Compose object. Such a pipeline can be simplified like this:

Compose([
    LoadImaged(keys=["image"], ...),  # And somehow load 'box' as well
    Orientationd(keys=["image", "box"], axcodes="RAS"),
    Spacingd(keys=["image", "box"], ...),
    ScaleIntensityRanged(keys=["image"], ...),
    CropForegroundd(keys=["image", "box"], source_key="image", ...),
    SpatialPadd(keys=["image", "box"], ...),
    RandZoomd(keys=["image", "box"], ...),
    RandCropByPosNegLabeld(keys=["image", "box"], label_key="box", image_key="image", ...),
    RandRotate90d(keys=["image", "box"], ...),
    RandShiftIntensityd(keys=["image"], ...),
    # Then convert "box" to the desired format for the training
])

I tried a few workarounds that did not really work, then took a look at the existing box transforms I could find. They include RandZoomBoxd, RandCropBoxByPosNegLabeld and RandRotateBox90d, but there are no box counterparts for Orientationd, Spacingd, CropForegroundd, SpatialPadd or EnsureChannelFirstd.

Moreover, if I understand correctly, those boxes are custom made for detection, meaning I cannot store bounding boxes specific to classes, which is what I am looking for.
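For comparison, one possible workaround that keeps class information without a new type is to flatten the per-class boxes into a single box array plus a parallel label array, so that box i belongs to class labels[i]. As far as I can tell, this is the layout that detection transforms taking both box_keys and label_keys (RandCropBoxByPosNegLabeld, for example) operate on. The sketch below assumes each box is stored as [mins..., maxs...]; the helper names flatten_class_boxes and group_boxes_by_class are hypothetical and not part of MONAI.

```python
import numpy as np


def flatten_class_boxes(boxes_per_class, spatial_dims=3):
    """Turn {class_id: boxes of shape (N_c, 2*spatial_dims)} into (boxes, labels).

    Hypothetical helper: assumes at least one class entry is given.
    """
    boxes, labels = [], []
    for class_id, class_boxes in sorted(boxes_per_class.items()):
        # reshape keeps classes with zero boxes concatenable
        class_boxes = np.asarray(class_boxes, dtype=float).reshape(-1, 2 * spatial_dims)
        boxes.append(class_boxes)
        labels.append(np.full(len(class_boxes), class_id, dtype=np.int64))
    return np.concatenate(boxes, axis=0), np.concatenate(labels, axis=0)


def group_boxes_by_class(boxes, labels):
    """Inverse step: regroup a flat (boxes, labels) pair by class id."""
    return {int(c): boxes[labels == c] for c in np.unique(labels)}
```

Overlapping boxes inside one class are unaffected by this layout, since every box keeps its own row. Classes that end up with no box after a crop simply disappear from the regrouped dictionary, so the caller has to track the full class list separately.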

Feature request

**Would it be possible to make BoundingBox a first-class citizen, like in torchvision, but attached to classes?**

I am thinking of an n-dimensional tensor that could support resampling and changes of pixel dimensions (pixdim):

class BoundingBox(MetaObj):
    spatial_shape: torch.Size  # used to infer the size of the canvas
    boxes: torch.Tensor  # shape: classes x N boxes x (2 * number of spatial dims); may also include the batch dimension when relevant
    affine: torch.Tensor  # affine accessed through MetaObj
    def __torch_dispatch__(...): ...
    def __torch_function__(...): ...
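To make the resampling idea concrete, here is a minimal sketch of how boxes in such a classes x N x (2 * dim) layout could follow a change of affine (reorientation, new spacing). It uses NumPy rather than a MetaObj subclass, and it assumes each box is stored as [mins..., maxs...] in continuous coordinates. The function name apply_affine_to_boxes is hypothetical.

```python
import itertools

import numpy as np


def apply_affine_to_boxes(boxes, affine):
    """Map axis-aligned boxes through an affine and re-fit axis-aligned boxes.

    boxes:  array of shape (..., 2*dim), each row [mins..., maxs...] (assumed layout)
    affine: array of shape (dim+1, dim+1), homogeneous transform
    """
    boxes = np.asarray(boxes, dtype=float)
    affine = np.asarray(affine, dtype=float)
    dim = boxes.shape[-1] // 2
    mins, maxs = boxes[..., :dim], boxes[..., dim:]
    # Enumerate the 2**dim corners of every box: shape (..., 2**dim, dim).
    choices = np.array(list(itertools.product([0, 1], repeat=dim)), dtype=bool)
    corners = np.where(choices, maxs[..., None, :], mins[..., None, :])
    # Apply the linear part, then the translation, to every corner.
    moved = corners @ affine[:dim, :dim].T + affine[:dim, dim]
    # The axis-aligned hull of the moved corners is the new box.
    return np.concatenate([moved.min(axis=-2), moved.max(axis=-2)], axis=-1)
```

Because only the last axis is interpreted, the same function works with or without the leading class and batch dimensions. For rotations that are not multiples of 90 degrees the re-fitted box is larger than the original object, which is a known limitation of axis-aligned boxes.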

I have some ideas on how this tensor could be used (ordered from least feasible to best, in my opinion):

- Each transform that supports boxes could test whether the provided tensor is a BoundingBox instance and act accordingly.
- Add some resampling strategies directly to the MetaTensor class, which would help customize some MONAI behavior through inheritance.
- Use torch's dispatch mechanism to dispatch calls to MONAI's resampling functions. __torch_dispatch__ and __torch_function__ could then override those functions.
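The first option can be sketched in a few lines. The BoundingBox dataclass and the flip function below are hypothetical stand-ins written with NumPy, not MONAI classes. The sketch assumes channel-first images, boxes stored as [mins..., maxs...], and continuous coordinates in which a flipped coordinate x becomes size - x.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BoundingBox:
    """Hypothetical container, loosely following the proposal above."""

    boxes: np.ndarray      # (classes, N, 2*dim), each box [mins..., maxs...]
    spatial_shape: tuple   # size of the canvas the boxes live on


def flip(data, axis):
    """Flip along one spatial axis; `data` is an image (C, *spatial) or a BoundingBox."""
    if isinstance(data, BoundingBox):
        dim = len(data.spatial_shape)
        size = data.spatial_shape[axis]
        out = data.boxes.copy()
        # After a flip the old max becomes the new min and vice versa.
        out[..., axis] = size - data.boxes[..., axis + dim]
        out[..., axis + dim] = size - data.boxes[..., axis]
        return BoundingBox(out, data.spatial_shape)
    # Plain image: skip the channel axis.
    return np.flip(data, axis=axis + 1)
```

The cost of this design is that every box-aware transform needs its own isinstance branch, which is why the dispatch-based options further down the list may scale better.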

If not, and if all bounding-box-related computations must live in a separate subpackage, would it be worth implementing bounding box versions of the transforms listed earlier?

In that case, pixdim (and affine?) would still have to be carried along somehow for those transforms to work. Also, would it be possible to extract more of the machinery of the original transforms into separate functions/methods to avoid copies? Like here:

MONAI/monai/apps/detection/transforms/dictionary.py

Lines 601 to 613 in 59a7211

    
# zoom image, copied from monai.transforms.spatial.dictionary.RandZoomd
for key, mode, padding_mode, align_corners in zip(
    self.image_keys, self.mode, self.padding_mode, self.align_corners
):
    if self._do_transform:
        d[key] = self.rand_zoom(
            d[key], mode=mode, padding_mode=padding_mode, align_corners=align_corners, randomize=False
        )
    else:
        d[key] = convert_to_tensor(d[key], track_meta=get_track_meta())
    if get_track_meta():
        xform = self.pop_transform(d[key], check=False) if self._do_transform else {}
        self.push_transform(d[key], extra_info=xform)
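One way such an extraction could look is to move the image loop into a method that the box-aware transform inherits instead of copying. The sketch below is a toy illustration of that structure only. ZoomSketch and ZoomBoxSketch are hypothetical classes, the zoom is a nearest-neighbour upsampling by an integer factor about the origin, and none of it mirrors MONAI's actual RandZoomd logic (randomization, metadata tracking, keep_size).

```python
import numpy as np


class ZoomSketch:
    """Toy image-only dictionary transform (hypothetical, not MONAI's RandZoomd)."""

    def __init__(self, image_keys, zoom):
        self.image_keys = image_keys
        self.zoom = int(zoom)

    def zoom_images(self, d):
        # Shared machinery: lives in one method instead of a copied loop.
        for key in self.image_keys:
            img = np.asarray(d[key])
            for axis in range(1, img.ndim):  # skip the channel axis
                img = np.repeat(img, self.zoom, axis=axis)
            d[key] = img
        return d

    def __call__(self, data):
        return self.zoom_images(dict(data))


class ZoomBoxSketch(ZoomSketch):
    """Box-aware variant that reuses zoom_images and only adds the box update."""

    def __init__(self, image_keys, box_keys, zoom):
        super().__init__(image_keys, zoom)
        self.box_keys = box_keys

    def __call__(self, data):
        d = self.zoom_images(dict(data))
        for key in self.box_keys:
            # Box coordinates scale with the same factor as the image grid.
            d[key] = np.asarray(d[key], dtype=float) * self.zoom
        return d
```

With this split, a fix to the image path is made once and both transforms pick it up, which is the maintenance benefit the question above is after.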

If these features are relevant to MONAI, I would be glad to work on them in the next few months!

Otherwise, I will have to develop these bounding boxes for my project in a separate package, but that package may break with new MONAI releases, as a lot of the logic I need is embedded in MONAI's transforms and I would have to copy it.

Again, if my point of view comes from a misunderstanding of the library, feel free to correct me!

Replies: 0 comments
