Hi,
For my current project, I work with bounding boxes a lot and will keep doing so. I have boxes per class, and each class may have overlapping boxes.
I am still new to MONAI and medical images in general, so please feel free to correct me if some of my questions or ideas come from a misunderstanding of either the libraries or the image formats!
Context
To prepare data for training, it has to pass through a collection of transforms combined in a Compose object. Such a pipeline can be sketched like this:
Compose([
    LoadImaged(keys=["image"], ...),  # And somehow load 'box' as well
    Orientationd(keys=["image", "box"], axcodes="RAS"),
    Spacingd(keys=["image", "box"], ...),
    ScaleIntensityRanged(keys=["image"], ...),
    CropForegroundd(keys=["image", "box"], source_key="image", ...),
    SpatialPadd(keys=["image", "box"], ...),
    RandZoomd(keys=["image", "box"], ...),
    RandCropByPosNegLabeld(keys=["image", "box"], label_key="box", image_key="image", ...),
    RandRotate90d(keys=["image", "box"], ...),
    RandShiftIntensityd(keys=["image"], ...),
    # Then convert "box" to the desired format for the training
])
I tried some workarounds that did not really work, then took a look at the existing box utilities I found: monai.data.box_utils, monai.apps.detection.transforms.box_ops and monai.apps.detection.transforms.dictionary.
These include RandZoomBoxd, RandCropBoxByPosNegLabeld and RandRotateBox90d, but no box counterparts of Orientationd, Spacingd, CropForegroundd, SpatialPadd or EnsureChannelFirstd.
Moreover, if I understand correctly, those boxes are custom made for detection, meaning I cannot store bounding boxes specific to classes, which is what I am looking for.
Feature request
**Would it be possible to have BoundingBox as a first-class citizen, like in torchvision, but attached to classes?**
I am thinking of an n-dimensional tensor that could support resampling and changes of pixel dimensions (pixdim):
class BoundingBox(MetaObj):
    spatial_shape: torch.Size  # used to infer the size of the canvas
    boxes: torch.Tensor  # of size classes x N boxes x (dim of spatial shape * 2), and even include the batch dimension when relevant
    affine: torch.Tensor  # affine accessed through MetaObj

    def __torch_dispatch__(...): ...
    def __torch_function__(...): ...
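To make the idea more concrete, here is a minimal, dependency-free sketch of such a container. It is only an illustration under assumptions: it uses plain Python tuples instead of torch tensors, does not subclass MetaObj, stores boxes in voxel coordinates as (min..., max...) corners, and ignores corner-alignment details when resizing the canvas. The names ClassBoxes and resample_to_spacing are hypothetical and not part of MONAI.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (min_0, ..., min_{d-1}, max_0, ..., max_{d-1}) in voxel units
Box = Tuple[float, ...]


@dataclass
class ClassBoxes:
    """Hypothetical per-class box container (not part of MONAI)."""

    spatial_shape: Tuple[int, ...]  # size of the image canvas, in voxels
    spacing: Tuple[float, ...]      # pixdim: physical size of one voxel per axis
    boxes: List[List[Box]]          # boxes[c] holds the boxes of class c

    def resample_to_spacing(self, new_spacing: Tuple[float, ...]) -> "ClassBoxes":
        """Return a copy whose voxel coordinates match `new_spacing`."""
        dim = len(self.spatial_shape)
        # A physical length L spans L / spacing voxels, so coordinates scale by old / new.
        scale = [old / new for old, new in zip(self.spacing, new_spacing)]
        new_boxes = [
            [tuple(v * scale[i % dim] for i, v in enumerate(box)) for box in per_class]
            for per_class in self.boxes
        ]
        # Simplified canvas resize; real resampling code may round differently.
        new_shape = tuple(round(s * k) for s, k in zip(self.spatial_shape, scale))
        return ClassBoxes(new_shape, tuple(new_spacing), new_boxes)


# Two classes on a 100 x 100 canvas; class 0 has two overlapping boxes.
boxes = ClassBoxes(
    spatial_shape=(100, 100),
    spacing=(1.0, 2.0),
    boxes=[
        [(10.0, 10.0, 30.0, 30.0), (20.0, 20.0, 40.0, 40.0)],
        [(50.0, 5.0, 60.0, 15.0)],
    ],
)
resampled = boxes.resample_to_spacing((2.0, 1.0))
```

Keeping one list of boxes per class is what lets overlapping boxes of the same class coexist, which a single label map could not represent.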
I also have some ideas on how this tensor could be used (ordered from least feasible to best, in my opinion):
Each transform that supports boxes could test whether the provided tensor is a BoundingBox instance and act accordingly.
Add some resampling strategies directly to the MetaTensor class, which would help customize some MONAI behavior through inheritance.
Use torch's dispatch mechanism to dispatch calls to MONAI's resampling functions. Then __torch_dispatch__ and __torch_function__ could override those functions.
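The first option (type checks inside each transform) could look roughly like the sketch below. It is a toy, stdlib-only illustration: BoxList is a hypothetical stand-in for the proposed BoundingBox type, the image is a nested list rather than a tensor, and box coordinates are assumed to be continuous voxel-edge positions, so a flip maps a coordinate x to size - x.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class BoxList:
    """Hypothetical stand-in for the proposed BoundingBox type (2D, voxel units)."""

    spatial_shape: Tuple[int, int]
    boxes: List[Tuple[float, float, float, float]]  # (min_0, min_1, max_0, max_1)


Image = List[List[float]]  # a plain nested list standing in for an image tensor


def flip_axis0(data: Union[Image, BoxList]) -> Union[Image, BoxList]:
    """Flip along axis 0, choosing the implementation from the input type."""
    if isinstance(data, BoxList):
        size = data.spatial_shape[0]
        # Mirroring swaps the roles of min and max along the flipped axis.
        flipped = [
            (size - hi0, lo1, size - lo0, hi1)
            for (lo0, lo1, hi0, hi1) in data.boxes
        ]
        return BoxList(data.spatial_shape, flipped)
    return data[::-1]  # image path: reverse the rows


image = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
boxes = BoxList(spatial_shape=(3, 2), boxes=[(0.0, 0.0, 1.0, 2.0)])  # covers row 0

flipped_image = flip_axis0(image)
flipped_boxes = flip_axis0(boxes)  # the box now covers the last row
```

The drawback of this option is visible even in the toy version: every transform needs its own isinstance branch, which is why the dispatch-based options may scale better.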
If not, and if all bounding-box-related computations must live in a separate subpackage, would it be worth implementing bounding box versions of the transforms listed earlier?
In that case, pixdim (and the affine?) would still have to be carried along somehow for those transforms to work. Also, would it be possible to extract more of the machinery of the original transforms into separate functions/methods to avoid copies? Like here: MONAI/monai/apps/detection/transforms/dictionary.py, lines 601 to 613 in 59a7211.
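As an example of machinery that could live in a standalone function and be shared by several box transforms, the sketch below maps an axis-aligned box through a homogeneous affine by transforming all of its corners and re-fitting an axis-aligned box around them. This is a generic technique, not MONAI's implementation; the helper names apply_affine_to_point and apply_affine_to_box are hypothetical, and the affine is a plain nested list for the sake of a dependency-free example.

```python
from itertools import product
from typing import List, Sequence, Tuple


def apply_affine_to_point(
    affine: Sequence[Sequence[float]], point: Sequence[float]
) -> List[float]:
    """Map a d-dimensional point through a (d+1) x (d+1) homogeneous affine."""
    dim = len(point)
    return [
        sum(affine[r][c] * point[c] for c in range(dim)) + affine[r][dim]
        for r in range(dim)
    ]


def apply_affine_to_box(
    affine: Sequence[Sequence[float]], box: Sequence[float]
) -> Tuple[float, ...]:
    """Transform a box (min..., max...) and re-fit an axis-aligned box around it."""
    dim = len(box) // 2
    lo, hi = box[:dim], box[dim:]
    # Enumerate all 2**dim corners: flips or rotations can move any corner to an extreme.
    corners = [
        apply_affine_to_point(affine, [hi[a] if pick[a] else lo[a] for a in range(dim)])
        for pick in product((0, 1), repeat=dim)
    ]
    new_lo = [min(c[a] for c in corners) for a in range(dim)]
    new_hi = [max(c[a] for c in corners) for a in range(dim)]
    return tuple(new_lo + new_hi)


# Example affine: 2 mm spacing on axis 0, a flipped axis 1 with 0.5 mm spacing, plus an offset.
affine = [
    [2.0, 0.0, 10.0],
    [0.0, -0.5, 20.0],
    [0.0, 0.0, 1.0],
]
world_box = apply_affine_to_box(affine, (1.0, 2.0, 3.0, 6.0))
```

Taking the min and max over all corners keeps the result a valid (min, max) box even when the affine flips an axis, as axis 1 does here.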
If these features are relevant to MONAI, I would be glad to work on them in the next few months!
Otherwise, I will have to develop these bounding boxes for my project in a separate package, but that package may break with new releases of MONAI, since a lot of the logic I would need is embedded in MONAI's transforms and I would have to copy it.
Again, if my point of view comes from a misunderstanding of the library, feel free to correct me!