
Instance Segmentation Mask/Bbox Relation #1784

Open
FrsECM opened this issue Jun 11, 2024 · 7 comments
Labels
bug Something isn't working

Comments

FrsECM commented Jun 11, 2024

Describe the bug

I'm working on an instance segmentation use case with torchvision. In this case, I have:

  • image
  • bboxes
  • masks
  • labels

I've created an augmentation pipeline with albumentations, something like this:
import albumentations as A
import cv2
from albumentations.pytorch import ToTensorV2

def augmentations(instance_item: dict):
    transform = A.Compose([
        A.LongestMaxSize(max_size=MAX_SIZE),
        A.PadIfNeeded(
                min_height=MIN_IMG_HEIGHT,
                min_width=MIN_IMG_WIDTH,
                border_mode=cv2.BORDER_CONSTANT,
                value=0,
                always_apply=True),
        A.HorizontalFlip(),
        A.RandomCrop(MIN_IMG_HEIGHT, MIN_IMG_WIDTH),
        A.ToFloat(max_value=255),
        ToTensorV2()
    ],
        bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels'], min_visibility=BBOX_MIN_VISIBILITY),
        is_check_shapes=False
    )
    output = transform(
        image=instance_item['image'],
        masks=instance_item['masks'],
        bboxes=instance_item['boxes'],
        labels=instance_item['labels'])
    return output

I would expect the visibility parameter to be applicable at the instance level: if a bbox is not visible enough, the corresponding mask should be removed as well.

I tried adding "masks" to the bbox_params:

A.BboxParams(format='pascal_voc',label_fields=['labels','masks'],min_visibility=BBOX_MIN_VISIBILITY),

but in that case the augmentations are not applied to the masks.

To Reproduce

To reproduce, you can use the code below:

import albumentations as A
import numpy as np

img = np.zeros((2048, 2048, 3), dtype=np.uint8)

# Bbox format: [x_min, y_min, x_max, y_max]
bboxes = [
    [800, 800, 1200, 1200],  # bbox inside the crop
    [1500, 1500, 1700, 1700] # bbox outside the crop
]
labels = [0,1]

masks = [
    np.zeros((2048, 2048), dtype=np.uint8),
    np.zeros((2048, 2048), dtype=np.uint8)
]
masks[0][800:1200, 800:1200] = 1
masks[1][1500:1700, 1500:1700] = 1

aug = A.Compose(
    [
        A.CenterCrop(1024, 1024)
    ],
    bbox_params=A.BboxParams(format='pascal_voc', label_fields=['labels'],min_visibility=0.3),
)

# Apply Transformation
augmented = aug(image=img, bboxes=bboxes, masks=masks, labels=labels)

# Get back results
print(len(augmented['bboxes']))
print(len(augmented['labels']))
print(len(augmented['masks']))
# Returns:
# 1
# 1
# 2  -> should be 1

Expected behavior

I would expect the visibility parameter to be applicable at the instance level: if a bbox is not visible enough, the corresponding mask should be removed as well.
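For illustration, the instance-level behavior I'd expect can be sketched in plain NumPy (`filter_instances` is a hypothetical helper, not part of albumentations; `min_visibility` mirrors the `A.BboxParams` parameter):

```python
import numpy as np

def filter_instances(bboxes, masks, labels, crop, min_visibility=0.3):
    """Drop bbox, mask and label TOGETHER when the bbox keeps less than
    min_visibility of its area inside the crop (x_min, y_min, x_max, y_max)."""
    cx0, cy0, cx1, cy1 = crop
    kept_bboxes, kept_masks, kept_labels = [], [], []
    for bbox, mask, label in zip(bboxes, masks, labels):
        x0, y0, x1, y1 = bbox
        inter_w = max(0, min(x1, cx1) - max(x0, cx0))
        inter_h = max(0, min(y1, cy1) - max(y0, cy0))
        visibility = (inter_w * inter_h) / ((x1 - x0) * (y1 - y0))
        if visibility >= min_visibility:
            # Shift the bbox into crop coordinates and crop the mask
            kept_bboxes.append([x0 - cx0, y0 - cy0, x1 - cx0, y1 - cy0])
            kept_masks.append(mask[cy0:cy1, cx0:cx1])
            kept_labels.append(label)
    return kept_bboxes, kept_masks, kept_labels

# Same data as the CenterCrop(1024, 1024) example; the crop keeps (512:1536, 512:1536)
masks = [np.zeros((2048, 2048), dtype=np.uint8) for _ in range(2)]
masks[0][800:1200, 800:1200] = 1
masks[1][1500:1700, 1500:1700] = 1
bboxes = [[800, 800, 1200, 1200], [1500, 1500, 1700, 1700]]
out_bboxes, out_masks, out_labels = filter_instances(
    bboxes, masks, [0, 1], (512, 512, 1536, 1536))
print(len(out_bboxes), len(out_masks), len(out_labels))  # 1 1 1 -- consistent
```

With this, the second instance (only ~3% visible after the crop) is dropped from all three lists at once instead of only from bboxes and labels.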

Actual behavior

If I do not add masks to label_fields, the outputs are inconsistent.
If I do add masks to label_fields, the augmentations are not applied to them.

FrsECM added the bug label on Jun 11, 2024

FrsECM commented Jun 11, 2024

A workaround I've found:

def iris2_training(segmentation_item:dict):
    transform = A.Compose([
        A.LongestMaxSize(max_size=MAX_SIZE),
        A.PadIfNeeded(
                min_height=MIN_IMG_HEIGHT,
                min_width=MIN_IMG_WIDTH,
                border_mode=cv2.BORDER_CONSTANT,
                value=0,
                always_apply=True),
        A.HorizontalFlip(),
        A.RandomCrop(MIN_IMG_HEIGHT,MIN_IMG_WIDTH),
        A.ToFloat(max_value=255),
        ToTensorV2()
    ],
        bbox_params=A.BboxParams(format='pascal_voc',label_fields=['labels','ids'],min_visibility=BBOX_MIN_VISIBILITY),
        is_check_shapes=False
    )
    output = transform(
        image=segmentation_item['image'],
        masks=segmentation_item['masks'],
        bboxes=segmentation_item['boxes'],
        labels=segmentation_item['labels'],
        ids=list(range(len(segmentation_item['labels'])))
    )

    return dict(
        image=output['image'],
        boxes=output['bboxes'],
        labels=output['labels'],
        masks=[output['masks'][i] for i in output['ids']],
        name=segmentation_item['name']
        )

But it should work without this workaround.
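Why the trick works (a toy illustration, no albumentations needed): every entry in label_fields is filtered in lockstep with the bboxes, so the surviving ids index directly into the untouched masks list.

```python
import numpy as np

# Three instances; pretend the bbox visibility filter dropped instance 1.
masks = [np.full((4, 4), k, dtype=np.uint8) for k in range(3)]
ids = [0, 1, 2]            # passed to albumentations as a label field
surviving_ids = [0, 2]     # what the label field looks like after filtering

# Reindex the (still complete) masks list by the surviving ids
kept_masks = [masks[i] for i in surviving_ids]
print([int(m[0, 0]) for m in kept_masks])  # [0, 2]
```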

ternaus (Collaborator) commented Jun 11, 2024

Thanks for the proposed solution!

Yep, we do have this issue: masks, boxes, and keypoints are not bound at the instance level.

#1716

Your approach is the best that I have seen so far for this problem.

simonebonato commented Jul 4, 2024

I also came here with the same issue.

Thanks for the workaround, people :) Although one would expect a library like this to support that by default, or at least offer the option to add it.

ternaus (Collaborator) commented Jul 4, 2024

@simonebonato how much would you be willing to donate to help make this happen?

https://github.com/sponsors/albumentations-team

simonebonato commented:

I can maybe try to solve it myself if I have time.
I suppose the code is already there since it's already working with the labels.


FrsECM commented Jul 4, 2024

Just to keep you aware: if you are doing instance segmentation, you should be careful about using the original bboxes.


Now I just apply the augmentations to the masks and then recompute the bbox coordinates. To me this makes more sense, because this way the bbox matches the final augmented mask.

To do this, you can use pycocotools.

import pycocotools.mask as mask_utils
import numpy as np

def mask_to_bbox(mask: np.ndarray) -> np.ndarray:
    """Convert a mask to a bbox.

    Useful when we apply augmentation on a mask and want a precise bbox
    corresponding to the transformed mask.

    Args:
        mask (np.ndarray): binary instance mask.
    """
    mask_rle = mask_utils.encode(np.asfortranarray(mask > 0))
    bbox_xywh = mask_utils.toBbox(mask_rle)
    # Convert xywh to xyxy
    bbox_xyxy = (bbox_xywh + np.array([0, 0, bbox_xywh[0], bbox_xywh[1]]))
    return bbox_xyxy

...
transform_output = transform(
        image=segmentation_item['image'],
        masks=segmentation_item['masks']
    )
transform_output['boxes'] = [mask_to_bbox(mask) for mask in transform_output['masks']]

ternaus (Collaborator) commented Jul 4, 2024

Yep, how to rotate bounding boxes so that they stay tight is an open question. Recomputing the boxes from the masks at the end was always the way to go.

We do have a function for it:

def bbox_from_mask(mask: np.ndarray) -> tuple[int, int, int, int]:
    """Create bounding box from binary mask (fast version)

    Args:
        mask (numpy.ndarray): binary mask.

    Returns:
        tuple: A bounding box tuple `(x_min, y_min, x_max, y_max)`.

    """
    rows = np.any(mask, axis=1)
    if not rows.any():
        return -1, -1, -1, -1
    cols = np.any(mask, axis=0)
    y_min, y_max = np.where(rows)[0][[0, -1]]
    x_min, x_max = np.where(cols)[0][[0, -1]]
    return x_min, y_min, x_max + 1, y_max + 1
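For completeness, a quick standalone check of this approach on the CenterCrop example from the issue (the helper is restated here, with results cast to plain ints, so the snippet runs on its own):

```python
import numpy as np

def bbox_from_mask(mask: np.ndarray) -> tuple:
    """Same logic as the helper above, restated so this sketch is standalone."""
    rows = np.any(mask, axis=1)
    if not rows.any():
        return -1, -1, -1, -1
    cols = np.any(mask, axis=0)
    y_min, y_max = np.where(rows)[0][[0, -1]]
    x_min, x_max = np.where(cols)[0][[0, -1]]
    return int(x_min), int(y_min), int(x_max) + 1, int(y_max) + 1

# Recompute a tight box from the cropped mask instead of transforming the bbox
mask = np.zeros((2048, 2048), dtype=np.uint8)
mask[800:1200, 800:1200] = 1
cropped = mask[512:1536, 512:1536]   # what CenterCrop(1024, 1024) keeps
print(bbox_from_mask(cropped))       # (288, 288, 688, 688)
```

The recomputed box is exactly the shifted original box here, but after rotations or other non-axis-aligned transforms it stays tight around the mask, which transformed boxes do not.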
