Features extracted with the SAM encoder achieve only around 20 mIoU on fold 0 of COCO-20i. The SAM encoder's features carry weak semantics, so it performs poorly in complex scenes. There are two reasons for this:
Poor feature matching: SAM's features fail to match multiple instances with similar semantics in complex scenes.
Poor semantic guidance: SAM cannot provide effective semantic guidance for ILM (Instance-Level Matching) to select high-quality mask proposals.
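To make the "feature matching" point concrete, here is a minimal sketch of prototype-based matching with cosine similarity, plus the IoU used to score the result. It is not the repository's code: the function names `cosine_similarity_map` and `iou`, the toy 4x4 feature maps, and the 0.5 threshold are all assumptions for illustration. With semantically strong features, foreground pixels in the target score high against the reference prototype; with weak semantics, the similarity map would not separate same-class instances from the background this cleanly.

```python
import numpy as np

def cosine_similarity_map(ref_feats, ref_mask, tgt_feats):
    """Match target pixels against a prototype pooled from the reference mask.

    ref_feats, tgt_feats: (H, W, C) feature maps (hypothetical encoder outputs).
    ref_mask: (H, W) boolean mask of the reference object.
    Returns an (H, W) map of cosine similarities.
    """
    # Masked average pooling: one prototype vector for the reference object.
    proto = ref_feats[ref_mask].mean(axis=0)
    proto = proto / (np.linalg.norm(proto) + 1e-8)
    # L2-normalise each target pixel's feature vector.
    tgt = tgt_feats / (np.linalg.norm(tgt_feats, axis=-1, keepdims=True) + 1e-8)
    return tgt @ proto

def iou(pred, gt):
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0

# Toy data (assumption): foreground and background have orthogonal features.
C = 8
fg = np.zeros(C); fg[0] = 1.0
bg = np.zeros(C); bg[1] = 1.0
ref_mask = np.zeros((4, 4), dtype=bool); ref_mask[:2, :2] = True
tgt_mask = np.zeros((4, 4), dtype=bool); tgt_mask[2:, 1:3] = True
ref_feats = np.where(ref_mask[..., None], fg, bg)
tgt_feats = np.where(tgt_mask[..., None], fg, bg)

sim = cosine_similarity_map(ref_feats, ref_mask, tgt_feats)
pred = sim > 0.5  # illustrative threshold, not a tuned value
print(iou(pred, tgt_mask))
```

In this idealised case the thresholded similarity map recovers the target mask exactly. Real encoder features are far noisier, which is where the two failure modes above come in.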
Have you tried using the SAM encoder directly to extract features, instead of using another pretrained model?