Error after running for a while #28
Comments
I ran into the same problem. Has the original poster solved it?
I also ran into the same problem. Is there a fix?
I suggest adding a breakpoint before this line of code: https://github.com/fudan-zvg/Semantic-Segment-Anything/blob/main/scripts/pipeline.py#L95 . Check the shapes of patch_huge and mask_categories, because the error message indicates a mismatch between the two. It is highly likely that the number of texts is 0 or too few. In such cases, additional preprocessing should be added before line 95 to handle this specific situation.
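As a rough illustration of the kind of preprocessing meant here, the sketch below normalizes the class names before they reach the segmentation call. It assumes the class names arrive as either a single string or a list of strings; `normalize_class_list` and `has_usable_classes` are hypothetical helpers, not functions from the repository, and skipping a patch with no usable classes is only one possible policy.

```python
def normalize_class_list(class_list):
    """Return class_list as a list of non-empty strings.

    Hypothetical helper: assumes the input is None, a single string,
    or an iterable of strings.
    """
    if class_list is None:
        return []
    # A single class name may arrive as a bare string; wrap it in a list.
    if isinstance(class_list, str):
        class_list = [class_list]
    # Drop non-string and blank entries, and trim surrounding whitespace.
    return [c.strip() for c in class_list if isinstance(c, str) and c.strip()]


def has_usable_classes(class_list):
    """True if at least one class name remains after normalization."""
    return len(normalize_class_list(class_list)) > 0
```

A caller could check `has_usable_classes(...)` first and skip (or assign a default label to) any patch for which it returns False, then pass the normalized list on to the segmentation function.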
I have encountered the same issue. The problem occurs in the clipseg_segmentation function in the Semantic-Segment-Anything/scripts/clipseg.py file. When class_list contains only one class, it appears as a string instead of a list. In addition, with only a single class the logits returned by clipseg_model have shape (H, W), which causes a dimension mismatch when F.interpolate is used for rescaling. Therefore, some modifications are needed. Here is the modified code:

    import torch.nn.functional as F

    def clipseg_segmentation(image, class_list, clipseg_processor, clipseg_model, rank):
        # A single class may arrive as a bare string; wrap it in a list.
        if isinstance(class_list, str):
            class_list = [class_list]
        inputs = clipseg_processor(
            text=class_list, images=[image] * len(class_list),
            padding=True, return_tensors="pt").to(rank)
        # Remember the processor's output size, then resize pixel_values to a fixed 512x512.
        h, w = inputs['pixel_values'].shape[-2:]
        fixed_scale = (512, 512)
        inputs['pixel_values'] = F.interpolate(
            inputs['pixel_values'],
            size=fixed_scale,
            mode='bilinear',
            align_corners=False)
        outputs = clipseg_model(**inputs)
        logits = outputs.logits
        # With a single class the logits are (H, W); restore the class dimension.
        if logits.dim() == 2:
            logits = logits[None]
        # Bilinear interpolation needs a 4D input, so add and then drop a batch dimension.
        logits = F.interpolate(logits[None], size=(h, w), mode='bilinear', align_corners=False)[0]
        return logits

This modification converts class_list from a string to a list when it is of type str. It also keeps the dimensions consistent for F.interpolate by handling the case where clipseg_model's logits have shape (H, W).
It processes more than 100 images successfully, and then stops with this error.
The command I used is as follows