Request for Python version of connected_components and Performance Comparison Inquiry #243

Open
Spark001 opened this issue Aug 20, 2024 · 2 comments

@Spark001

Hi @ronghanghu

In previous issues, many people have reported installation problems with _C, such as #53, #59, and #22. I have run into a similar issue:

UserWarning: cannot import name '_C' from 'sam2' 

Skipping the post-processing step due to the error above. 
You can still use SAM 2 and it's OK to ignore the error above, 
although some post-processing functionality may be limited 
(which doesn't affect the results in most cases; see 
https://github.com/facebookresearch/segment-anything-2/blob/main/INSTALL.md).

Although it can run ignoring this warning, I still want to understand the impact of this post-processing on the results.

Looking at the files in csrc, I found a function called connected_components that resembles the union-find algorithm.
Could you provide a pure Python version of this function? This would be more installation-friendly, especially for people who cannot easily update their CUDA driver :).

Or is there a significant difference in time consumption between the CUDA-based implementation and the Python-based implementation? How much exactly is it? In which scenarios would there be a larger difference?

Looking forward to your reply, thank you very much.

@ronghanghu
Contributor

ronghanghu commented Aug 20, 2024

Hi @Spark001, thanks for your interest in SAM 2. Regarding your questions:

Although it can run ignoring this warning, I still want to understand the impact of this post-processing on the results.

In most scenarios, this post-processing step doesn't make a notable difference. It is intended to cover the (relatively rare) case of removing (filling) some small holes in the output masks.
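To make "filling small holes" concrete, here is a minimal pure-Python sketch of the idea. It is not the SAM 2 implementation: the function name `fill_small_holes`, the `max_area` parameter, the list-of-lists mask format, and the use of 8-connectivity are all assumptions made for illustration.

```python
from collections import deque


def fill_small_holes(mask, max_area):
    """Return a copy of a binary mask with small background regions set to 1.

    mask is a list of lists of 0/1 values. A "hole" here is any 8-connected
    background region with at most max_area pixels (an illustrative definition).
    """
    h, w = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] or seen[y][x]:
                continue
            # Breadth-first search to collect one background component.
            component = [(y, x)]
            seen[y][x] = True
            queue = deque(component)
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and not mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                            component.append((ny, nx))
            # Only components small enough to count as holes are filled.
            if len(component) <= max_area:
                for cy, cx in component:
                    out[cy][cx] = 1
    return out
```

Looping over pixels in Python like this is far too slow for real frames; the sketch only shows what the step does to a mask.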

Could you provide a pure Python version of this function?

For a pure Python version, the function is mostly similar to cv2.connectedComponentsWithStats in OpenCV (see https://www.geeksforgeeks.org/python-opencv-connected-component-labeling-and-analysis/) or skimage.measure.label in Scikit-Image (see https://scikit-image.org/docs/stable/api/skimage.measure.html#skimage.measure.label). For the latter, there is a community implementation in #216 that you could try :)

Or is there a significant difference in time consumption between the CUDA-based implementation and the Python-based implementation? How much exactly is it? In which scenarios would there be a larger difference?

Yes, running this post-processing step is usually much slower on CPUs than using the CUDA-based implementation, and it could be a major overhead in video applications. So it's recommended to either compile the CUDA kernel above for connected components or just skip the step, rather than running a CPU version of this op.

@Spark001
Author

@ronghanghu Thanks for your quick reply 👍

I will try to find a way to compile it.
