Gauss kernel initialization: unknown error #55
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I sometimes get a similar error on an MBP when other programs that use the GPU are running (typically Chrome). Closing those programs makes the system switch to the integrated graphics card, and then I am able to use PopSift. Probably a memory size limit?
I get the same error:
According to nvidia-smi, only 600 MiB out of 8000 MiB on the GPU are occupied by other processes.
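To help rule out the low-memory theory, free and total device memory can also be queried directly from the CUDA runtime rather than via nvidia-smi. A minimal sketch (file and variable names are mine, not from PopSift):

```cpp
// check_mem.cu -- print free/total device memory as seen by the CUDA runtime
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    size_t freeBytes = 0, totalBytes = 0;
    cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("free: %zu MiB, total: %zu MiB\n", freeBytes >> 20, totalBytes >> 20);
    return 0;
}
```

Built with `nvcc check_mem.cu -o check_mem` and run alongside the other GPU programs, this shows whether the runtime itself still sees enough memory as available.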
Thanks for the report.
Just a few more data points. Out of curiosity, I commented out that memory copy. This led to a similar error when uploading the SIFT constants:
Commenting this one out, I hit the next one at:
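For context, the failing step is a constant-memory upload. An error-checked version of such an upload looks roughly like the sketch below; the symbol and function names are made up for illustration and are not PopSift's actual code.

```cpp
// upload_constants.cu -- hedged sketch of an error-checked constant-memory upload
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical __constant__ table standing in for the Gauss filter coefficients.
__constant__ float d_gauss_filter[32];

bool uploadGaussTable(const float* h_table, size_t count)
{
    cudaError_t err = cudaMemcpyToSymbol(d_gauss_filter, h_table, count * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "Gauss kernel initialization: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    float h_table[32] = { 1.0f };
    return uploadGaussTable(h_table, 32) ? 0 : 1;
}
```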
Huh. You are getting different error messages on the K4000 and the RTX 2070. That's weird. Could you try to move the failing cudaMemcpyToSymbol call to a different point in the initialization? Another possibility, but I wouldn't know why that should happen if your system has only one CUDA card, is that cudaMemcpyToSymbol cannot figure out which card you are trying to use. The constant memory should exist on all CUDA cards anyway. That could be tested by adding a call to cudaSetDevice before the upload.
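Assuming the suggested call is cudaSetDevice, a minimal way to run that test could look like the following sketch; device 0 and the dummy constant symbol are my choices, not anything from PopSift:

```cpp
// set_device_test.cu -- select the device explicitly before the first constant upload
#include <cuda_runtime.h>
#include <cstdio>

__constant__ int d_dummy;  // stand-in constant symbol

int main()
{
    // Bind this host thread to card 0 before any constant-memory traffic.
    cudaError_t err = cudaSetDevice(0);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaSetDevice: %s\n", cudaGetErrorString(err));
        return 1;
    }
    int value = 42;
    err = cudaMemcpyToSymbol(d_dummy, &value, sizeof(value));
    printf("cudaMemcpyToSymbol after cudaSetDevice(0): %s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```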
The amount of constant memory on a CUDA card is quite limited, but all documentation insists that this is because the constant cache size is limited. Do you have any hints on how I can recreate the error (on Linux)?
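For reference, the constant-memory size a card exposes can be read from its device properties; a quick sketch:

```cpp
// const_mem_size.cu -- report the constant-memory size of device 0
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed\n");
        return 1;
    }
    printf("%s: %zu KiB of constant memory\n", prop.name, prop.totalConstMem >> 10);
    return 0;
}
```

On current cards this typically reports 64 KiB.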
Hi, thanks for your answer. I've tried both of your suggestions, but the error remains. One thing I did not mention before: I am running PopSift in a Docker container. This morning I tried to build and run it on the host system directly, and there were no issues.
Unfortunately, the Docker image I use is proprietary, so I cannot share it. Instead, I've tried to create a minimal image based on a stock CUDA base image.
Is it possible that your main Docker container uses a different CUDA SDK than the host machine, but your test container uses the same SDK as the host? Since late CUDA 10, NVIDIA has been trying to do something about the compatibility hassle (as described here: https://docs.nvidia.com/deploy/cuda-compatibility/index.html), but I have not looked at those compatibility libraries at all.
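One quick way to spot such a mismatch from inside a container is to compare the CUDA version the driver supports with the runtime version the binary was built against; a small sketch:

```cpp
// version_check.cu -- compare driver-supported CUDA version with the runtime in use
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    int driverVersion = 0, runtimeVersion = 0;
    cudaDriverGetVersion(&driverVersion);    // highest CUDA version the installed driver supports
    cudaRuntimeGetVersion(&runtimeVersion);  // CUDA runtime the binary was built against
    printf("driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
           driverVersion / 1000, (driverVersion % 1000) / 10,
           runtimeVersion / 1000, (runtimeVersion % 1000) / 10);
    return 0;
}
```

If the runtime is newer than what the driver supports, initialization calls typically fail (usually with an "insufficient driver" error).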
My "main" container is based on the same CUDA 10 base image family; the CUDA version reported inside it does not exactly match the installed SDK, but as far as I understand, this is because the driver is tied to a certain CUDA version regardless of which CUDA SDK is actually installed. Also, according to the info on the page you posted, this driver is compatible with all 10.x versions. I'm currently trying to "bisect" the layers of the "main" container to find the one that introduces the problem.
I found the cause, and it (seemingly) has nothing to do with CUDA and/or Docker. In my "main" container the libraries are linked with LLVM's lld linker instead of the default linker, and that is what triggers the error. In case you want to reproduce this and check what's going on, simply install lld and add the following to the CMake configuration:
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -fuse-ld=lld")
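A self-contained reproducer along these lines might be a single __constant__ symbol plus one upload, built once with the default linker and once with -fuse-ld=lld. This is my sketch of the idea, not the reporter's exact test case:

```cpp
// lld_repro.cu -- minimal constant-memory upload to compare the default linker vs lld
#include <cuda_runtime.h>
#include <cstdio>

__constant__ float d_table[16];

int main()
{
    float h_table[16] = { 0.0f };
    cudaError_t err = cudaMemcpyToSymbol(d_table, h_table, sizeof(h_table));
    printf("cudaMemcpyToSymbol: %s\n", cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

To mirror the report more closely, the CUDA code would go into a shared library linked with -fuse-ld=lld, with the main program linking against that library.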
Does anyone have any idea where this error comes from?