pip install fails with "No CUDA runtime is found" despite existing CUDA toolkit installation #90

Closed
mmbannert opened this issue Feb 3, 2023 · 1 comment

@mmbannert

Hi,

I have been struggling for a while to install "spatial-correlation-sampler". I am installing it with pip inside a Docker image that is based on an Ubuntu CUDA image from NVIDIA, whose CUDA version matches the one on the host. import torch works as expected, torch.version.cuda returns the correct version, and torch.cuda.is_available() returns True.

Why does the build report "No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'"? See below for the entire Docker output.

Typing which nvcc in the container gives me /usr/local/cuda/bin/nvcc. This is why I expect that location to be correct.
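
One way to narrow this down is to print what the build process itself can see, since the environment of a build step can differ from that of a started container. The sketch below is a hypothetical helper (describe_build_env is not part of torch or pip). It relies only on shutil.which, os.environ and, if torch is importable, torch.cuda.is_available() and torch.cuda.device_count().

```python
import os
import shutil


def describe_build_env():
    """Collect the settings that a CUDA extension build typically depends on.

    Hypothetical helper for illustration; not part of torch or pip.
    """
    info = {
        "nvcc": shutil.which("nvcc"),              # None if nvcc is not on PATH
        "CUDA_HOME": os.environ.get("CUDA_HOME"),  # None if the variable is unset
        "TORCH_CUDA_ARCH_LIST": os.environ.get("TORCH_CUDA_ARCH_LIST"),
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; report that and stop.
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["torch.version.cuda"] = torch.version.cuda
    info["cuda_available"] = torch.cuda.is_available()
    info["device_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in describe_build_env().items():
        print(f"{key}: {value}")
```

Running this once from a RUN step in the Dockerfile and once in a started container should show whether a GPU is visible in both situations or only in the latter.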

What am I missing?

Thanks,
Michael

PS:
Python 3.10.9
torch 1.13.1+cu116
gcc 9.4.0

[+] Building 103.1s (15/17)                                                                             
 => [internal] load build definition from Dockerfile                                               0.0s
 => => transferring dockerfile: 3.44kB                                                             0.0s
 => [internal] load .dockerignore                                                                  0.0s
 => => transferring context: 2B                                                                    0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.6.0-devel-ubuntu20.04                    0.4s
 => [ 1/14] FROM docker.io/nvidia/cuda:11.6.0-devel-ubuntu20.04@sha256:989dad82ff8417dcb1d70f6329  0.0s
 => CACHED [ 2/14] RUN ln -snf /usr/share/zoneinfo/Europe/Berlin /etc/localtime && echo Europe/Be  0.0s
 => CACHED [ 3/14] RUN apt-get update &&     apt-get install -yq tzdata &&     ln -fs /usr/share/  0.0s
 => CACHED [ 4/14] RUN apt-get update && apt-get install -y zip unzip tree wget build-essential n  0.0s
 => CACHED [ 5/14] RUN apt-get install -y ffmpeg libsm6 libxext6                                   0.0s
 => CACHED [ 6/14] RUN wget --quiet https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x  0.0s
 => CACHED [ 7/14] RUN conda create -n cv jupyter                                                  0.0s
 => [ 8/14] RUN echo "source activate cv" > ~/.bashrc                                              0.5s
 => [ 9/14] RUN pip3 install torch torchvision torchaudio --extra-index-url https://download.pyt  83.2s
 => [10/14] RUN pip3 install pytorch-lightning torchmetrics tensorboardX                           6.8s
 => [11/14] RUN pip3 install wheel                                                                 1.8s 
 => ERROR [12/14] RUN pip3 install spatial-correlation-sampler                                    10.2s 
------                                                                                                  
 > [12/14] RUN pip3 install spatial-correlation-sampler:                                                
#0 1.016 Collecting spatial-correlation-sampler                                                         
#0 1.088   Downloading spatial_correlation_sampler-0.4.0.tar.gz (9.3 kB)                                
#0 1.115   Preparing metadata (setup.py): started                                                       
#0 3.113   Preparing metadata (setup.py): finished with status 'done'                                   
#0 3.126 Requirement already satisfied: torch>=1.1 in /opt/conda/envs/cv/lib/python3.10/site-packages (from spatial-correlation-sampler) (1.13.1+cu116)
#0 3.127 Requirement already satisfied: numpy in /opt/conda/envs/cv/lib/python3.10/site-packages (from spatial-correlation-sampler) (1.24.1)
#0 3.134 Requirement already satisfied: typing-extensions in /opt/conda/envs/cv/lib/python3.10/site-packages (from torch>=1.1->spatial-correlation-sampler) (4.4.0)
#0 3.137 Building wheels for collected packages: spatial-correlation-sampler
#0 3.138   Building wheel for spatial-correlation-sampler (setup.py): started
#0 5.156   Building wheel for spatial-correlation-sampler (setup.py): finished with status 'error'
#0 5.167   error: subprocess-exited-with-error
#0 5.167   
#0 5.167   × python setup.py bdist_wheel did not run successfully.
#0 5.167   │ exit code: 1
#0 5.167   ╰─> [69 lines of output]
#0 5.167       No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
#0 5.167       running bdist_wheel
#0 5.167       running build
#0 5.167       running build_py
#0 5.167       creating build
#0 5.167       creating build/lib.linux-x86_64-cpython-310
#0 5.167       creating build/lib.linux-x86_64-cpython-310/spatial_correlation_sampler
#0 5.167       copying Correlation_Module/spatial_correlation_sampler/__init__.py -> build/lib.linux-x86_64-cpython-310/spatial_correlation_sampler
#0 5.167       copying Correlation_Module/spatial_correlation_sampler/spatial_correlation_sampler.py -> build/lib.linux-x86_64-cpython-310/spatial_correlation_sampler
#0 5.167       running build_ext
#0 5.167       building 'spatial_correlation_sampler_backend' extension
#0 5.167       creating /tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/build/temp.linux-x86_64-cpython-310
#0 5.167       creating /tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/build/temp.linux-x86_64-cpython-310/Correlation_Module
#0 5.167       Traceback (most recent call last):
#0 5.167         File "<string>", line 2, in <module>
#0 5.167         File "<pip-setuptools-caller>", line 34, in <module>
#0 5.167         File "/tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/setup.py", line 57, in <module>
#0 5.167           launch_setup()
#0 5.167         File "/tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/setup.py", line 25, in launch_setup
#0 5.167           setup(
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/__init__.py", line 87, in setup
#0 5.167           return distutils.core.setup(**attrs)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
#0 5.167           return run_commands(dist)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
#0 5.167           dist.run_commands()
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
#0 5.167           self.run_command(cmd)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/dist.py", line 1208, in run_command
#0 5.167           super().run_command(command)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#0 5.167           cmd_obj.run()
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 299, in run
#0 5.167           self.run_command('build')
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#0 5.167           self.distribution.run_command(command)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/dist.py", line 1208, in run_command
#0 5.167           super().run_command(command)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#0 5.167           cmd_obj.run()
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
#0 5.167           self.run_command(cmd_name)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#0 5.167           self.distribution.run_command(command)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/dist.py", line 1208, in run_command
#0 5.167           super().run_command(command)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#0 5.167           cmd_obj.run()
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
#0 5.167           _build_ext.run(self)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
#0 5.167           self.build_extensions()
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
#0 5.167           build_ext.build_extensions(self)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 468, in build_extensions
#0 5.167           self._build_extensions_serial()
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 494, in _build_extensions_serial
#0 5.167           self.build_extension(ext)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
#0 5.167           _build_ext.build_extension(self, ext)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 549, in build_extension
#0 5.167           objects = self.compiler.compile(
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 649, in unix_wrap_ninja_compile
#0 5.167           cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 548, in unix_cuda_flags
#0 5.167           cflags + _get_cuda_arch_flags(cflags))
#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1780, in _get_cuda_arch_flags
#0 5.167           arch_list[-1] += '+PTX'
#0 5.167       IndexError: list index out of range
#0 5.167       [end of output]
#0 5.167   
#0 5.167   note: This error originates from a subprocess, and is likely not a problem with pip.
#0 5.167   ERROR: Failed building wheel for spatial-correlation-sampler
#0 5.168   Running setup.py clean for spatial-correlation-sampler
#0 7.166 Failed to build spatial-correlation-sampler
#0 7.948 Installing collected packages: spatial-correlation-sampler
#0 7.951   Running setup.py install for spatial-correlation-sampler: started
#0 9.974   Running setup.py install for spatial-correlation-sampler: finished with status 'error'
#0 9.995   error: subprocess-exited-with-error
#0 9.995   
#0 9.995   × Running setup.py install for spatial-correlation-sampler did not run successfully.
#0 9.995   │ exit code: 1
#0 9.995   ╰─> [73 lines of output]
#0 9.995       No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
#0 9.995       running install
#0 9.995       /opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
#0 9.995         warnings.warn(
#0 9.995       running build
#0 9.995       running build_py
#0 9.995       creating build
#0 9.995       creating build/lib.linux-x86_64-cpython-310
#0 9.995       creating build/lib.linux-x86_64-cpython-310/spatial_correlation_sampler
#0 9.995       copying Correlation_Module/spatial_correlation_sampler/__init__.py -> build/lib.linux-x86_64-cpython-310/spatial_correlation_sampler
#0 9.995       copying Correlation_Module/spatial_correlation_sampler/spatial_correlation_sampler.py -> build/lib.linux-x86_64-cpython-310/spatial_correlation_sampler
#0 9.995       running build_ext
#0 9.995       building 'spatial_correlation_sampler_backend' extension
#0 9.995       creating /tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/build/temp.linux-x86_64-cpython-310
#0 9.995       creating /tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/build/temp.linux-x86_64-cpython-310/Correlation_Module
#0 9.995       Traceback (most recent call last):
#0 9.995         File "<string>", line 2, in <module>
#0 9.995         File "<pip-setuptools-caller>", line 34, in <module>
#0 9.995         File "/tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/setup.py", line 57, in <module>
#0 9.995           launch_setup()
#0 9.995         File "/tmp/pip-install-kco1k98t/spatial-correlation-sampler_7a7941541f2e4ddb86d2ebcb50cd03ef/setup.py", line 25, in launch_setup
#0 9.995           setup(
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/__init__.py", line 87, in setup
#0 9.995           return distutils.core.setup(**attrs)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
#0 9.995           return run_commands(dist)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
#0 9.995           dist.run_commands()
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
#0 9.995           self.run_command(cmd)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/dist.py", line 1208, in run_command
#0 9.995           super().run_command(command)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#0 9.995           cmd_obj.run()
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/command/install.py", line 68, in run
#0 9.995           return orig.install.run(self)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/install.py", line 698, in run
#0 9.995           self.run_command('build')
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#0 9.995           self.distribution.run_command(command)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/dist.py", line 1208, in run_command
#0 9.995           super().run_command(command)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#0 9.995           cmd_obj.run()
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
#0 9.995           self.run_command(cmd_name)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
#0 9.995           self.distribution.run_command(command)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/dist.py", line 1208, in run_command
#0 9.995           super().run_command(command)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
#0 9.995           cmd_obj.run()
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
#0 9.995           _build_ext.run(self)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
#0 9.995           self.build_extensions()
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 843, in build_extensions
#0 9.995           build_ext.build_extensions(self)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 468, in build_extensions
#0 9.995           self._build_extensions_serial()
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 494, in _build_extensions_serial
#0 9.995           self.build_extension(ext)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
#0 9.995           _build_ext.build_extension(self, ext)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 549, in build_extension
#0 9.995           objects = self.compiler.compile(
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 649, in unix_wrap_ninja_compile
#0 9.995           cuda_post_cflags = unix_cuda_flags(cuda_post_cflags)
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 548, in unix_cuda_flags
#0 9.995           cflags + _get_cuda_arch_flags(cflags))
#0 9.995         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1780, in _get_cuda_arch_flags
#0 9.995           arch_list[-1] += '+PTX'
#0 9.995       IndexError: list index out of range
#0 9.995       [end of output]
#0 9.995   
#0 9.995   note: This error originates from a subprocess, and is likely not a problem with pip.
#0 10.00 error: legacy-install-failure
#0 10.00 
#0 10.00 × Encountered error while trying to install package.
#0 10.00 ╰─> spatial-correlation-sampler
#0 10.00 
#0 10.00 note: This is an issue with the package mentioned above, not pip.
#0 10.00 hint: See above for output from the failure.
------
failed to solve: executor failed running [/bin/sh -c pip3 install spatial-correlation-sampler]: exit code: 1
@mmbannert
Author

After thinking about this a bit more, it turned out that this was the critical error message:

#0 5.167         File "/opt/conda/envs/cv/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1780, in _get_cuda_arch_flags
#0 5.167           arch_list[-1] += '+PTX'
#0 5.167       IndexError: list index out of range
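
The IndexError is consistent with an empty architecture list: when TORCH_CUDA_ARCH_LIST is unset, the function named in the traceback appears to derive the list from the GPUs visible to the process, and none are visible if the build step runs without GPU access. The following is a simplified stand-in for that logic, not the actual torch source; pick_arch_list and its visible_capabilities parameter are made up for illustration.

```python
import os


def pick_arch_list(visible_capabilities, env=None):
    """Simplified stand-in for the arch selection (NOT the real torch code).

    visible_capabilities: list of (major, minor) tuples for visible GPUs.
    env: mapping to read TORCH_CUDA_ARCH_LIST from (defaults to os.environ).
    """
    env = os.environ if env is None else env
    requested = env.get("TORCH_CUDA_ARCH_LIST")
    if requested:
        # An explicit list wins, so no GPU has to be visible at build time.
        return requested.replace(" ", ";").split(";")
    arch_list = [f"{major}.{minor}" for major, minor in visible_capabilities]
    # With no visible GPU the list is empty and the next line raises
    # IndexError, matching the last frame of the traceback.
    arch_list[-1] += "+PTX"
    return arch_list
```

Calling pick_arch_list([], env={}) raises IndexError, while the same call with TORCH_CUDA_ARCH_LIST set returns the requested entries without looking at any device.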

So this and this comment helped me solve the problem.

To be more precise: I determined the compute capability (CC) of the graphics card I intend to use by calling torch.cuda.get_device_capability(), which returned (8, 6). I therefore added ENV TORCH_CUDA_ARCH_LIST 8.6+PTX to my Dockerfile, which made it possible to build the wheel for spatial-correlation-sampler.
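
The same setting can also be applied from Python when installing outside Docker. This is a minimal sketch, assuming the capability tuple comes from torch.cuda.get_device_capability() on the machine that will run the code; arch_env and install_with_arch are hypothetical names, not part of any library.

```python
import os
import subprocess
import sys


def arch_env(capability, base_env=None):
    """Return a copy of the environment with TORCH_CUDA_ARCH_LIST set.

    capability: (major, minor) tuple, e.g. (8, 6).
    base_env: mapping to copy (defaults to os.environ).
    """
    major, minor = capability
    env = dict(os.environ if base_env is None else base_env)
    env["TORCH_CUDA_ARCH_LIST"] = f"{major}.{minor}+PTX"
    return env


def install_with_arch(package, capability):
    """Run pip for `package` with the architecture list pinned (illustrative)."""
    cmd = [sys.executable, "-m", "pip", "install", package]
    return subprocess.run(cmd, env=arch_env(capability), check=True)
```

For the card described above, the call would be install_with_arch("spatial-correlation-sampler", (8, 6)).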

(I am also using the same image on a different, older system where the CC is 7.5, and the Docker image build including spatial-correlation-sampler worked there as well. So perhaps the exact CC does not matter much for the build.)
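
A likely reason the build succeeds on the older system is that compilation only needs some architecture list, not a GPU of that architecture. Whether the resulting binary runs there is a separate question: PTX is forward compatible only, so code built for 8.6+PTX is not expected to run on a CC 7.5 device. If one image should serve both machines, TORCH_CUDA_ARCH_LIST accepts several entries separated by semicolons. A sketch with a hypothetical helper (arch_list_value is an invented name):

```python
def arch_list_value(capabilities):
    """Build a TORCH_CUDA_ARCH_LIST value covering several GPUs.

    capabilities: iterable of (major, minor) tuples. Duplicates are dropped,
    entries are sorted, and +PTX is attached to the highest one only.
    """
    unique = sorted(set(capabilities))
    if not unique:
        raise ValueError("at least one compute capability is required")
    entries = [f"{major}.{minor}" for major, minor in unique]
    entries[-1] += "+PTX"
    return ";".join(entries)
```

For the two systems mentioned here, arch_list_value([(8, 6), (7, 5)]) produces the string 7.5;8.6+PTX, which could be used as the value of the ENV line in the Dockerfile.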
