Error "unable to create stream: the provided PTX was compiled with an unsupported toolchain" once installed other version of CUDA #5846
Comments
Sorry, we cannot update the driver on the host machine.
It's possible that there was an issue during the downgrade. In any case, we don't support this. You can find the 22.05 container requirements in the release notes here: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel_22-05.html#rel_22-05. CUDA 11.7.0 is supported; this is consistent with the CUDA version used in the 22.05 containers across all of NVIDIA's optimized framework containers, including upstream dependencies. What you're trying to do is unsupported. You'll need to find an older Triton version (via the linked resources) that supports CUDA 11.4, or build Triton yourself and work out the dependency changes needed to get it working on CUDA 11.4.
Closing issue due to inactivity. Please let us know if you would like this ticket reopened for follow-up.
Reference: clearml/clearml-serving#29 (comment)
Description
In Triton server 22.05 we want to use CUDA 11.4.
However, it throws an exception when loading a model:
UNAVAILABLE: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain.
Triton Information
NVIDIA Release 22.05
Triton Server Version 2.22.0
Host machine driver: 510.47.03
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
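Note that the `CUDA Version` field `nvidia-smi` prints is the highest CUDA version this driver's PTX JIT compiler supports, not the toolkit installed in the container. As a minimal sketch, both fields can be pulled out of the banner line; the `SMI_LINE` string below just reproduces the sample output above for illustration:

```shell
# Parse the driver version and the maximum supported CUDA version out of the
# nvidia-smi banner. SMI_LINE reproduces the sample output above; in a live
# container you would pipe `nvidia-smi` itself through the same sed commands.
SMI_LINE='| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |'
driver=$(printf '%s\n' "$SMI_LINE" | sed -n 's/.*Driver Version: *\([0-9.]*\).*/\1/p')
max_cuda=$(printf '%s\n' "$SMI_LINE" | sed -n 's/.*CUDA Version: *\([0-9.]*\).*/\1/p')
echo "driver=$driver max_cuda=$max_cuda"
```

This prints `driver=510.47.03 max_cuda=11.6`, i.e. the 510 driver tops out at CUDA 11.6, while the 22.05 container ships a CUDA 11.7 toolkit.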
Are you using the Triton container or did you build it yourself?
container
To Reproduce
1. Run the container nvcr.io/nvidia/tritonserver:22.05-py3.
2. Uninstall CUDA 11.7 and install CUDA 11.4 inside the container.
3. Load any model; it fails with:
UNAVAILABLE: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain.
We cannot update the driver version (510.47.03) on the host machine, so we are not sure whether the driver is the cause (web searches suggest a driver upgrade can resolve this error).
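One way to check whether the driver itself is the blocker is to compare the installed driver against the minimum Linux driver each CUDA toolkit requires. The lookup table below is an illustrative sketch based on the minimum-driver figures in NVIDIA's CUDA release notes; verify the exact numbers for your toolkit version before relying on them:

```shell
# Illustrative minimum Linux driver per CUDA toolkit (figures as listed in
# NVIDIA's CUDA release notes; double-check them for your exact toolkit).
min_driver_for_cuda() {
  case "$1" in
    11.4) echo "470.42.01" ;;
    11.6) echo "510.39.01" ;;
    11.7) echo "515.43.04" ;;
    *)    echo "unknown" ;;
  esac
}

# driver_supports <installed_driver> <cuda_toolkit>: succeeds if the installed
# driver is at least the minimum required for that toolkit.
driver_supports() {
  required=$(min_driver_for_cuda "$2")
  if [ "$required" = "unknown" ]; then return 2; fi
  # sort -V compares version strings numerically; the required minimum sorts
  # first exactly when the installed driver is new enough.
  [ "$(printf '%s\n%s\n' "$required" "$1" | sort -V | head -n1)" = "$required" ]
}

driver_supports 510.47.03 11.7 || echo "driver 510.47.03 is too old for CUDA 11.7"
driver_supports 510.47.03 11.4 && echo "driver 510.47.03 meets the CUDA 11.4 minimum"
```

Even where the driver check passes, swapping the toolkit inside a Triton container remains unsupported, since the server binaries were built against the CUDA version the container ships with.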
Expected behavior
The ability to switch the CUDA version inside the container and run models successfully.