
Error "unable to create stream: the provided PTX was compiled with an unsupported toolchain" once installed other version of CUDA #5846

Closed
oujiafan opened this issue May 24, 2023 · 5 comments

Comments


oujiafan commented May 24, 2023

Description
In Triton Server 22.05 we want to use CUDA 11.4.
However, it throws an exception when loading a model:
UNAVAILABLE: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain.

Triton Information
NVIDIA Release 22.05
Triton Server Version 2.22.0

Host machine driver: 510.47.03
NVIDIA-SMI 510.47.03   Driver Version: 510.47.03   CUDA Version: 11.6

Are you using the Triton container or did you build it yourself?
container

To Reproduce
Run docker nvcr.io/nvidia/tritonserver:22.05-py3.
Uninstall CUDA 11.7 and install CUDA 11.4 in the container
Run any model; the following error occurs:

UNAVAILABLE: Internal: unable to create stream: the provided PTX was compiled with an unsupported toolchain.
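For reference, the reproduction above amounts to roughly the following shell session. This is only a sketch: the model repository path is a placeholder, and the apt package names are assumptions that depend on how CUDA is actually packaged inside the container.

```sh
# Start the 22.05 container (built against CUDA 11.7).
docker run --gpus all -it --rm \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:22.05-py3 bash

# Inside the container: remove the bundled CUDA 11.7 and install 11.4.
# Package names are illustrative; the actual packages/layers may differ.
apt-get remove --purge -y cuda-toolkit-11-7
apt-get install -y cuda-toolkit-11-4

# Launch the server; loading any GPU model then fails with:
#   UNAVAILABLE: Internal: unable to create stream: the provided PTX was
#   compiled with an unsupported toolchain.
tritonserver --model-repository=/models
```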

We cannot update the driver version (510.47.03) on the host machine, so we're not sure whether this is caused by the driver (web searches indicated that a driver upgrade can resolve this kind of issue).

Expected behavior
Able to switch the CUDA version in the container and run models successfully.


songkq commented May 24, 2023

NVIDIA-SMI 525.105.17, Driver Version 525.105.17, CUDA Version 12.0 may be required.

@oujiafan (Author)

> NVIDIA-SMI 525.105.17, Driver Version 525.105.17, CUDA Version 12.0 may be required.

Sorry, we cannot update the driver on the host machine.
In addition, we don't understand why a lower CUDA version (11.4) would require upgrading to a higher driver version.

@dyastremsky (Contributor)

It's possible that there was an issue during the downgrade. In any case, we don't support this. You can find the 22.05 container requirements in the release notes here: https://docs.nvidia.com/deeplearning/triton-inference-server/release-notes/rel_22-05.html#rel_22-05. CUDA 11.7.0 is supported. This is consistent with the CUDA version supported across all of the NVIDIA optimized framework containers for the 22.05 release, including upstream dependencies.

You're trying to do something unsupported. You'll need to find an older Triton version (via the linked resources) that supports 11.4 or build it yourself and figure out the changes you need to make to dependencies to get it working on CUDA 11.4.
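As one possible way forward under these constraints, a minimal sketch of running an older Triton release instead of modifying the 22.05 container. The container tag below is a placeholder: pick the release listed against CUDA 11.4.x in the framework support matrix (https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html), and substitute your own model repository path.

```sh
# Confirm what the host driver reports (510.47.03 / CUDA 11.6 in this issue).
nvidia-smi

# Pull and run an older Triton release built against CUDA 11.4.x rather than
# downgrading CUDA inside the 22.05 container. <older-tag> is a placeholder.
docker pull nvcr.io/nvidia/tritonserver:<older-tag>-py3
docker run --rm --gpus all \
  -v /path/to/model_repository:/models \
  nvcr.io/nvidia/tritonserver:<older-tag>-py3 \
  tritonserver --model-repository=/models
```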

@krishung5 (Contributor)

Closing issue due to inactivity. Please let us know if you would like this ticket reopened for follow-up.

@oniondai

Reference: clearml/clearml-serving#29 (comment)
