
vllm could not be used because of CUDA kernel #21

Open
652994331 opened this issue Nov 18, 2023 · 1 comment

Comments

@652994331

Hi, I ran into a problem when trying to use vllm.
[Screenshot: error output, 2023-11-18 14:42]

My current PyTorch version is:
[Screenshot: PyTorch version, 2023-11-18 14:44]

My GPU machine has a P100, and the NVIDIA driver is 470.141. Could you please look into this problem? Thanks.

@sherdencooper
Owner

Hi, thanks for trying our code. In our experience, vllm can indeed be hard to install because it depends on xformers; see this post. I think you need to load the model as dtype half on the P100 with vllm. One of my servers with a similar environment that runs vllm successfully has the following packages: torch 2.0.1+cu118 and vllm==0.1.6. Another server with the latest vllm has torch 2.1.0+cu121 and vllm==0.2.0. You could try different CUDA versions (no need to upgrade the server's CUDA; it can be shipped with the PyTorch installation in the env) or different vllm versions. Note that we have found some differences between the outputs of vllm 0.1.6 and vllm 0.2.0. You could also try building from source if you hit installation or runtime issues.
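For reference, here is a minimal sketch of loading a model with vllm in half precision; the model path is a placeholder, so substitute the checkpoint you are actually using:

```python
from vllm import LLM, SamplingParams

# dtype="half" forces fp16, which the P100 needs since it lacks bf16 support.
# "path/to/your-model" is a placeholder, not a real checkpoint.
llm = LLM(model="path/to/your-model", dtype="half")

sampling_params = SamplingParams(temperature=0.8, max_tokens=128)
outputs = llm.generate(["Hello, my name is"], sampling_params)
print(outputs[0].outputs[0].text)
```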

If none of the above works, you could stick with Hugging Face, which is easier to install but slower. Please let us know if you still run into installation or runtime issues.
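For the Hugging Face fallback, a minimal sketch (again with a placeholder model path) that loads the model in fp16, matching the dtype-half advice above, could look like this:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path; use the same checkpoint you tried with vllm.
model_path = "path/to/your-model"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # fp16 for the P100
    device_map="auto",
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```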
