Hi, thanks for trying our code. In our experience, vllm can indeed be hard to install because it depends on xformers; see this post. On a P100 you need to load the model with dtype half in vllm. One of my servers with a similar environment that runs vllm successfully has torch 2.0.1+cu118 and vllm==0.1.6; another server with the latest vllm has torch 2.1.0+cu121 and vllm==0.2.0. You could try different CUDA versions (no need to upgrade the server's system CUDA; it can be shipped with the PyTorch installation in the env) or different vllm versions. Note that we have found some differences between the outputs of vllm 0.1.6 and vllm 0.2.0. You could also try building vllm from source if you hit installation or runtime issues.
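For reference, here is a minimal sketch of the dtype-half loading described above, using one of the version combinations that worked for us. The model name is a placeholder, not the actual checkpoint from this repo:

```python
# Install a torch/CUDA combination known to work with vllm 0.1.6
# (the CUDA runtime ships inside the pip wheel, so no system CUDA upgrade is needed):
#   pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118
#   pip install vllm==0.1.6
from vllm import LLM, SamplingParams

# "your-model-name" is a placeholder; substitute the checkpoint you are evaluating.
# dtype="half" forces fp16, which the P100 supports (it has no bfloat16 support).
llm = LLM(model="your-model-name", dtype="half")

params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["Hello, my name is"], params)
print(outputs[0].outputs[0].text)
```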
If none of the above solutions works, you could stick with Hugging Face, which is easier to install but slower. Please let us know if you still run into installation or runtime issues.
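If you go the Hugging Face route, a minimal fp16 fallback might look like the sketch below. Again the model name is a placeholder, and `device_map="auto"` assumes the accelerate package is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your-model-name"  # placeholder; substitute the checkpoint you are evaluating
tokenizer = AutoTokenizer.from_pretrained(model_name)
# fp16 keeps memory usage low on a P100; device_map="auto" requires accelerate
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```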
Hi, I ran into a problem when trying to use vllm.
My current PyTorch version is:
My GPU is a P100 and the NVIDIA driver is 470.141. Could you please look into this problem? Thanks!