[Hardware][Nvidia] Enable support for Pascal GPUs #4409
Conversation
@youkaichao
(From Release Tracker)
I've been using vLLM on my P40s every day for almost a month now, and everything works fine. Maybe the patched […]

P.S. Whoever is reading this, you might want to check out my project, which has pre-built […]
Does this mean I can't run vLLM on a Tesla P4, even a small model?
@AslanEZ I believe the P4 has a compute capability of 6.1, and this PR adds support for that. Have you tested?
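For reference, a quick way to verify a card's compute capability is PyTorch's standard device-query helpers; a minimal sketch:

```python
import torch

# Report the compute capability of every visible CUDA device.
# A Tesla P4 should show sm_61; a P100 shows sm_60.
for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f"{torch.cuda.get_device_name(i)}: sm_{major}{minor}")
```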
I have tested it by installing with pip. It didn't work. I intend to try your code now.
Oh, it works! Thank you!
Could we get an update on the status of this PR? I've been eagerly awaiting it, as I can't use vLLM until it supports my hardware.
@dirkson it was answered here #6434 (comment)
This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!
Not stale. Also, this PR only increases the wheel size by 10 MB, so please consider merging it.
Wanted to express interest in Pascal support too. Thank you all for your work on these projects.
[Hardware][Nvidia] Enable support for Pascal GPUs (sm_60, sm_61)
FIX: #963 #1284
Related: #4290 #2635
---
This is a new PR, opened as a placeholder in the hope that the request to raise the wheel-size limit above 100 MB is someday granted. It only adds compute capabilities 6.0 and 6.1. Note: PyTorch itself now only ships support for sm_60.
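One way to check whether an installed PyTorch build (and thus a vLLM wheel compiled against it) includes Pascal kernels is PyTorch's arch-list query; a minimal sketch:

```python
import torch

# List the CUDA architectures the installed PyTorch wheel was compiled for.
# Pascal support shows up as 'sm_60' and/or 'sm_61' in this list.
print(torch.cuda.get_arch_list())
# illustrative output: ['sm_60', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90']
```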
Pascal Architecture
Example test on 4 x P100 GPUs on a CUDA 12.2 system:
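A minimal sketch of such a test, assuming vLLM's offline LLM API and a hypothetical model choice; the four P100s are used via tensor parallelism, and float16 is set explicitly since Pascal has no bfloat16 support:

```python
from vllm import LLM, SamplingParams

# Shard the model across the 4 P100s; force half precision because
# Pascal GPUs do not support bfloat16.
llm = LLM(
    model="facebook/opt-6.7b",  # hypothetical choice; any supported model
    tensor_parallel_size=4,
    dtype="float16",
)

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["The Pascal architecture is"], params)
for out in outputs:
    print(out.outputs[0].text)
```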