Add compute capability 6.x support #2635

Closed
jasonacox wants to merge 11 commits

Contributor

@jasonacox jasonacox commented Jan 28, 2024

UPDATED 23-Apr-2024

This proposal adds a script, pascal.sh, that lets users add support for Pascal GPUs (e.g. GTX 1060, Tesla P100) to vLLM.

The script:

  • Adds 6.0, 6.1 and 6.2 (compute capability) GPU architectures to the CMakeLists.txt and Dockerfile files
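The kind of edit the script makes can be sketched in Python. This is only an illustration, not the contents of pascal.sh: it assumes the architecture list lives in a CMake line of the form `set(CUDA_SUPPORTED_ARCHS "...")` with semicolon-separated values, and the helper names `merge_archs` and `add_pascal_to_cmake` are hypothetical.

```python
import re

# Pascal compute capabilities named in this PR.
PASCAL_ARCHS = ["6.0", "6.1", "6.2"]


def merge_archs(existing: str, sep: str) -> str:
    """Prepend any missing Pascal archs to a separator-delimited arch list."""
    archs = [a for a in existing.split(sep) if a]
    missing = [a for a in PASCAL_ARCHS if a not in archs]
    return sep.join(missing + archs)


def add_pascal_to_cmake(text: str) -> str:
    """Patch a `set(CUDA_SUPPORTED_ARCHS "...")` line (assumed variable name)."""
    pattern = re.compile(r'(set\(CUDA_SUPPORTED_ARCHS\s+")([^"]*)("\))')

    def patch(match: re.Match) -> str:
        # Keep the surrounding CMake syntax, rewrite only the quoted list.
        return match.group(1) + merge_archs(match.group(2), ";") + match.group(3)

    return pattern.sub(patch, text)


if __name__ == "__main__":
    print(add_pascal_to_cmake('set(CUDA_SUPPORTED_ARCHS "7.0;7.5;8.0")'))
```

Because `merge_archs` skips architectures that are already present, running the patch twice leaves the file unchanged. The same helper could be pointed at a space-separated list, such as a Dockerfile build argument, by passing `" "` as the separator.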

Run:

# Add Pascal Support
./pascal.sh

# You can now build from source with Pascal GPU support:
pip install -e .

# or build the Docker image with:
DOCKER_BUILDKIT=1 docker build . --target vllm-openai --tag vllm/vllm-openai

Notes:

  • Pascal architectures are still supported by the latest CUDA.
  • I understand that adding 6.x support expands the wheel size beyond the limit, so this script gives users of this older architecture a way to use vLLM.

Build and run example:

# Build From Source
git clone https://github.com/vllm-project/vllm.git
cd vllm
./pascal.sh
pip install -e .

# Run OpenAI API Compatible Server
python3 -m vllm.entrypoints.openai.api_server \
    --tensor-parallel-size 4 \
    --worker-use-ray \
    --host 0.0.0.0 \
    --port 8080  \
    --model mistralai/Mistral-7B-Instruct-v0.1 \
    --served-model-name mistralai/Mistral-7B-Instruct-v0.1 \
    --dtype float \
    --max-model-len 20000
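Once the server is running, it can be queried like any OpenAI-compatible endpoint. The sketch below uses only the Python standard library. It assumes the server is reachable at `localhost` on the port given above and that the `/v1/completions` route is enabled; the helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request

# Assumption: the server started above is reachable locally on port 8080.
BASE_URL = "http://localhost:8080"
MODEL = "mistralai/Mistral-7B-Instruct-v0.1"  # matches --served-model-name


def build_completion_request(prompt: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a POST request for the completions endpoint."""
    payload = {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str) -> str:
    """Send the request and return the first completion's text."""
    with urllib.request.urlopen(build_completion_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


if __name__ == "__main__":
    # Requires the server to be up; otherwise urlopen raises URLError.
    print(complete("Pascal GPUs are"))
```

Building the request separately from sending it makes the payload easy to inspect without a running server.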

Related: #963 #1284

Thank you for the great project!!! 🙏

This adds the 6.x architectures to the supported list but also presents a warning that capabilities < 7.0 are untested and may have issues.
Failed build based on yapf - updating to suggested format:
NVIDIA_SUPPORTED_ARCHS = {
    "6.0", "6.1", "6.2", "7.0", "7.5", "8.0", "8.6", "8.9", "9.0"
}
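The behaviour described in the commit message (accept 6.x, but warn that anything below 7.0 is untested) could look roughly like the following. This is a sketch rather than the code in this PR: `check_arch` is a hypothetical helper, and only the `NVIDIA_SUPPORTED_ARCHS` set is taken from the text above.

```python
import warnings

NVIDIA_SUPPORTED_ARCHS = {
    "6.0", "6.1", "6.2", "7.0", "7.5", "8.0", "8.6", "8.9", "9.0"
}


def check_arch(arch: str) -> None:
    """Reject unsupported archs and warn about untested ones (< 7.0)."""
    if arch not in NVIDIA_SUPPORTED_ARCHS:
        raise ValueError(f"Unsupported compute capability: {arch}")
    # Compare as integer tuples so ordering does not depend on float parsing.
    major, minor = (int(part) for part in arch.split("."))
    if (major, minor) < (7, 0):
        warnings.warn(
            f"Compute capability {arch} is below 7.0: untested and may have issues.",
            stacklevel=2,
        )


if __name__ == "__main__":
    check_arch("6.1")  # emits a UserWarning
    check_arch("8.0")  # passes silently
```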
Contributor

@cduk cduk left a comment

Why do you delete some of the ROCM cards from the supported list?

@jasonacox
Contributor Author

> Why do you delete some of the ROCM cards from the supported list?

Hi @cduk - This PR doesn't touch the ROCM cards. It only adds the Nvidia Pascal (capability 6) cards. Here are the diffs:

[Images: two screenshots of the diff]

@cduk
Contributor

cduk commented Mar 18, 2024

Maybe I'm mistaken. I was looking at this diff:

[Image: screenshot of the diff]

@Fuckingnameless

what's the status on this?

@nkuhn-vmw

Hello, I'm very interested in this PR, as I am also running multiple P40s and would like to use vLLM.

@youkaichao
Member

Thanks for the effort. You can keep this branch, but this PR is not necessary. Closed.

@youkaichao youkaichao closed this Apr 24, 2024
@jasonacox
Contributor Author

No problem. Thanks @youkaichao

A permanent solution would be #4290.

5 participants