[Doc] Update description of vLLM support for CPUs #6003

Open · wants to merge 2 commits into base: main
2 changes: 1 addition & 1 deletion README.md
@@ -59,7 +59,7 @@ vLLM is flexible and easy to use with:
- Tensor parallelism support for distributed inference
- Streaming outputs
- OpenAI-compatible API server
- Support NVIDIA GPUs, AMD GPUs, Intel CPUs and GPUs
- Support NVIDIA GPUs, AMD CPUs and GPUs, Intel CPUs and GPUs, PowerPC CPUs
- (Experimental) Prefix caching support
- (Experimental) Multi-lora support

2 changes: 1 addition & 1 deletion docs/source/getting_started/cpu-installation.rst
@@ -20,7 +20,7 @@ Requirements

* OS: Linux
* Compiler: gcc/g++>=12.3.0 (optional, recommended)
-* Instruction set architecture (ISA) requirement: AVX512 is required.
+* Instruction set architecture (ISA) requirement: for x86, AVX2 is required; for PowerPC, Power9+ is required.
Collaborator (@mgoin) commented:

Since the AVX2 and PowerPC backends don't have active testing and this documentation assumes you are building on a machine with AVX512, I'm a little hesitant to update in this doc. Maybe if you call out specifically where you can get directions for AVX2 and PowerPC, and that this doc is assuming AVX512, that would be more clear.

Contributor Author replied:

Thanks @mgoin for the review.

The original description, "AVX512 is required", is misleading: it implies that vLLM needs the AVX512 ISA to run at all. That could limit adoption if people wrongly conclude that vLLM only supports AVX512 machines.

In fact, we have built and run vLLM on AVX2-only machines, so I would suggest removing "AVX512 is required" here.

However, if you are against adding PowerPC CPUs here, that is fine, since we don't have PowerPC hardware and don't test it at all.

Thanks.

Contributor Author replied:

> Since the AVX2 and PowerPC backends don't have active testing and this documentation assumes you are building on a machine with AVX512, I'm a little hesitant to update in this doc. Maybe if you call out specifically where you can get directions for AVX2 and PowerPC, and that this doc is assuming AVX512, that would be more clear.

Hi @mgoin, I kept AVX512 in the doc, but marked it as "(optional, recommended)", matching how the compiler requirement is described.
What do you think?
Thanks.
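The ISA tiers discussed in this thread (AVX512 as the tested path, AVX2 as a supported fallback on x86) could be distinguished at build time by inspecting the CPU's feature flags. Below is a minimal, hypothetical sketch of such a check; `detect_x86_isa` is an illustrative name, not part of vLLM or its build scripts, and it assumes a Linux-style `/proc/cpuinfo` with a `flags` line:

```python
# Hypothetical helper (not part of vLLM): pick a CPU backend tier by
# parsing the "flags" line of a Linux /proc/cpuinfo dump, the same
# signal a build script could use to choose between AVX512 and AVX2.
def detect_x86_isa(cpuinfo_text: str) -> str:
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
            break  # flags are identical across cores; first line suffices
    if "avx512f" in flags:
        return "avx512"  # the path this doc's build instructions assume
    if "avx2" in flags:
        return "avx2"    # buildable, but less actively tested
    return "unsupported"
```

On a Linux machine this could be fed the contents of `/proc/cpuinfo` directly, e.g. `detect_x86_isa(open("/proc/cpuinfo").read())`.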


.. _cpu_backend_quick_start_dockerfile:
