
Add ray[default] and wget to run distributed inference out of the box #11265

Merged: 1 commit merged into vllm-project:main from jiaxin/add-ray-related-libs on Dec 20, 2024

Conversation

@Jeffwan (Contributor) commented on Dec 17, 2024

part of #11137

Why do we need this PR?

  1. Users want to use the default vLLM image to manage the RayCluster version, instead of maintaining separate Ray images and vLLM distributions, which frequently surfaces version-compatibility problems.
  2. The ray[default] package provides additional capabilities, such as the job submission API and the dashboard, which are essential for running vLLM distributed inference in a cluster environment (see the sketch after this list).
  3. wget is used in the KubeRay probes: https://github.com/ray-project/kuberay/blob/e595ee4c6297fb6b385421f7ca34fbd7c1c0b49f/ray-operator/controllers/ray/common/pod.go#L253 and https://github.com/ray-project/kuberay/blob/e595ee4c6297fb6b385421f7ca34fbd7c1c0b49f/ray-operator/controllers/ray/utils/constant.go#L178. KubeRay could be changed to use curl instead, but adding one small package does not take much room here.
  4. Adding wget and ray[default] is low-risk; some other files in the repo already include these packages.
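
As a rough illustration of point 2 (not part of this PR's diff): ray[default], unlike the bare ray package, ships the dashboard and the job-submission server that cluster tooling such as KubeRay talks to. The head-service URL, model name, and tensor-parallel size below are hypothetical placeholders.

from ray.job_submission import JobSubmissionClient

# Point the client at the Ray dashboard exposed by the RayCluster head.
# "raycluster-head-svc" is a hypothetical in-cluster service name.
client = JobSubmissionClient("http://raycluster-head-svc:8265")

# Submit a vLLM workload as a Ray job; this API is only served when
# ray[default] (dashboard + job-submission server) is installed on the head.
job_id = client.submit_job(
    entrypoint=("python -m vllm.entrypoints.openai.api_server "
                "--model facebook/opt-125m --tensor-parallel-size 2"),
)
print(client.get_job_status(job_id))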



👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which executes a small, essential subset of CI tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add the ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added the ci/build label Dec 17, 2024
…de case

This helps users run vLLM with the Ray distributed executor, using the default vLLM image out of the box (see the sketch below).

Signed-off-by: Jiaxin Shan <[email protected]>
@Jeffwan force-pushed the jiaxin/add-ray-related-libs branch from 2a7d504 to a4328a4 on December 17, 2024 18:14
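
For context on the commit message above, here is a minimal sketch of what running vLLM with the Ray distributed executor looks like from inside the default image; the model name and tensor-parallel size are placeholders, and exact option names may vary by vLLM version.

from vllm import LLM, SamplingParams

# Shard the model across 2 workers and coordinate them via Ray
# instead of the default multiprocessing executor.
llm = LLM(
    model="facebook/opt-125m",
    tensor_parallel_size=2,
    distributed_executor_backend="ray",
)

outputs = llm.generate(["Hello, my name is"],
                       SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)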
@Jeffwan (Contributor, Author) commented on Dec 18, 2024

[screenshot]

This change is not directly related to the issue shown above. I verified the test on my end and it works fine:

async def test_guided_choice_chat(client: openai.AsyncOpenAI,
                                  guided_decoding_backend: str,
                                  sample_guided_choice):
    messages = [{
        "role": "system",
        "content": "you are a helpful assistant"
    }, {
        "role": "user",
        "content": "The best language for type-safe systems programming is "
    }]
    chat_completion = await client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages,
        max_completion_tokens=10,
        extra_body=dict(guided_choice=sample_guided_choice,
                        guided_decoding_backend=guided_decoding_backend))
    choice1 = chat_completion.choices[0].message.content
    assert choice1 in sample_guided_choice

    messages.append({"role": "assistant", "content": choice1})
    messages.append({
        "role": "user",
        "content": "I disagree, pick another one"
    })
    chat_completion = await client.chat.completions.create(
        model=MODEL_NAME,
        messages=messages,
        max_completion_tokens=10,
        extra_body=dict(guided_choice=sample_guided_choice,
                        guided_decoding_backend=guided_decoding_backend))
    choice2 = chat_completion.choices[0].message.content
    assert choice2 in sample_guided_choice
    assert choice1 != choice2

[screenshot]

@simon-mo merged commit 47a0b61 into vllm-project:main on Dec 20, 2024
20 of 22 checks passed
@Jeffwan deleted the jiaxin/add-ray-related-libs branch on December 20, 2024 23:34
lucas-tucker pushed a commit to lucas-tucker/vllm-lucas-tucker that referenced this pull request Dec 21, 2024
BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024