
Compatibility with vLLM with tensor_parallel_size argument #805

Merged · 10 commits into develop from compatibility-vllm-tensor-parallel-size · Jul 23, 2024

Conversation

gabrielmbmb (Member) commented Jul 22, 2024

Description

This PR adds a few changes that enable using vLLM with the tensor_parallel_size argument. This argument enables the use of multiple GPUs with vLLM, which relies on either multiprocessing or Ray to do so:

  1. When using the multiprocessing approach, it did not work out of the box: vLLM tries to create new processes, but it cannot because the process that distilabel creates is a daemon process, which is not allowed to create child processes. To bypass this issue, a _NoDaemonPool class has been created that spawns non-daemon processes, and it is now used in Pipeline (see the sketch after this list).
  2. With Ray, it works when installing the version from vLLM's main branch, which includes the changes from "[Core] Introduce SPMD worker execution using Ray accelerated DAG" (vllm-project/vllm#6032). In addition, the VLLM_USE_RAY_COMPILED_DAG=1 and VLLM_USE_RAY_SPMD_WORKER=1 environment variables need to be set (see the usage sketch after this list). Update: vllm==0.5.3 has been released, which includes the needed changes.
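
For reference, this is a minimal sketch of the non-daemon pool pattern described in point 1. It illustrates the general technique; the actual _NoDaemonPool implementation added in this PR may differ:

```python
import multiprocessing
import multiprocessing.pool


class _NoDaemonProcess(multiprocessing.Process):
    # Always report daemon=False so this process is allowed to create
    # child processes (e.g. the workers that vLLM spawns).
    @property
    def daemon(self) -> bool:
        return False

    @daemon.setter
    def daemon(self, value: bool) -> None:
        pass


class _NoDaemonContext(type(multiprocessing.get_context())):
    # Same start method as the default context, but processes created
    # through it are the non-daemon variant above.
    Process = _NoDaemonProcess


class _NoDaemonPool(multiprocessing.pool.Pool):
    # `multiprocessing.pool.Pool` creates its workers through the given
    # context, so passing the non-daemon context yields non-daemon workers.
    def __init__(self, *args, **kwargs):
        kwargs["context"] = _NoDaemonContext()
        super().__init__(*args, **kwargs)
```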
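
From the user side, point 2 would look roughly like the sketch below. It assumes vllm>=0.5.3 with ray installed, and that the distilabel vLLM wrapper forwards extra_kwargs to the underlying vllm.LLM; the model name is only a placeholder:

```python
import os

# Required so that vLLM runs its tensor-parallel workers through Ray's
# SPMD execution and compiled DAG (see vllm-project/vllm#6032).
os.environ["VLLM_USE_RAY_COMPILED_DAG"] = "1"
os.environ["VLLM_USE_RAY_SPMD_WORKER"] = "1"

from distilabel.llms import vLLM

llm = vLLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model
    # Shard the model across two GPUs; forwarded to vllm.LLM.
    extra_kwargs={"tensor_parallel_size": 2},
)
llm.load()  # normally called by the pipeline when the step loads
```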

@gabrielmbmb gabrielmbmb added the enhancement New feature or request label Jul 22, 2024
@gabrielmbmb gabrielmbmb added this to the 1.3.0 milestone Jul 22, 2024
@gabrielmbmb gabrielmbmb requested a review from plaguss July 22, 2024 14:28
@gabrielmbmb gabrielmbmb self-assigned this Jul 22, 2024

Documentation for this PR has been built. You can view it at: https://distilabel.argilla.io/pr-805/


codspeed-hq bot commented Jul 22, 2024

CodSpeed Performance Report

Merging #805 will not alter performance

Comparing compatibility-vllm-tensor-parallel-size (c0ece53) with develop (ea1c44b)

Summary

✅ 1 untouched benchmark

Review comment on src/distilabel/pipeline/local.py (outdated, resolved)
@gabrielmbmb gabrielmbmb merged commit b7f124f into develop Jul 23, 2024
5 of 7 checks passed
@gabrielmbmb gabrielmbmb deleted the compatibility-vllm-tensor-parallel-size branch July 23, 2024 14:27