[Bugfix] No num_gpus for ROCm and XPU when connecting to a ray cluster
I see no reason to special-case ROCm and XPU here: when connecting to an
existing Ray cluster, they should behave no differently from CUDA or any
other platform.

Error log:

  File "vllm/vllm/engine/llm_engine.py", line 528, in _get_executor_cls
    initialize_ray_cluster(engine_config.parallel_config)
  File "vllm/vllm/executor/ray_utils.py", line 230, in initialize_ray_cluster
    ray.init(address=ray_address,
  File "venv/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "venv/lib/python3.11/site-packages/ray/_private/worker.py", line 1689, in init
    raise ValueError(
ValueError: When connecting to an existing cluster, num_cpus and num_gpus must not be provided.

Signed-off-by: Hollow Man <[email protected]>
HollowMan6 committed Sep 24, 2024
1 parent 2467b64 commit ee0064a
Showing 1 changed file with 2 additions and 7 deletions.
9 changes: 2 additions & 7 deletions vllm/executor/ray_utils.py
@@ -10,7 +10,7 @@
 from vllm.logger import init_logger
 from vllm.platforms import current_platform
 from vllm.sequence import ExecuteModelRequest, IntermediateTensors
-from vllm.utils import get_ip, is_hip, is_xpu
+from vllm.utils import get_ip
 from vllm.worker.worker_base import WorkerWrapperBase
 
 logger = init_logger(__name__)
@@ -226,12 +226,7 @@ def initialize_ray_cluster(
     assert_ray_available()
 
     # Connect to a ray cluster.
-    if is_hip() or is_xpu():
-        ray.init(address=ray_address,
-                 ignore_reinit_error=True,
-                 num_gpus=parallel_config.world_size)
-    else:
-        ray.init(address=ray_address, ignore_reinit_error=True)
+    ray.init(address=ray_address, ignore_reinit_error=True)
 
     if parallel_config.placement_group:
         # Placement group is already set.
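
The rule behind the error log can be sketched in isolation. The snippet below is an illustration only, not code from vLLM or Ray: `build_ray_init_kwargs` and its `local_num_gpus` parameter are hypothetical names. It assembles keyword arguments for `ray.init()` and leaves out `num_gpus` whenever an address is given, since that is the case Ray rejects with the `ValueError` above. It does not import Ray, so it runs anywhere. One limitation worth noting: Ray may also pick up an existing cluster from the `RAY_ADDRESS` environment variable even when no address is passed, and this sketch does not handle that case.

```python
from typing import Any, Dict, Optional


def build_ray_init_kwargs(ray_address: Optional[str],
                          local_num_gpus: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper (not part of vLLM): assemble kwargs for ray.init()."""
    kwargs: Dict[str, Any] = {
        "address": ray_address,
        "ignore_reinit_error": True,
    }
    # Resource counts are only added when no address is given, i.e. when this
    # process is expected to start a fresh local Ray instance. With an address,
    # Ray connects to an existing cluster and rejects num_cpus/num_gpus.
    if ray_address is None and local_num_gpus is not None:
        kwargs["num_gpus"] = local_num_gpus
    return kwargs


# Connecting to a cluster: no resource counts are forwarded.
print(build_ray_init_kwargs("auto", local_num_gpus=4))
# No address: the GPU count is forwarded.
print(build_ray_init_kwargs(None, local_num_gpus=4))
```

The commit itself takes the simpler route of never passing `num_gpus` from `initialize_ray_cluster`, which avoids the branch entirely.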