forked from vllm-project/vllm
[Installation]: Building wheel for vllm (pyproject.toml) did not run successfully #13
Open
The terminal shows the following output:
Building wheels for collected packages: vllm
Building wheel for vllm (pyproject.toml) ... error
error: subprocess-exited-with-error
× Building wheel for vllm (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [587 lines of output]
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:258: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-311
creating build/lib.linux-x86_64-cpython-311/vllm
copying vllm/sequence.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/config.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/block.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/scalar_type.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/version.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/envs.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/sampling_params.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/pooling_params.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/outputs.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/_custom_ops.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/commit_id.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/logger.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/connections.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/scripts.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/utils.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/tracing.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/_ipex_ops.py -> build/lib.linux-x86_64-cpython-311/vllm
copying vllm/_core_ext.py -> build/lib.linux-x86_64-cpython-311/vllm
creating build/lib.linux-x86_64-cpython-311/vllm/transformers_utils
copying vllm/transformers_utils/detokenizer.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils
copying vllm/transformers_utils/config.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils
copying vllm/transformers_utils/tokenizer.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils
copying vllm/transformers_utils/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils
copying vllm/transformers_utils/image_processor.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils
creating build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/lora.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/layers.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/worker_manager.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/fully_sharded_layers.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/request.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/models.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
copying vllm/lora/punica.py -> build/lib.linux-x86_64-cpython-311/vllm/lora
creating build/lib.linux-x86_64-cpython-311/vllm/prompt_adapter
copying vllm/prompt_adapter/layers.py -> build/lib.linux-x86_64-cpython-311/vllm/prompt_adapter
copying vllm/prompt_adapter/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/prompt_adapter
copying vllm/prompt_adapter/worker_manager.py -> build/lib.linux-x86_64-cpython-311/vllm/prompt_adapter
copying vllm/prompt_adapter/request.py -> build/lib.linux-x86_64-cpython-311/vllm/prompt_adapter
copying vllm/prompt_adapter/models.py -> build/lib.linux-x86_64-cpython-311/vllm/prompt_adapter
creating build/lib.linux-x86_64-cpython-311/vllm/multimodal
copying vllm/multimodal/base.py -> build/lib.linux-x86_64-cpython-311/vllm/multimodal
copying vllm/multimodal/image.py -> build/lib.linux-x86_64-cpython-311/vllm/multimodal
copying vllm/multimodal/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/multimodal
copying vllm/multimodal/registry.py -> build/lib.linux-x86_64-cpython-311/vllm/multimodal
copying vllm/multimodal/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/multimodal
creating build/lib.linux-x86_64-cpython-311/vllm/attention
copying vllm/attention/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/attention
copying vllm/attention/layer.py -> build/lib.linux-x86_64-cpython-311/vllm/attention
copying vllm/attention/selector.py -> build/lib.linux-x86_64-cpython-311/vllm/attention
creating build/lib.linux-x86_64-cpython-311/vllm/distributed
copying vllm/distributed/communication_op.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed
copying vllm/distributed/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed
copying vllm/distributed/parallel_state.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed
copying vllm/distributed/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed
creating build/lib.linux-x86_64-cpython-311/vllm/usage
copying vllm/usage/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/usage
copying vllm/usage/usage_lib.py -> build/lib.linux-x86_64-cpython-311/vllm/usage
creating build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/gpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/openvino_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/multiproc_gpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/multiproc_worker_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/executor_base.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/tpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/neuron_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/xpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/cpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/ray_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/ray_gpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/distributed_gpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/ray_tpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
copying vllm/executor/ray_xpu_executor.py -> build/lib.linux-x86_64-cpython-311/vllm/executor
creating build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/target_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/spec_decode_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/util.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/smaller_tp_proposer_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/mlp_speculator_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/metrics.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/multi_step_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/interfaces.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/top1_proposer.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/draft_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/medusa_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/batch_expansion.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/proposer_worker_base.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
copying vllm/spec_decode/ngram_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/spec_decode
creating build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
copying vllm/adapter_commons/layers.py -> build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
copying vllm/adapter_commons/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
copying vllm/adapter_commons/worker_manager.py -> build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
copying vllm/adapter_commons/request.py -> build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
copying vllm/adapter_commons/models.py -> build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
copying vllm/adapter_commons/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/adapter_commons
creating build/lib.linux-x86_64-cpython-311/vllm/entrypoints
copying vllm/entrypoints/launcher.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints
copying vllm/entrypoints/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints
copying vllm/entrypoints/logger.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints
copying vllm/entrypoints/llm.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints
copying vllm/entrypoints/api_server.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints
copying vllm/entrypoints/chat_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor
copying vllm/model_executor/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor
copying vllm/model_executor/custom_op.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor
copying vllm/model_executor/sampling_metadata.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor
copying vllm/model_executor/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor
copying vllm/model_executor/pooling_metadata.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor
creating build/lib.linux-x86_64-cpython-311/vllm/assets
copying vllm/assets/base.py -> build/lib.linux-x86_64-cpython-311/vllm/assets
copying vllm/assets/image.py -> build/lib.linux-x86_64-cpython-311/vllm/assets
copying vllm/assets/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/assets
creating build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/scheduler.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/block_manager_v1.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/embedding_model_block_manager.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/interfaces.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/evictor_v2.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/evictor_v1.py -> build/lib.linux-x86_64-cpython-311/vllm/core
copying vllm/core/block_manager_v2.py -> build/lib.linux-x86_64-cpython-311/vllm/core
creating build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/protocol.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/metrics.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/arg_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/async_llm_engine.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/async_timeout.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
copying vllm/engine/llm_engine.py -> build/lib.linux-x86_64-cpython-311/vllm/engine
creating build/lib.linux-x86_64-cpython-311/vllm/inputs
copying vllm/inputs/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/inputs
copying vllm/inputs/registry.py -> build/lib.linux-x86_64-cpython-311/vllm/inputs
copying vllm/inputs/data.py -> build/lib.linux-x86_64-cpython-311/vllm/inputs
creating build/lib.linux-x86_64-cpython-311/vllm/platforms
copying vllm/platforms/cuda.py -> build/lib.linux-x86_64-cpython-311/vllm/platforms
copying vllm/platforms/interface.py -> build/lib.linux-x86_64-cpython-311/vllm/platforms
copying vllm/platforms/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/platforms
copying vllm/platforms/tpu.py -> build/lib.linux-x86_64-cpython-311/vllm/platforms
copying vllm/platforms/rocm.py -> build/lib.linux-x86_64-cpython-311/vllm/platforms
creating build/lib.linux-x86_64-cpython-311/vllm/triton_utils
copying vllm/triton_utils/importing.py -> build/lib.linux-x86_64-cpython-311/vllm/triton_utils
copying vllm/triton_utils/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/triton_utils
copying vllm/triton_utils/libentry.py -> build/lib.linux-x86_64-cpython-311/vllm/triton_utils
copying vllm/triton_utils/sample.py -> build/lib.linux-x86_64-cpython-311/vllm/triton_utils
copying vllm/triton_utils/custom_cache_manager.py -> build/lib.linux-x86_64-cpython-311/vllm/triton_utils
creating build/lib.linux-x86_64-cpython-311/vllm/logging
copying vllm/logging/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/logging
copying vllm/logging/formatter.py -> build/lib.linux-x86_64-cpython-311/vllm/logging
creating build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/tpu_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/openvino_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/cpu_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/neuron_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/model_runner_base.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/xpu_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/embedding_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/xpu_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/neuron_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/worker_base.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/cpu_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/tpu_model_runner.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/worker.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/cache_engine.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
copying vllm/worker/openvino_worker.py -> build/lib.linux-x86_64-cpython-311/vllm/worker
creating build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/dbrx.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/internvl.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/chatglm.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/jais.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/mlp_speculator.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/medusa.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/nemotron.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/falcon.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/mpt.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
copying vllm/transformers_utils/configs/arctic.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/configs
creating build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizers
copying vllm/transformers_utils/tokenizers/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizers
copying vllm/transformers_utils/tokenizers/baichuan.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizers
creating build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizer_group
copying vllm/transformers_utils/tokenizer_group/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizer_group
copying vllm/transformers_utils/tokenizer_group/ray_tokenizer_group.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizer_group
copying vllm/transformers_utils/tokenizer_group/base_tokenizer_group.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizer_group
copying vllm/transformers_utils/tokenizer_group/tokenizer_group.py -> build/lib.linux-x86_64-cpython-311/vllm/transformers_utils/tokenizer_group
creating build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/bgmv_expand_slice.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/bgmv_expand.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/bgmv_shrink.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/sgmv_expand_slice.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/sgmv_shrink.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/sgmv_expand.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
copying vllm/lora/ops/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/lora/ops
creating build/lib.linux-x86_64-cpython-311/vllm/attention/ops
copying vllm/attention/ops/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops
copying vllm/attention/ops/ipex_attn.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops
copying vllm/attention/ops/paged_attn.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops
copying vllm/attention/ops/prefix_prefill.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops
copying vllm/attention/ops/triton_flash_attention.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops
creating build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/torch_sdpa.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/flashinfer.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/abstract.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/blocksparse_attn.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/xformers.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/flash_attn.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/ipex_attn.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/rocm_flash_attn.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/pallas.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
copying vllm/attention/backends/openvino.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/backends
creating build/lib.linux-x86_64-cpython-311/vllm/attention/ops/blocksparse_attention
copying vllm/attention/ops/blocksparse_attention/interface.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops/blocksparse_attention
copying vllm/attention/ops/blocksparse_attention/blocksparse_attention_kernel.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops/blocksparse_attention
copying vllm/attention/ops/blocksparse_attention/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops/blocksparse_attention
copying vllm/attention/ops/blocksparse_attention/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/attention/ops/blocksparse_attention
creating build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/tpu_communicator.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/custom_all_reduce.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/shm_broadcast.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/custom_all_reduce_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/pynccl_wrapper.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/cuda_wrapper.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
copying vllm/distributed/device_communicators/pynccl.py -> build/lib.linux-x86_64-cpython-311/vllm/distributed/device_communicators
creating build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/cli_args.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/serving_tokenization.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/serving_chat.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/protocol.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/serving_embedding.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/serving_engine.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/run_batch.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/serving_completion.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/api_server.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
copying vllm/entrypoints/openai/logits_processors.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai
creating build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai/rpc
copying vllm/entrypoints/openai/rpc/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai/rpc
copying vllm/entrypoints/openai/rpc/client.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai/rpc
copying vllm/entrypoints/openai/rpc/server.py -> build/lib.linux-x86_64-cpython-311/vllm/entrypoints/openai/rpc
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/opt.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/dbrx.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/intern_vit.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/qwen2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/siglip.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/deepseek_v2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/gpt_j.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/commandr.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/chameleon.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/decilm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/gpt_bigcode.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/internvl.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/gemma2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/olmo.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/minicpmv.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/qwen.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/persimmon.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/chatglm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/internlm2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/mixtral_quant.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/idefics2_vision_model.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/phi3v.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/phi.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/gpt_neox.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/stablelm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/llama.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/orion.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/jais.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/blip2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/starcoder2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/mixtral.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/minicpm3.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/gemma.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/gpt2.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/interfaces.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/mlp_speculator.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/baichuan.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/llava.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/bloom.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/medusa.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/nemotron.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/minicpm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/jamba.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/llava_next.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/clip.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/fuyu.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/qwen2_moe.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/llama_embedding.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/phi3_small.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/deepseek.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/falcon.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/blip.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/paligemma.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/mpt.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/xverse.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/arctic.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
copying vllm/model_executor/models/na_vit.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/models
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/loader.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/tensorizer.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/neuron.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/weight_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
copying vllm/model_executor/model_loader/openvino.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/model_loader
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/guided_decoding
copying vllm/model_executor/guided_decoding/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/guided_decoding
copying vllm/model_executor/guided_decoding/outlines_decoding.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/guided_decoding
copying vllm/model_executor/guided_decoding/lm_format_enforcer_decoding.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/guided_decoding
copying vllm/model_executor/guided_decoding/guided_fields.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/guided_decoding
copying vllm/model_executor/guided_decoding/outlines_logits_processors.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/guided_decoding
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/typical_acceptance_sampler.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/sampler.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/activation.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/logits_processor.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/layernorm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/linear.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/spec_decode_base_sampler.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/pooler.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/rotary_embedding.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/rejection_sampler.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
copying vllm/model_executor/layers/vocab_parallel_embedding.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe
copying vllm/model_executor/layers/fused_moe/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe
copying vllm/model_executor/layers/fused_moe/layer.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe
copying vllm/model_executor/layers/fused_moe/moe_pallas.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe
copying vllm/model_executor/layers/fused_moe/fused_moe.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/base_config.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/marlin.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/squeezellm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/deepspeedfp.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/qqq.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/kv_cache.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/gptq.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/fbgemm_fp8.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/awq.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/schema.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/gptq_marlin.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/aqlm.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/awq_marlin.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/fp8.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/gptq_marlin_24.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
copying vllm/model_executor/layers/quantization/bitsandbytes.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/ops
copying vllm/model_executor/layers/ops/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/ops
copying vllm/model_executor/layers/ops/sample.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/ops
copying vllm/model_executor/layers/ops/rand.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/ops
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors
copying vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors
copying vllm/model_executor/layers/quantization/compressed_tensors/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors
copying vllm/model_executor/layers/quantization/compressed_tensors/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/marlin_utils_test.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/marlin_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/marlin_utils_test_qqq.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/marlin_utils_fp8.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/quant_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/marlin_utils_test_24.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
copying vllm/model_executor/layers/quantization/utils/w8a8_utils.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/utils
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w8a8_int8.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_scheme.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_unquantized.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w4a16_24.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w8a16_fp8.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_wNa16.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
copying vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_w8a8_fp8.py -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/quantization/compressed_tensors/schemes
creating build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/naive_block.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/block_table.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/cpu_gpu_block_allocator.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/prefix_caching_block.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/interfaces.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/utils.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
copying vllm/core/block/common.py -> build/lib.linux-x86_64-cpython-311/vllm/core/block
creating build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/engine/output_processor/util.py -> build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/engine/output_processor/multi_step.py -> build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/engine/output_processor/__init__.py -> build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/engine/output_processor/single_step.py -> build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/engine/output_processor/interfaces.py -> build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/engine/output_processor/stop_checker.py -> build/lib.linux-x86_64-cpython-311/vllm/engine/output_processor
copying vllm/py.typed -> build/lib.linux-x86_64-cpython-311/vllm
creating build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-40GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=8192,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=14336,device_name=AMD_Instinct_MI300X.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=2048,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=4096,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=16,N=1344,device_name=NVIDIA_A100-SXM4-40GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=14336,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-40GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=16,N=1344,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=2048,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=64,N=1280,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=4096,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=64,N=640,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=2048,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=4096,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=16,N=1344,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=float8.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=7168,device_name=AMD_Instinct_MI300X.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=16,N=2688,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=16,N=2688,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=1792,device_name=AMD_Instinct_MI300X.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=64,N=1280,device_name=NVIDIA_H100_80GB_HBM3.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=64,N=640,device_name=NVIDIA_A100-SXM4-80GB.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
copying vllm/model_executor/layers/fused_moe/configs/E=8,N=3584,device_name=AMD_Instinct_MI300X.json -> build/lib.linux-x86_64-cpython-311/vllm/model_executor/layers/fused_moe/configs
running build_ext
-- The CXX compiler identification is GNU 9.4.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build type: RelWithDebInfo
-- Target device: cuda
-- Found Python: /home/user/anaconda3/envs/cpm/bin/python3.11 (found version "3.11.0") found components: Interpreter Development.Module Development.SABIModule
-- Found python matching: /home/user/anaconda3/envs/cpm/bin/python3.11.
-- Found CUDA: /usr/local/cuda-11.7 (found version "11.7")
CMake Error at /tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeDetermineCompilerId.cmake:838 (message):
Compiling the CUDA compiler identification source file
"CMakeCUDACompilerId.cu" failed.
Compiler: /usr/bin/nvcc
Build flags:
Id flags: --keep;--keep-dir;tmp -v
The output was:
255
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/usr/lib/nvidia-cuda-toolkit/bin
#$ _THERE_=/usr/lib/nvidia-cuda-toolkit/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_SIZE_=64
#$ NVVMIR_LIBRARY_DIR=/usr/lib/nvidia-cuda-toolkit/libdevice
#$
PATH=/usr/lib/nvidia-cuda-toolkit/bin:/tmp/pip-build-env-_q6a864r/overlay/bin:/tmp/pip-build-env-_q6a864r/normal/bin:/home/user/.vscode-server/cli/servers/Stable-4849ca9bdf9666755eb463db297b69e5385090e3/server/bin/remote-cli:/usr/local/cuda-11.7/bin:/home/user/anaconda3/envs/cpm/bin:/home/user/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/user/anaconda3/envs/pytorch38/bin:/usr/local/cuda-11.7/bin:/home/user/.vscode-server/cli/servers/Stable-4849ca9bdf9666755eb463db297b69e5385090e3/server/bin/remote-cli:/usr/local/cuda-11.7/bin:/home/user/anaconda3/bin:/home/user/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/user/anaconda3/envs/pytorch38/bin:/home/user/.vscode-server/cli/servers/Stable-4849ca9bdf9666755eb463db297b69e5385090e3/server/bin/remote-cli:/usr/local/cuda-11.7/bin:/home/user/anaconda3/bin:/home/user/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/user/anaconda3/envs/pytorch38/bin:/home/user/anaconda3/envs/pytorch38/bin
#$ LIBRARIES= -L/usr/lib/x86_64-linux-gnu/stubs -L/usr/lib/x86_64-linux-gnu
#$ rm tmp/a_dlink.reg.c
#$ gcc -D__CUDA_ARCH__=300 -E -x c++ -DCUDA_DOUBLE_MATH_FUNCTIONS
-D__CUDACC__ -D__NVCC__ -D__CUDACC_VER_MAJOR__=10 -D__CUDACC_VER_MINOR__=1
-D__CUDACC_VER_BUILD__=243 -include "cuda_runtime.h" -m64
"CMakeCUDACompilerId.cu" > "tmp/CMakeCUDACompilerId.cpp1.ii"
#$ cicc --c++14 --gnu_version=90400 --allow_managed -arch compute_30 -m64
-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name
"CMakeCUDACompilerId.fatbin.c" -tused -nvvmir-library
"/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc"
--gen_module_id_file --module_id_file_name
"tmp/CMakeCUDACompilerId.module_id" --orig_src_file_name
"CMakeCUDACompilerId.cu" --gen_c_file_name
"tmp/CMakeCUDACompilerId.cudafe1.c" --stub_file_name
"tmp/CMakeCUDACompilerId.cudafe1.stub.c" --gen_device_file_name
"tmp/CMakeCUDACompilerId.cudafe1.gpu" "tmp/CMakeCUDACompilerId.cpp1.ii" -o
"tmp/CMakeCUDACompilerId.ptx"
#$ ptxas -arch=sm_30 -m64 "tmp/CMakeCUDACompilerId.ptx" -o
"tmp/CMakeCUDACompilerId.sm_30.cubin"
ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
# --error 0xff --
Call Stack (most recent call first):
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/cmake/data/share/cmake-3.30/Modules/CMakeDetermineCUDACompiler.cmake:131 (CMAKE_DETERMINE_COMPILER_ID)
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/torch/share/cmake/Caffe2/public/cuda.cmake:47 (enable_language)
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
CMakeLists.txt:67 (find_package)
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "/home/user/anaconda3/envs/cpm/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/user/anaconda3/envs/cpm/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/anaconda3/envs/cpm/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 421, in build_wheel
return self._build_with_temp_dir(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 403, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 318, in run_setup
exec(code, locals())
File "<string>", line 456, in <module>
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/__init__.py", line 117, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 184, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
super().run_command(command)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/command/bdist_wheel.py", line 384, in run
self.run_command("build")
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
super().run_command(command)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
super().run_command(command)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
cmd_obj.run()
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 98, in run
_build_ext.run(self)
File "/tmp/pip-build-env-_q6a864r/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
self.build_extensions()
File "<string>", line 219, in build_extensions
File "<string>", line 201, in configure
File "/home/user/anaconda3/envs/cpm/lib/python3.11/subprocess.py", line 413, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-req-build-jvw7lpre', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/pip-req-build-jvw7lpre/build/lib.linux-x86_64-cpython-311/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=build/temp.linux-x86_64-cpython-311', '-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=/home/user/anaconda3/envs/cpm/bin/python3.11', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=24']' returned non-zero exit status 1.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for vllm
Failed to build vllm
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (vllm)
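One possible reading of the log above: CMake reports `Found CUDA: /usr/local/cuda-11.7`, but the compiler it actually invokes is `/usr/bin/nvcc`, whose macros say `__CUDACC_VER_MAJOR__=10` and `__CUDACC_VER_MINOR__=1`. That older nvcc defaults to `sm_30`, and the `ptxas` it ends up calling rejects that architecture, which is consistent with two CUDA installations being mixed on `PATH`. The sketch below only pulls those two version numbers out of a log excerpt and compares them. The helper names (`toolkit_version`, `nvcc_version`, `versions_mismatch`) are made up for illustration and are not part of vllm, pip, or CMake.

```python
import re

# Lines copied from the build log above; nothing here is re-measured.
LOG_EXCERPT = '''
-- Found CUDA: /usr/local/cuda-11.7 (found version "11.7")
Compiler: /usr/bin/nvcc
-D__CUDACC__ -D__NVCC__ -D__CUDACC_VER_MAJOR__=10 -D__CUDACC_VER_MINOR__=1
ptxas fatal : Value 'sm_30' is not defined for option 'gpu-name'
'''


def toolkit_version(log: str):
    """Return (major, minor) of the toolkit CMake says it found, or None."""
    m = re.search(r'Found CUDA: \S+ \(found version "(\d+)\.(\d+)"\)', log)
    return (int(m.group(1)), int(m.group(2))) if m else None


def nvcc_version(log: str):
    """Return (major, minor) that nvcc passed as -D macros, or None."""
    major = re.search(r'__CUDACC_VER_MAJOR__=(\d+)', log)
    minor = re.search(r'__CUDACC_VER_MINOR__=(\d+)', log)
    if major and minor:
        return (int(major.group(1)), int(minor.group(1)))
    return None


def versions_mismatch(log: str) -> bool:
    """True only when both versions are present in the log and differ."""
    found, nvcc = toolkit_version(log), nvcc_version(log)
    return found is not None and nvcc is not None and found != nvcc


if __name__ == "__main__":
    print("toolkit found by CMake:", toolkit_version(LOG_EXCERPT))
    print("nvcc actually invoked: ", nvcc_version(LOG_EXCERPT))
    print("mismatch:", versions_mismatch(LOG_EXCERPT))
```

If the mismatch is indeed the cause, one thing worth trying (an assumption, not verified on this machine) is pointing the build at the 11.7 compiler before reinstalling, for example by exporting `CUDACXX=/usr/local/cuda-11.7/bin/nvcc`, which CMake reads when selecting the CUDA compiler.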