
vllm 0.4.2 error: Model architectures ['MiniCPMV'] are not supported for now #82

Closed
ye7love7 opened this issue May 9, 2024 · 3 comments
Labels: documentation, feature

Comments

ye7love7 commented May 9, 2024

(infer) tskj@tskj:~/project$ python -m vllm.entrypoints.openai.api_server --model /home/tskj/MOD/MiniCPM-V-2 --trust-remote-code --host 0.0.0.0 --port 9998 --gpu-memory-utilization 0.45
INFO 05-09 07:46:56 api_server.py:149] vLLM API server version 0.4.0.post1
INFO 05-09 07:46:56 api_server.py:150] args: Namespace(host='0.0.0.0', port=9998, uvicorn_log_level='info', allow_credentials=False, allowed_origins=[''], allowed_methods=[''], allowed_headers=['*'], api_key=None, served_model_name=None, lora_modules=None, chat_template=None, response_role='assistant', ssl_keyfile=None, ssl_certfile=None, ssl_ca_certs=None, ssl_cert_reqs=0, root_path=None, middleware=[], model='/home/tskj/MOD/MiniCPM-V-2', tokenizer=None, revision=None, code_revision=None, tokenizer_revision=None, tokenizer_mode='auto', trust_remote_code=True, download_dir=None, load_format='auto', dtype='auto', kv_cache_dtype='auto', max_model_len=None, worker_use_ray=False, pipeline_parallel_size=1, tensor_parallel_size=1, max_parallel_loading_workers=None, ray_workers_use_nsight=False, block_size=16, enable_prefix_caching=False, use_v2_block_manager=False, num_lookahead_slots=0, seed=0, swap_space=4, gpu_memory_utilization=0.45, forced_num_gpu_blocks=None, max_num_batched_tokens=None, max_num_seqs=256, max_logprobs=5, disable_log_stats=False, quantization=None, enforce_eager=False, max_context_len_to_capture=8192, disable_custom_all_reduce=False, tokenizer_pool_size=0, tokenizer_pool_type='ray', tokenizer_pool_extra_config=None, enable_lora=False, max_loras=1, max_lora_rank=16, lora_extra_vocab_size=256, lora_dtype='auto', max_cpu_loras=None, device='auto', image_input_type=None, image_token_id=None, image_input_shape=None, image_feature_size=None, scheduler_delay_factor=0.0, enable_chunked_prefill=False, engine_use_ray=False, disable_log_requests=False, max_log_len=None)
Traceback (most recent call last):
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 157, in
engine = AsyncLLMEngine.from_engine_args(
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 331, in from_engine_args
engine_configs = engine_args.create_engine_configs()
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/vllm/engine/arg_utils.py", line 390, in create_engine_configs
model_config = ModelConfig(
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/vllm/config.py", line 121, in init
self.hf_config = get_config(self.model, trust_remote_code, revision,
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/vllm/transformers_utils/config.py", line 22, in get_config
config = AutoConfig.from_pretrained(
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 937, in from_pretrained
config_class = get_class_from_dynamic_module(
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 501, in get_class_from_dynamic_module
return get_class_in_module(class_name, final_module)
File "/home/tskj/miniconda3/envs/infer/lib/python3.10/site-packages/transformers/dynamic_module_utils.py", line 201, in get_class_in_module
module = importlib.machinery.SourceFileLoader(name, module_path).load_module()
File "", line 548, in _check_name_wrapper
File "", line 1063, in load_module
File "", line 888, in load_module
File "", line 290, in _load_module_shim
File "", line 719, in _load
File "", line 688, in _load_unlocked
File "", line 879, in exec_module
File "", line 1017, in get_code
File "", line 947, in source_to_code
File "", line 241, in _call_with_frames_removed
File "/home/tskj/.cache/huggingface/modules/transformers_modules/MiniCPM-V-2/configuration_minicpm.py", line 37
<style>.blob-line-num::before {
^
SyntaxError: invalid decimal literal
The model files were downloaded from Hugging Face. How should this be resolved?
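
For reference: the SyntaxError above shows that configuration_minicpm.py in the transformers module cache contains web-page HTML (<style>.blob-line-num::before {) rather than Python, which usually means the file was saved from a file-viewer page instead of the raw file. A minimal sketch of re-downloading the repository and sanity-checking that module, assuming the weights come from the openbmb/MiniCPM-V-2 repo and that huggingface_hub is installed:

```python
# Minimal sketch (assumptions: the upstream repo is openbmb/MiniCPM-V-2 and
# huggingface_hub is installed): re-fetch the raw files so none of them is an
# HTML page, then sanity-check the dynamic config module.
import shutil
from huggingface_hub import snapshot_download

local_dir = "/home/tskj/MOD/MiniCPM-V-2"  # model path used in the command above
snapshot_download(repo_id="openbmb/MiniCPM-V-2", local_dir=local_dir)

# transformers copies remote-code files into this cache (see the traceback),
# so remove the stale copy as well before retrying.
shutil.rmtree(
    "/home/tskj/.cache/huggingface/modules/transformers_modules/MiniCPM-V-2",
    ignore_errors=True,
)

# The config module should start with Python code, not an HTML page.
head = open(f"{local_dir}/configuration_minicpm.py").read(200)
assert "<style>" not in head and "<html" not in head.lower()
```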

@ye7love7 ye7love7 changed the title "vllm 0.4.1post error" to "vllm 0.4.1post error: Model architectures ['MiniCPMV'] are not supported for now" May 9, 2024

ye7love7 commented May 9, 2024

I updated vLLM to 0.4.2 and now get this error:
[rank0]: raise ValueError(
[rank0]: ValueError: Model architectures ['MiniCPMV'] are not supported for now. Supported architectures: ['AquilaModel', 'AquilaForCausalLM', 'BaiChuanForCausalLM', 'BaichuanForCausalLM', 'BloomForCausalLM', 'ChatGLMModel', 'ChatGLMForConditionalGeneration', 'CohereForCausalLM', 'DbrxForCausalLM', 'DeciLMForCausalLM', 'DeepseekForCausalLM', 'FalconForCausalLM', 'GemmaForCausalLM', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTJForCausalLM', 'GPTNeoXForCausalLM', 'InternLMForCausalLM', 'InternLM2ForCausalLM', 'JAISLMHeadModel', 'LlamaForCausalLM', 'LlavaForConditionalGeneration', 'LLaMAForCausalLM', 'MistralForCausalLM', 'MixtralForCausalLM', 'QuantMixtralForCausalLM', 'MptForCausalLM', 'MPTForCausalLM', 'MiniCPMForCausalLM', 'OlmoForCausalLM', 'OPTForCausalLM', 'OrionForCausalLM', 'PhiForCausalLM', 'Phi3ForCausalLM', 'QWenLMHeadModel', 'Qwen2ForCausalLM', 'Qwen2MoeForCausalLM', 'RWForCausalLM', 'StableLMEpochForCausalLM', 'StableLmForCausalLM', 'Starcoder2ForCausalLM', 'XverseForCausalLM']
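
As a quick check of what an installed vLLM build actually registers, here is a minimal sketch (assuming a vLLM 0.4.x build that exposes ModelRegistry under vllm.model_executor.models, as the stock wheels of that series do):

```python
# Minimal sketch: list the architectures registered by the installed vLLM
# build and check whether 'MiniCPMV' is among them.
from vllm.model_executor.models import ModelRegistry

archs = ModelRegistry.get_supported_archs()
print("MiniCPMV supported:", "MiniCPMV" in archs)
print(sorted(archs))
```

If 'MiniCPMV' is missing, the stock 0.4.2 wheel simply does not ship the model, and a patched source build is needed (see below).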

@ye7love7 ye7love7 changed the title "vllm 0.4.1post error: Model architectures ['MiniCPMV'] are not supported for now" to "vllm 0.4.2 error: Model architectures ['MiniCPMV'] are not supported for now" May 9, 2024

ye7love7 commented May 9, 2024

I looked at the source code and upstream vLLM does not support this model yet. I then tried the method from the README and got the following error:
subprocess.CalledProcessError: Command '['cmake', '/home/tskj/project/vllm', '-G', 'Ninja', '-DCMAKE_BUILD_TYPE=RelWithDebInfo', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/tmp/tmpf3n8hwcm.build-lib/vllm', '-DCMAKE_ARCHIVE_OUTPUT_DIRECTORY=/tmp/tmpnhuiosjo.build-temp', '-DVLLM_TARGET_DEVICE=cuda', '-DVLLM_PYTHON_EXECUTABLE=/home/tskj/miniconda3/envs/py310/bin/python', '-DNVCC_THREADS=1', '-DCMAKE_JOB_POOL_COMPILE:STRING=compile', '-DCMAKE_JOB_POOLS:STRING=compile=56']' returned non-zero exit status 1.

@jeejeelee

You can refer to vllm-project/vllm#4087 and pull the related branch for testing.

@Cuiunbo Cuiunbo closed this as completed May 23, 2024
@Cuiunbo Cuiunbo added the documentation and feature labels May 23, 2024