
[Bug] Deploying S-LoRA fails with ValueError: not enough values to unpack (expected 2, got 1) #1030

Closed · 2 tasks done
lzq603 opened this issue Jan 24, 2024 · 14 comments · Fixed by #1042
lzq603 commented Jan 24, 2024

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.

Describe the bug

I fine-tuned Baichuan2-13B-Chat with the project https://github.com/edw008/LLaMA-Efficient-Tuning and obtained a LoRA weight. Deploying it with this project fails with the error below. How should I deploy it?
The command I used:
lmdeploy chat torch --tp 2 --session-len 1024 --adapters ~/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat

Reproduction

lmdeploy chat torch --tp 2 --session-len 1024 --adapters ~/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat

Environment

- OS: Ubuntu 20.04
- GPU: RTX 4090 (24GB) * 3
- Python: 3.9.18
- PyTorch: 2.1.2

Error traceback

(aichat) root@autodl-container-92f24ab122-1ef12511:~# lmdeploy chat torch --tp 2 --session-len 1024 --adapters ~/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
  warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
Traceback (most recent call last):
  File "/root/miniconda3/envs/aichat/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/cli/entrypoint.py", line 15, in run
    args = parser.parse_args()
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1825, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1858, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 2049, in _parse_known_args
    positionals_end_index = consume_positionals(start_index)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 2026, in consume_positionals
    take_action(action, args)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1214, in __call__
    subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1858, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 2049, in _parse_known_args
    positionals_end_index = consume_positionals(start_index)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 2026, in consume_positionals
    take_action(action, args)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1214, in __call__
    subnamespace, arg_strings = parser.parse_known_args(arg_strings, None)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1858, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 2067, in _parse_known_args
    start_index = consume_optional(start_index)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 2007, in consume_optional
    take_action(action, args, option_string)
  File "/root/miniconda3/envs/aichat/lib/python3.9/argparse.py", line 1935, in take_action
    action(self, namespace, argument_values, option_string)
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/mmengine/config/config.py", line 1833, in __call__
    key, val = kv.split('=', maxsplit=1)
ValueError: not enough values to unpack (expected 2, got 1)
grimoire (Collaborator) commented Jan 25, 2024

The adapters input should be a key-value pair:
--adapters <adapter_name>=<adapter_path>
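
For example, using the paths from this issue (the adapter name "mylora" is arbitrary, chosen only for illustration):

lmdeploy chat torch --tp 2 --session-len 1024 --adapters mylora=~/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat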

lzq603 (Author) commented Jan 25, 2024

After updating the code, a new error appears; I get the same error whether --adapters is given a single path or a key-value pair:
(aichat) root@autodl-container-646c448112-e709f832:~# lmdeploy chat torch --tp=2 --session-len=1024 --adapters=/root/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:06<00:00, 1.14s/it]
01/25 15:07:27 - lmdeploy - INFO - load adapter from "/root/autodl-fs/lora/2024-01-09-11-53-38_all".
You shouldn't move a model when it is dispatched on multiple devices.
01/25 15:07:27 - lmdeploy - INFO - distribute model parameters.
01/25 15:07:28 - lmdeploy - ERROR - /root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py - _start_tp_process - 853 - Rank[1] failed.
01/25 15:07:28 - lmdeploy - ERROR - /root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py - _start_tp_process - 853 - Rank[0] failed.
Traceback (most recent call last):
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 850, in _start_tp_process
    func(rank, *args, **kwargs)
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 789, in _tp_model_loop
    patched_model, cache_engine = _tp_build_model(
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 667, in _tp_build_model
    resp.raise_error(RuntimeError('failed to init model.'))
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 567, in raise_error
    raise err
KeyError: 'parameter name can\'t contain "."'
Traceback (most recent call last):
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 850, in _start_tp_process
    func(rank, *args, **kwargs)
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 789, in _tp_model_loop
    patched_model, cache_engine = _tp_build_model(
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 667, in _tp_build_model
    resp.raise_error(RuntimeError('failed to init model.'))
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 567, in raise_error
    raise err
KeyError: 'parameter name can\'t contain "."'
01/25 15:07:28 - lmdeploy - ERROR - /root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py - patch_model_tp - 962 - Init tp model failed with error: [KeyError('parameter name can\'t contain "."'), KeyError('parameter name can\'t contain "."')]
Traceback (most recent call last):
  File "/root/miniconda3/envs/aichat/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/cli/entrypoint.py", line 18, in run
    args.run(args)
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/cli/chat.py", line 80, in torch
    run_chat(args.model_path,
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/chat.py", line 66, in run_chat
    tm_model = Engine.from_pretrained(model_path,
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/engine.py", line 173, in from_pretrained
    return cls(model_path=pretrained_model_name_or_path,
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/engine.py", line 115, in __init__
    self.model_agent = AutoModelAgent.from_pretrained(
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 427, in from_pretrained
    return build_model_agent(pretrained_model_name_or_path,
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 1011, in build_model_agent
    model_agent = TPModelAgent(model_path,
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 904, in __init__
    self.patch_model_tp(model_path,
  File "/root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py", line 963, in patch_model_tp
    raise next(err for err in resp.error if err is not None)
KeyError: 'parameter name can\'t contain "."'

grimoire (Collaborator)

Please provide a dummy adapter so I can debug it.

lzq603 (Author) commented Jan 25, 2024

This is a self-cognition adapter for Baichuan2-13B-Chat, trained on the official LLaMA-Efficient-Tuning dataset. Loading it produces the error above.
zwrz.zip

[screenshot]

grimoire (Collaborator)

#1042
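
For context: the KeyError is raised by PyTorch itself, whose nn.Module.register_parameter rejects any parameter name containing ".", while adapter weight names (e.g. "lora_A.weight") contain dots, which is what the adapter loading tripped over here. A minimal reproduction, independent of lmdeploy:

import torch
import torch.nn as nn

m = nn.Module()
try:
    # PyTorch forbids '.' in registered parameter names
    m.register_parameter('lora_A.weight', nn.Parameter(torch.zeros(1)))
except KeyError as e:
    print(e)  # parameter name can't contain "."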

lzq603 (Author) commented Jan 25, 2024

It runs now, but why does the adapter have no effect? The inference output doesn't match the screenshot above.
[screenshot]

grimoire (Collaborator)

Updated; it should be correct now.
Sorry, I forgot to apply scaling in the LoRA linear layer.
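
For reference, the standard LoRA forward scales the low-rank branch by alpha/r; here is a minimal sketch of that scaling (illustrative only, not lmdeploy's actual implementation — the class and names are made up):

import torch.nn as nn

class LoRALinear(nn.Module):
    """A linear layer with a LoRA branch; rank and alpha values are illustrative."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        self.scaling = alpha / r  # the factor that was missing

    def forward(self, x):
        # base output plus the scaled low-rank update
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

Without self.scaling, the adapter's contribution is scaled wrongly by a factor of alpha/r, which can make it look as if the adapter has little or no effect.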

grimoire linked a pull request Jan 25, 2024 that will close this issue
lzq603 (Author) commented Jan 25, 2024

(aichat) root@autodl-container-9d254dac50-f7306f7d:~# lmdeploy chat torch --tp 2 --session-len 1024 --adapters /root/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
usage: lmdeploy chat torch [-h] [--model-name MODEL_NAME] [--tp TP] [--session-len SESSION_LEN] [--adapters [ADAPTERS ...]] [--trust-remote-code] model_path
lmdeploy chat torch: error: the following arguments are required: model_path
(aichat) root@autodl-container-9d254dac50-f7306f7d:~# lmdeploy chat torch --tp=2 --session-len=1024 --adapters=/root/autodl-fs/lora/2024-01-09-11-53-38_all /root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
/root/miniconda3/envs/aichat/lib/python3.9/site-packages/fuzzywuzzy/fuzz.py:11: UserWarning: Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning
warnings.warn('Using slow pure-python SequenceMatcher. Install python-Levenshtein to remove this warning')
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:06<00:00, 1.16s/it]
01/25 22:13:36 - lmdeploy - INFO - load adapter from "/root/autodl-fs/lora/2024-01-09-11-53-38_all".
01/25 22:13:37 - lmdeploy - INFO - distribute model parameters.
01/25 22:13:43 - lmdeploy - ERROR - /root/miniconda3/envs/aichat/lib/python3.9/site-packages/lmdeploy/pytorch/engine/model_agent.py - _tp_build_model - 684 - rank[1] failed with error: CUDA out of memory. Tried to allocate 1.20 GiB. GPU 1 has a total capacty of 23.65 GiB of which 490.50 MiB is free. Process 104192 has 9.20 GiB memory in use. Process 104193 has 13.96 GiB memory in use. Of the allocated memory 13.00 GiB is allocated by PyTorch, and 262.96 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
01/25 22:13:43 - lmdeploy - WARNING - infered block size: 8

grimoire (Collaborator)

Are you using the latest branch? Model memory usage should be ~14.5G after commit 5ae19a1.
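
(Rough arithmetic, assuming fp16 weights: 13B parameters × 2 bytes ≈ 24 GiB of weights in total, so with --tp 2 each rank holds roughly 12 GiB, consistent with ~14.5G per GPU once runtime buffers are added. Most of the remaining per-GPU memory reported by nvidia-smi is KV cache that the engine allocates up front.)

params = 13e9                       # Baichuan2-13B, approximate parameter count
weight_gib = params * 2 / 1024**3   # fp16: 2 bytes per parameter -> ~24.2 GiB
per_rank_gib = weight_gib / 2       # --tp 2 splits the weights across 2 GPUs
print(f'{weight_gib:.1f} GiB total, {per_rank_gib:.1f} GiB per rank')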

lzq603 (Author) commented Jan 26, 2024

Baichuan2-13B-Chat is 26G. After re-downloading the branch code, it runs successfully on two 4090s, using 46.5G of GPU memory in total:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090        On  | 00000000:98:00.0 Off |                  Off |
|  0%   35C    P8              20W / 450W |  23494MiB / 24564MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        On  | 00000000:99:00.0 Off |                  Off |
|  0%   36C    P2              76W / 450W |  23122MiB / 24564MiB |    100%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+

lzq603 (Author) commented Jan 26, 2024

One more question: how do I configure multiple adapters?

grimoire (Collaborator)

from lmdeploy.messages import PytorchEngineConfig
from lmdeploy.pytorch.engine.engine import Engine

# map adapter names to adapter paths
adapters = {'adapter0': '/path/to/adapter0', 'adapter1': '/path/to/adapter1'}
engine_config = PytorchEngineConfig(adapters=adapters)
engine = Engine.from_pretrained(model_path,
                                engine_config=engine_config,
                                trust_remote_code=True)
generator = engine.create_instance()

# select the adapter per request via `adapter_name`
for outputs in generator.stream_infer(session_id=session_id,
                                      input_ids=input_ids,
                                      gen_config=gen_config,
                                      adapter_name='adapter0'):
    pass  # read outputs here

# close session and release caches
generator.end(session_id)
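
The snippet above assumes model_path, session_id, input_ids, and gen_config are already defined by the caller; a hypothetical setup (EngineGenerationConfig and its fields should be verified against your lmdeploy version):

from lmdeploy.messages import EngineGenerationConfig
from transformers import AutoTokenizer

model_path = '/root/autodl-tmp/baichuan-inc/Baichuan2-13B-Chat'
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
input_ids = tokenizer.encode('你好')                     # prompt token ids
gen_config = EngineGenerationConfig(max_new_tokens=512)
session_id = 0                                           # any unique int per conversation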

amulil (Contributor) commented Jan 26, 2024


@grimoire does the RESTful API support configuring multiple adapters?

grimoire (Collaborator)

@amulil

Coming soon.
