Commit

Merge branch 'main' into release/1.5
Jintao-Huang committed Jan 31, 2024
2 parents 3bd24eb + 9c29715 commit 5eb8694
Showing 7 changed files with 19 additions and 6 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -218,7 +218,7 @@ app_ui_main(infer_args)
   - [zephyr](https://github.com/huggingface/alignment-handbook) series: zephyr-7b-beta-chat.
   - [ziya](https://github.com/IDEA-CCNL/Fengshenbang-LM) series: ziya2-13b, ziya2-13b-chat.
   - [skywork](https://github.com/SkyworkAI/Skywork) series: skywork-13b, skywork-13b-chat.
-  - other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat), [openbmb-minicpm-2b](https://github.com/OpenBMB/CPM-Bee).
+  - other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat), [openbmb-minicpm-2b-chat](https://github.com/OpenBMB/mlc-MiniCPM).
 - Financial:
   - [tongyi-finance](https://github.com/QwenLM/Qwen) series: tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4.
 - Coding:
@@ -466,6 +466,7 @@ You can contact and communicate with us by joining our WeChat Group:
 <img src="asset/wechat.png" width="250" style="display: inline-block;">
 </p>
 
+
 ## Star History
 
 [![Star History Chart](https://api.star-history.com/svg?repos=modelscope/swift&type=Date)](https://star-history.com/#modelscope/swift&Date)
2 changes: 1 addition & 1 deletion README_CN.md
@@ -218,7 +218,7 @@ app_ui_main(infer_args)
   - [zephyr](https://github.com/huggingface/alignment-handbook) series: zephyr-7b-beta-chat.
   - [ziya](https://github.com/IDEA-CCNL/Fengshenbang-LM) series: ziya2-13b, ziya2-13b-chat.
   - [skywork](https://github.com/SkyworkAI/Skywork) series: skywork-13b, skywork-13b-chat.
-  - other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat), [openbmb-minicpm-2b](https://github.com/OpenBMB/CPM-Bee).
+  - other: [polylm-13b](https://github.com/DAMO-NLP-MT/PolyLM), [seqgpt-560m](https://github.com/Alibaba-NLP/SeqGPT), [sus-34b-chat](https://github.com/SUSTech-IDEA/SUS-Chat), [openbmb-minicpm-2b-chat](https://github.com/OpenBMB/mlc-MiniCPM).
 - Financial:
   - [tongyi-finance](https://github.com/QwenLM/Qwen) series: tongyi-finance-14b, tongyi-finance-14b-chat, tongyi-finance-14b-chat-int4.
 - Coding:
2 changes: 1 addition & 1 deletion docs/source/LLM/支持的模型和数据集.md
@@ -124,7 +124,7 @@
 |zephyr-7b-beta-chat|[modelscope/zephyr-7b-beta](https://modelscope.cn/models/modelscope/zephyr-7b-beta/summary)|q_proj, k_proj, v_proj|zephyr|&#x2714;|&#x2714;|transformers>=4.34|
 |polylm-13b|[damo/nlp_polylm_13b_text_generation](https://modelscope.cn/models/damo/nlp_polylm_13b_text_generation/summary)|c_attn|default-generation|&#x2718;|&#x2718;||
 |seqgpt-560m|[damo/nlp_seqgpt-560m](https://modelscope.cn/models/damo/nlp_seqgpt-560m/summary)|query_key_value|default-generation|&#x2718;|&#x2714;||
-|openbmb-minicpm-2b|[OpenBMB/miniCPM-bf16](https://modelscope.cn/models/OpenBMB/miniCPM-bf16/summary)|q_proj, k_proj, v_proj|openbmb|&#x2714;|&#x2718;||
+|openbmb-minicpm-2b-chat|[OpenBMB/miniCPM-bf16](https://modelscope.cn/models/OpenBMB/miniCPM-bf16/summary)|q_proj, k_proj, v_proj|openbmb|&#x2714;|&#x2718;||
 |sus-34b-chat|[SUSTC/SUS-Chat-34B](https://modelscope.cn/models/SUSTC/SUS-Chat-34B/summary)|q_proj, k_proj, v_proj|sus|&#x2714;|&#x2714;||
 |tongyi-finance-14b|[TongyiFinance/Tongyi-Finance-14B](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B/summary)|c_attn|default-generation|&#x2714;|&#x2714;||
 |tongyi-finance-14b-chat|[TongyiFinance/Tongyi-Finance-14B-Chat](https://modelscope.cn/models/TongyiFinance/Tongyi-Finance-14B-Chat/summary)|c_attn|qwen|&#x2714;|&#x2714;||
4 changes: 2 additions & 2 deletions swift/llm/utils/model.py
@@ -175,7 +175,7 @@ class ModelType:
     # other
     polylm_13b = 'polylm-13b'
     seqgpt_560m = 'seqgpt-560m'
-    openbmb_minicpm_2b = 'openbmb-minicpm-2b'
+    openbmb_minicpm_2b_chat = 'openbmb-minicpm-2b-chat'
     sus_34b_chat = 'sus-34b-chat'
 
     # domain-specific
@@ -1851,7 +1851,7 @@ def get_model_tokenizer_yi_vl(model_dir: str,
 
 
 @register_model(
-    ModelType.openbmb_minicpm_2b,
+    ModelType.openbmb_minicpm_2b_chat,
     'OpenBMB/miniCPM-bf16',
     LoRATM.llama2,
     TemplateType.openbmb,
2 changes: 1 addition & 1 deletion swift/llm/utils/utils.py
@@ -365,7 +365,7 @@ def find_all_linear_for_lora(model: Module, quantization_bit: int,
     lora_module_names = set()
     for name, module in model.named_modules():
         if isinstance(module, linear_cls):
-            module_name = name.split('.')[-1]
+            module_name = '.'.join(name.split('.')[-2:])
             if head_module_name not in module_name:
                 lora_module_names.add(module_name)
     return list(lora_module_names)
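
The effect of switching from the last path component to the last two components can be sketched in plain Python. This is an illustration only: the module names below are hypothetical stand-ins for what `model.named_modules()` might yield, `collect` is a helper invented for this sketch, and `'lm_head'` is an assumed value for `head_module_name`.

```python
def leaf_name(name: str) -> str:
    # Old behaviour: keep only the last dotted component.
    return name.split('.')[-1]


def parent_qualified_name(name: str) -> str:
    # New behaviour: keep the last two dotted components.
    return '.'.join(name.split('.')[-2:])


def collect(names, key, head_module_name='lm_head'):
    # Mirrors the loop in the hunk above, minus the isinstance check:
    # every name here is assumed to belong to a linear layer.
    found = set()
    for name in names:
        module_name = key(name)
        if head_module_name not in module_name:
            found.add(module_name)
    return sorted(found)


# Hypothetical module paths, not taken from any specific model.
names = [
    'model.layers.0.self_attn.q_proj',
    'model.layers.0.self_attn.o_proj',
    'model.layers.0.mlp.down_proj',
    'model.layers.1.self_attn.q_proj',
    'lm_head',
]

old = collect(names, leaf_name)
new = collect(names, parent_qualified_name)
print(old)  # ['down_proj', 'o_proj', 'q_proj']
print(new)  # ['mlp.down_proj', 'self_attn.o_proj', 'self_attn.q_proj']
```

With two components, each target name also carries its parent module, so same-named layers under different parents end up as distinct entries.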
6 changes: 6 additions & 0 deletions swift/tuners/lora_layers.py
@@ -399,6 +399,9 @@ def _create_and_replace(
     )
     self._convert_dtype(target, lora_config.lora_dtype)
 elif isinstance(target, linear_types):
+    if target.__class__.__name__ == 'NonDynamicallyQuantizableLinear':
+        # Fix issue: https://github.com/modelscope/swift/issues/342
+        return
     target.update_layer(
         adapter_name,
         r,
@@ -496,6 +499,9 @@ def _create_new_module(lora_config, adapter_name, target, **kwargs):
         enable_lora=lora_config.enable_lora,
         **kwargs)
 elif isinstance(target_base_layer, torch.nn.Linear):
+    if target_base_layer.__class__.__name__ == 'NonDynamicallyQuantizableLinear':
+        # Fix issue: https://github.com/modelscope/swift/issues/342
+        return None
     if kwargs['fan_in_fan_out']:
         warnings.warn(
             'fan_in_fan_out is set to True but the target module is `torch.nn.Linear`. '
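
Both guards in this file compare the class name because an `isinstance` check alone also matches subclasses. A minimal sketch of that pattern follows, using stand-in classes rather than the real PyTorch ones so it runs without `torch`; in PyTorch, `NonDynamicallyQuantizableLinear` is a subclass of `torch.nn.Linear`. The helper `should_wrap` is hypothetical and not part of swift.

```python
class Linear:
    """Stand-in for torch.nn.Linear (assumption: used only for illustration)."""


class NonDynamicallyQuantizableLinear(Linear):
    """Stand-in for the PyTorch subclass of the same name."""


def should_wrap(module) -> bool:
    # isinstance() is true for subclasses too, so the subclass must be
    # filtered out explicitly, here by class name as in the hunks above.
    if not isinstance(module, Linear):
        return False
    if module.__class__.__name__ == 'NonDynamicallyQuantizableLinear':
        return False
    return True


print(should_wrap(Linear()))                           # True
print(should_wrap(NonDynamicallyQuantizableLinear()))  # False
print(should_wrap(object()))                           # False
```

Comparing by name avoids importing the subclass, at the cost of also skipping any unrelated class that happens to share that name.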
6 changes: 6 additions & 0 deletions tests/llm/test_run.py
@@ -50,6 +50,7 @@ def test_basic(self):
 sft_args = SftArguments(
     model_type=model_type,
     template_type='AUTO',
+    lora_target_modules='ALL',
     quantization_bit=quantization_bit,
     batch_size=2,
     eval_steps=5,
@@ -138,6 +139,7 @@ def test_vl_audio(self):
 template_type='AUTO',
 eval_steps=5,
 check_dataset_strategy='warning',
+lora_target_modules='ALL',
 train_dataset_sample=200,
 dataset=[dataset],
 output_dir=output_dir,
@@ -245,6 +247,7 @@ def test_cogagent_instruct(self):
         model_type=ModelType.cogagent_18b_instruct,
         dataset=DatasetName.coco_mini_en_2,
         train_dataset_sample=100,
+        lora_target_modules='ALL',
         eval_steps=5,
         quantization_bit=4))
 best_model_checkpoint = output['best_model_checkpoint']
@@ -263,6 +266,7 @@ def test_xcomposer_chat(self):
     SftArguments(
         model_type=ModelType.internlm_xcomposer2_7b_chat,
         dataset=DatasetName.coco_mini_en,
+        lora_target_modules='DEFAULT',
         train_dataset_sample=100,
         eval_steps=5))
 best_model_checkpoint = output['best_model_checkpoint']
@@ -282,6 +286,7 @@ def test_yi_vl_6b_chat(self):
 SftArguments(
     model_type=ModelType.yi_vl_6b_chat,
     # dataset=DatasetName.capcha_images,
+    lora_target_modules='ALL',
     train_dataset_sample=100,
     eval_steps=5,
     custom_train_dataset_path=[
@@ -303,6 +308,7 @@ def test_dpo(self):
 output = dpo_main(
     DPOArguments(
         model_type=ModelType.qwen_1_8b_chat,
+        sft_type='full',
         dataset=DatasetName.hh_rlhf,
         train_dataset_sample=100,
         eval_steps=5))