support swift app (#2792)

modelscope · Dec 29, 2024 · commit 1b132f6
1 parent d811972

Showing 58 changed files with 527 additions and 251 deletions.
diff --git a/README.md b/README.md
@@ -12,7 +12,7 @@
 </p>
 
 <p align="center">
-<img src="https://img.shields.io/badge/python-%E2%89%A53.8-5be.svg">
+<img src="https://img.shields.io/badge/python-3.10-5be.svg">
 <img src="https://img.shields.io/badge/pytorch-%E2%89%A52.0-orange.svg">
 <a href="https://github.com/modelscope/modelscope/"><img src="https://img.shields.io/badge/modelscope-%E2%89%A51.19-5D91D4.svg"></a>
 <a href="https://pypi.org/project/ms-swift/"><img src="https://badge.fury.io/py/ms-swift.svg"></a>
@@ -279,6 +279,15 @@ CUDA_VISIBLE_DEVICES=0 swift infer \
     --max_new_tokens 2048
 ```
 
+### Interface Inference
+```shell
+CUDA_VISIBLE_DEVICES=0 swift app \
+    --model Qwen/Qwen2.5-7B-Instruct \
+    --stream true \
+    --infer_backend pt \
+    --max_new_tokens 2048
+```
+
 ### Deployment
 ```shell
 CUDA_VISIBLE_DEVICES=0 swift deploy \

diff --git a/README_CN.md b/README_CN.md
@@ -13,7 +13,7 @@
 
 
 <p align="center">
-<img src="https://img.shields.io/badge/python-%E2%89%A53.8-5be.svg">
+<img src="https://img.shields.io/badge/python-3.10-5be.svg">
 <img src="https://img.shields.io/badge/pytorch-%E2%89%A52.0-orange.svg">
 <a href="https://github.com/modelscope/modelscope/"><img src="https://img.shields.io/badge/modelscope-%E2%89%A51.19-5D91D4.svg"></a>
 <a href="https://pypi.org/project/ms-swift/"><img src="https://badge.fury.io/py/ms-swift.svg"></a>
@@ -271,6 +271,16 @@ CUDA_VISIBLE_DEVICES=0 swift infer \
     --max_new_tokens 2048
 ```
 
+### Interface Inference
+```shell
+CUDA_VISIBLE_DEVICES=0 swift app \
+    --model Qwen/Qwen2.5-7B-Instruct \
+    --stream true \
+    --infer_backend pt \
+    --max_new_tokens 2048 \
+    --lang zh
+```
+
 ### Deployment
 ```shell
 CUDA_VISIBLE_DEVICES=0 swift deploy \

diff --git a/docs/source/GetStarted/Web-UI.md b/docs/source/GetStarted/Web-UI.md
@@ -3,16 +3,16 @@
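 Currently, SWIFT supports interface-based training and inference, with the same parameters as script training. After installing SWIFT, use the following command: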
 
 ```shell
-swift web-ui --host 0.0.0.0 --port 7860 --lang zh/en
+swift web-ui --lang zh/en
 ```
 
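 to start the interface for training and inference.
 
-Additionally, the web-ui now supports app-ui mode (i.e., Space deployment):
+Additionally, ms-swift supports interface inference mode (i.e., Space deployment):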
 
 ```shell
-swift web-ui --model '<model>' --studio_title My-Awesome-Space
+swift app --model '<model>' --studio_title My-Awesome-Space --stream true
 # or
-swift web-ui --model '<model>' --adapters '<adapter>' --studio_title My-Awesome-Space
+swift app --model '<model>' --adapters '<adapter>' --stream true
 ```
 This launches an application with only the inference page; the application deploys the model at startup and serves it for subsequent use.
diff --git a/docs/source/Instruction/ReleaseNote3.0.md b/docs/source/Instruction/ReleaseNote3.0.md
@@ -76,7 +76,7 @@
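 2. Removed the 2.x examples directory entirely and added new examples organized by functionality
 3. The dataset format is now fully compatible with the messages format; the query/response/history format is no longer supported
 4. The storage directory for merge_lora can now be specified with `--output_dir`; merge_lora and quantization cannot be executed in the same command, so at least two commands are required
-5. The app-ui interface has been removed and replaced by `swift web-ui --model xxx`, and multimodal interface deployment is supported
+5. Use `swift app --model xxx` to launch the app-ui interface, which supports multimodal interface inference
 6. Removed the AIGC dependencies along with the corresponding examples and training code
 
 ## To Do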

diff --git a/docs/source/Instruction/命令行参数.md b/docs/source/Instruction/命令行参数.md
@@ -99,6 +99,7 @@
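 - remove_unused_columns: Default is False
 - logging_first_step: Whether to log the first step, default is True
 - logging_steps: Logging interval, default is 5
+- average_tokens_across_devices: Whether to average the token count across devices. If set to True, `num_tokens_in_batch` is synchronized with all_reduce for accurate loss computation. Default is None, which resolves to True for distributed training and False otherwise
 - metric_for_best_model: Default is None. When `predict_with_generate` is set to False, it is 'loss'; otherwise, it is 'rouge-l'
 - greater_is_better: Default is None. When `metric_for_best_model` contains 'loss', it is set to False; otherwise, it is set to True.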
 
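@@ -328,6 +329,7 @@ RLHF arguments inherit from the [training arguments](#训练参数)
 
 - 🔥infer_backend: Inference backend; supports the three inference frameworks 'pt', 'vllm', and 'lmdeploy'. Default is 'pt'
 - 🔥max_batch_size: Batch size for the pt backend, default is 1
+- ddp_backend: The distributed backend for multi-GPU inference with the pt backend, default is None. Multi-GPU inference examples can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/infer/pt)
 - result_path: Path for storing inference results (jsonl). Default is None, in which case results are saved in the checkpoint directory or the './result' directory
 - val_dataset_sample: Number of samples drawn from the inference dataset, default is None
 
@@ -346,6 +348,22 @@ RLHF arguments inherit from the [training arguments](#训练参数)
 - log_interval: Interval for printing tokens/s statistics, default is 20 seconds. Set to -1 to disable printing
 - max_logprobs: Maximum number of logprobs to return, default is 20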
 
+
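+### Web-UI Arguments
+- server_name: Host for the web-ui, default is '0.0.0.0'
+- server_port: Port for the web-ui, default is 7860
+- share: Default is False
+- lang: Language of the web-ui, either 'zh' or 'en'. Default is 'zh'
+
+
+### App Arguments
+
+App arguments inherit from the [deployment arguments](#部署参数) and the [Web-UI arguments](#Web-UI参数)
+- base_url: Base URL of the model deployment, for example `http://localhost:8000/v1`. Default is `None`
+- studio_title: Title of the studio. Default is None, in which case the model name is used
+- is_multimodal: Whether to launch the multimodal version of the app. Default is None, in which case it is determined automatically from the model; if it cannot be determined, it is set to False
+- lang: Overrides the Web-UI argument, default is 'en'
+
 ### Evaluation Arguments
 
 Evaluation arguments inherit from the [deployment arguments](#部署参数)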
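@@ -356,7 +374,7 @@ RLHF arguments inherit from the [training arguments](#训练参数)
 - temperature: Default is 0.
 - verbose: This parameter is passed to DeployArguments during local evaluation, default is `False`
 - max_batch_size: Maximum batch size; default is 256 for text evaluation and 16 for multimodal evaluation
-- 🔥eval_url: Evaluation URL. Default is None, in which case a local deployment is used for evaluation
+- 🔥eval_url: Evaluation URL, for example `http://localhost:8000/v1`. Default is None, in which case a local deployment is used for evaluation. Examples can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/eval/eval_url)
 
 ### Export Arguments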
 

diff --git a/docs/source/Instruction/推理和部署.md b/docs/source/Instruction/推理和部署.md
@@ -4,7 +4,7 @@ SWIFT supports inference and deployment through the command line, Python code, and a web interface:
 - Use `engine.infer` or `engine.infer_async` for Python-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py).
 - Use `swift infer` for command-line inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh).
 - Use `swift deploy` for service deployment, then run inference through the OpenAI API or `client.infer`. See [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server) for the server and [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client) for the client.
-- Deploy the model with `swift web-ui` for web-based inference. See [here](../GetStarted/界面使用.md).
+- Deploy the model with `swift app` for web-based inference. See [here](../GetStarted/界面使用.md).
 
 
 ## Command Line Inference

diff --git a/docs/source_en/GetStarted/Web-UI.md b/docs/source_en/GetStarted/Web-UI.md
@@ -3,16 +3,16 @@
 Currently, SWIFT supports interface-based training and inference, with parameter support similar to script training. After installing SWIFT, use the following command:
 
 ```shell
-swift web-ui --host 0.0.0.0 --port 7860 --lang zh/en
+swift web-ui --lang zh/en
 ```
 
 to start the interface for training and inference.
 
-Additionally, the web-ui now supports app-ui mode (i.e., Space deployment):
+Additionally, ms-swift supports interface inference mode (i.e., Space deployment):
 
 ```shell
-swift web-ui --model '<model>' --studio_title My-Awesome-Space
+swift app --model '<model>' --studio_title My-Awesome-Space --stream true
 # or
-swift web-ui --model '<model>' --adapters '<adapter>' --studio_title My-Awesome-Space
+swift app --model '<model>' --adapters '<adapter>' --studio_title My-Awesome-Space --stream true
 ```
 This will launch an application with only the inference page, which will deploy the model upon startup and provide it for subsequent use.
diff --git a/docs/source_en/Instruction/Command-line-parameters.md b/docs/source_en/Instruction/Command-line-parameters.md
@@ -100,6 +100,7 @@ This parameter list inherits from transformers `Seq2SeqTrainingArguments`, with
 - remove_unused_columns: Default is False.
 - logging_first_step: Whether to log the first step print, default is True.
 - logging_steps: Interval for logging prints, default is 5.
+- average_tokens_across_devices: Whether to average the token count across devices. If set to True, it will use all_reduce to synchronize `num_tokens_in_batch` for accurate loss computation. The default is None; set to True for distributed training, otherwise set to False.
 - metric_for_best_model: Default is None. When `predict_with_generate` is set to False, it is 'loss'; otherwise, it is 'rouge-l'.
 - greater_is_better: Default is None. When `metric_for_best_model` contains 'loss', set to False; otherwise, set to True.
 
@@ -331,6 +332,7 @@ Inference arguments include the [base arguments](#base-arguments), [merge argume
 
 - 🔥infer_backend: Inference backend, supports 'pt', 'vllm', 'lmdeploy', default is 'pt'.
 - 🔥max_batch_size: Batch size for pt backend, default is 1.
+- ddp_backend: The distributed backend for multi-gpu inference using the pt backend, default is None. Examples of multi-card inference can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/infer/pt).
 - result_path: Path to store inference results (jsonl), default is None, saved in the checkpoint directory or './result' directory.
 - val_dataset_sample: Number of samples from the inference dataset, default is None.
 
@@ -347,6 +349,21 @@ Deployment Arguments inherit from the [inference arguments](#inference-arguments
 - log_interval: Interval for printing tokens/s statistics, default is 20 seconds. If set to -1, it will not be printed.
 - max_logprobs: Maximum number of logprobs to return, default is 20.
 
+### Web-UI Arguments
+- server_name: Host for the web UI, default is '0.0.0.0'.
+- server_port: Port for the web UI, default is 7860.
+- share: Default is False.
+- lang: Language for the web UI, options are 'zh', 'en'. Default is 'zh'.
+
+
+### App Arguments
+App parameters inherit from [deployment arguments](#deployment-arguments) and [Web-UI Arguments](#web-ui-arguments).
+
+- base_url: Base URL for the model deployment, for example, `http://localhost:8000/v1`. Default is None.
+- studio_title: Title of the studio. Default is None, in which case the model name is used.
+- is_multimodal: Whether to launch the multimodal version of the app. Default is None, in which case it is determined automatically from the model; if it cannot be determined, it is set to False.
+- lang: Overrides the `lang` default of the Web-UI Arguments; default is 'en'.
+
 ### Evaluation Arguments
 
 Evaluation Arguments inherit from the [deployment arguments](#deployment-arguments).
@@ -357,7 +374,7 @@ Evaluation Arguments inherit from the [deployment arguments](#deployment-argumen
 - temperature: Default is 0.
 - verbose: This parameter is passed to DeployArguments during local evaluation, default is `False`.
 - max_batch_size: Maximum batch size, default is 256 for text evaluation, 16 for multimodal.
-- 🔥eval_url: Evaluation URL. Default is None, uses local deployment for evaluation.
+- 🔥eval_url: Evaluation URL, for example `http://localhost:8000/v1`. Default is None, uses local deployment for evaluation. You can view the examples [here](https://github.com/modelscope/ms-swift/tree/main/examples/eval/eval_url).
 
 ### Export Arguments
 

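The inheritance described in the new "App Arguments" section (App arguments reuse the Web-UI arguments but change the `lang` default from 'zh' to 'en') can be pictured with plain dataclasses. This is only an illustrative sketch: `WebUIArgs`, `AppArgs`, and `resolved_title` are hypothetical stand-ins, not the classes that ms-swift defines, and the deployment-argument parent is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WebUIArgs:
    """Hypothetical stand-in for the documented Web-UI arguments."""
    server_name: str = '0.0.0.0'
    server_port: int = 7860
    share: bool = False
    lang: str = 'zh'


@dataclass
class AppArgs(WebUIArgs):
    """Hypothetical stand-in for the documented App arguments."""
    base_url: Optional[str] = None
    studio_title: Optional[str] = None
    is_multimodal: Optional[bool] = None
    lang: str = 'en'  # redeclared to override the inherited default


def resolved_title(args: AppArgs, model_name: str) -> str:
    # The docs say studio_title falls back to the model name when unset.
    return args.studio_title if args.studio_title is not None else model_name


print(WebUIArgs().lang)  # zh
print(AppArgs().lang)    # en
print(resolved_title(AppArgs(), 'Qwen2.5-7B-Instruct'))  # Qwen2.5-7B-Instruct
```

Redeclaring the field in the subclass is one simple way to get a different default while keeping every other inherited option unchanged.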
diff --git a/docs/source_en/Instruction/Inference-and-deployment.md b/docs/source_en/Instruction/Inference-and-deployment.md
@@ -4,7 +4,7 @@ SWIFT supports inference and deployment through command line, Python code, and i
 - Use `engine.infer` or `engine.infer_async` for Python-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py) for reference.
 - Use `swift infer` for command-line-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh) for reference.
 - Use `swift deploy` for service deployment and perform inference using the OpenAI API or `client.infer`. Refer to the server guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server) and the client guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
-- Deploy the model with `swift web-ui` for web-based inference. You can check [here](../GetStarted/Interface-usage.md) for details.
+- Deploy the model with `swift app` for web-based inference. You can check [here](../GetStarted/Interface-usage.md) for details.
 
 
 ## Command Line Inference

diff --git a/docs/source_en/Instruction/ReleaseNote3.0.md b/docs/source_en/Instruction/ReleaseNote3.0.md
@@ -88,7 +88,7 @@ The parameters marked as compatible in version 2.0 have been entirely removed.
 
 4. The storage directory for merge_lora can be specified using `--output_dir`, and merge_lora and quantization cannot be executed in the same command; at least two commands are required.
 
-5. The app-ui interface has been removed, replaced by `swift web-ui --model xxx`, and multi-modal interface deployment is supported.
+5. Use `swift app --model xxx` to launch the app-ui interface, which supports multimodal interface inference.
 
 6. Removed dependencies for AIGC along with corresponding examples and training code.
 

diff --git a/examples/app/base_url/demo.py b/examples/app/base_url/demo.py
@@ -0,0 +1,13 @@
+# Copyright (c) Alibaba, Inc. and its affiliates.
+import os
+
+os.environ['CUDA_VISIBLE_DEVICES'] = '0'
+
+if __name__ == '__main__':
+    from swift.llm import AppArguments, app_main, DeployArguments, run_deploy
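+    # This is a self-contained, runnable demo: it deploys the model first, then points the app at it.
+    # In a real scenario, drop the `run_deploy` context and pass the URL of your existing service as `base_url`.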
+    with run_deploy(
+            DeployArguments(model='Qwen/Qwen2.5-1.5B-Instruct', verbose=False, log_interval=-1, infer_backend='vllm'),
+            return_url=True) as url:
+        app_main(AppArguments(model='Qwen2.5-1.5B-Instruct', base_url=url, stream=True, max_new_tokens=2048))
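For orientation, the request an app like this would send to the service behind `base_url` can be sketched with the standard library alone. The sketch assumes the deployed service exposes an OpenAI-compatible `chat/completions` route under `base_url` (the docs in this commit describe `swift deploy` as usable through the OpenAI API); `build_chat_request` is a hypothetical helper, not part of ms-swift, and it only constructs the request without sending it.

```python
import json
from urllib.request import Request


def build_chat_request(base_url: str, model: str, prompt: str,
                       max_tokens: int = 2048, stream: bool = True) -> Request:
    """Build (but do not send) a chat-completions request for an assumed OpenAI-compatible server."""
    url = base_url.rstrip('/') + '/chat/completions'
    payload = {
        'model': model,
        'messages': [{'role': 'user', 'content': prompt}],
        'max_tokens': max_tokens,
        'stream': stream,
    }
    return Request(
        url,
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


req = build_chat_request('http://127.0.0.1:8000/v1', 'Qwen2.5-1.5B-Instruct', 'Hello')
print(req.full_url)  # http://127.0.0.1:8000/v1/chat/completions
```

Passing the returned object to `urllib.request.urlopen` would send it, provided a compatible service is actually listening at that address.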
diff --git a/examples/app/base_url/demo.sh b/examples/app/base_url/demo.sh
@@ -0,0 +1,7 @@
+# You need to have a deployed model or api service first
+CUDA_VISIBLE_DEVICES=0 swift app \
+    --model '<model_name>' \
+    --base_url http://127.0.0.1:8000/v1 \
+    --stream true \
+    --max_new_tokens 2048 \
+    --lang zh
diff --git a/examples/app/llm.sh b/examples/app/llm.sh
@@ -0,0 +1,6 @@
+CUDA_VISIBLE_DEVICES=0 swift app \
+    --model Qwen/Qwen2.5-7B-Instruct \
+    --stream true \
+    --infer_backend pt \
+    --max_new_tokens 2048 \
+    --lang zh
diff --git a/examples/app/mllm.sh b/examples/app/mllm.sh
@@ -0,0 +1,6 @@
+CUDA_VISIBLE_DEVICES=0 swift app \
+    --model Qwen/Qwen2-VL-7B-Instruct \
+    --stream true \
+    --infer_backend pt \
+    --max_new_tokens 2048 \
+    --lang zh
diff --git a/examples/eval/custom/README.md b/examples/eval/custom/README.md
diff --git a/examples/eval/custom/custom_ceval/default_dev.csv b/examples/eval/custom/custom_ceval/default_dev.csv
diff --git a/examples/eval/custom/custom_ceval/default_val.csv b/examples/eval/custom/custom_ceval/default_val.csv
diff --git a/examples/eval/custom/custom_config.json b/examples/eval/custom/custom_config.json
diff --git a/examples/eval/custom/custom_general_qa/default.jsonl b/examples/eval/custom/custom_general_qa/default.jsonl
diff --git a/examples/eval/custom/eval.sh b/examples/eval/custom/eval.sh
diff --git a/examples/eval/eval_url/eval.sh b/examples/eval/eval_url/eval.sh
@@ -1,6 +1,6 @@
 # You need to have a deployed model or api service first
 swift eval \
   --model '<model_name>' \
-  --eval_url http://127.0.0.1:8000/v1/chat/completions \
+  --eval_url http://127.0.0.1:8000/v1 \
   --eval_limit 100 \
   --eval_dataset gsm8k
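Scripts written against the old form of `--eval_url` pass the full `/v1/chat/completions` endpoint, while this change expects the base URL ending in `/v1`. A small helper for migrating such values might look like the following; `to_base_url` is a hypothetical name and this is not something ms-swift is known to do internally.

```python
def to_base_url(url: str) -> str:
    """Strip a trailing '/chat/completions' so that only the API base URL remains."""
    suffix = '/chat/completions'
    url = url.rstrip('/')
    if url.endswith(suffix):
        url = url[:-len(suffix)]
    return url


print(to_base_url('http://127.0.0.1:8000/v1/chat/completions'))  # http://127.0.0.1:8000/v1
print(to_base_url('http://127.0.0.1:8000/v1'))                   # http://127.0.0.1:8000/v1
```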
diff --git a/examples/notebook/qwen2.5-self-cognition/self-cognition-sft.ipynb b/examples/notebook/qwen2.5-self-cognition/self-cognition-sft.ipynb
@@ -10,7 +10,7 @@
     "\n",
     "Are you ready? Let's begin the journey...\n",
     "\n",
-    "中文版：https://modelscope.cn/notebook/share/ipynb/4340fdeb/self-cognition-sft.ipynb"
+    "中文版：https://modelscope.cn/notebook/share/ipynb/313f6116/self-cognition-sft.ipynb"
    ]
   },
   {

diff --git a/swift/cli/app.py b/swift/cli/app.py
@@ -0,0 +1,4 @@
+from swift.llm import app_main
+
+if __name__ == '__main__':
+    app_main()
diff --git a/swift/cli/main.py b/swift/cli/main.py
@@ -5,6 +5,10 @@
 import sys
 from typing import Dict, List, Optional
 
+from swift.utils import get_logger
+
+logger = get_logger()
+
 ROUTE_MAPPING: Dict[str, str] = {
     'pt': 'swift.cli.pt',
     'sft': 'swift.cli.sft',
@@ -14,7 +18,8 @@
     'deploy': 'swift.cli.deploy',
     'rlhf': 'swift.cli.rlhf',
     'export': 'swift.cli.export',
-    'eval': 'swift.cli.eval'
+    'eval': 'swift.cli.eval',
+    'app': 'swift.cli.app',
 }
 
 ROUTE_MAPPING.update({k.replace('-', '_'): v for k, v in ROUTE_MAPPING.items()})
@@ -40,8 +45,17 @@ def get_torchrun_args() -> Optional[List[str]]:
     return torchrun_args
 
 
+def _compat_web_ui(argv):
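+    # [compat] `swift web-ui --model/--adapters/--ckpt_dir ...` used to launch the inference-only app;
+    # route such calls to `swift app` and warn the user.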
+    method_name = argv[0]
+    if method_name in {'web-ui', 'web_ui'} and ('--model' in argv or '--adapters' in argv or '--ckpt_dir' in argv):
+        argv[0] = 'app'
+        logger.warning('Please use `swift app`.')
+
+
 def cli_main() -> None:
     argv = sys.argv[1:]
+    _compat_web_ui(argv)
     method_name = argv[0]
     argv = argv[1:]
     file_path = importlib.util.find_spec(ROUTE_MAPPING[method_name]).origin
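The effect of the compatibility shim on routing can be checked in isolation. The sketch below copies the logic of `_compat_web_ui` and the alias step from the hunks above into a standalone script; the `'web-ui'` entry and its module path are assumptions, since that part of `ROUTE_MAPPING` is not visible in the diff, and `route` is a hypothetical helper that stops before the real module lookup.

```python
import logging
from typing import Dict, List

logger = logging.getLogger(__name__)

ROUTE_MAPPING: Dict[str, str] = {
    'eval': 'swift.cli.eval',
    'app': 'swift.cli.app',
    'web-ui': 'swift.cli.web_ui',  # assumed entry and module path, not shown in the hunk
}
# Same aliasing step as in the diff: 'web-ui' also becomes reachable as 'web_ui'.
ROUTE_MAPPING.update({k.replace('-', '_'): v for k, v in ROUTE_MAPPING.items()})


def compat_web_ui(argv: List[str]) -> None:
    """Rewrite a legacy `web-ui` invocation to `app` in place."""
    method_name = argv[0]
    legacy_flags = ('--model', '--adapters', '--ckpt_dir')
    if method_name in {'web-ui', 'web_ui'} and any(flag in argv for flag in legacy_flags):
        argv[0] = 'app'
        logger.warning('Please use `swift app`.')


def route(argv: List[str]) -> str:
    """Return the module a command line would be dispatched to."""
    argv = list(argv)  # work on a copy so the caller's list is untouched
    compat_web_ui(argv)
    return ROUTE_MAPPING[argv[0]]


print(route(['web-ui', '--model', 'Qwen/Qwen2.5-7B-Instruct']))  # swift.cli.app
print(route(['web_ui', '--lang', 'zh']))                         # swift.cli.web_ui
print(route(['web-ui', '--model=Qwen/Qwen2.5-7B-Instruct']))     # swift.cli.web_ui
```

The last call shows a consequence of matching on exact tokens: a flag written as `--model=<value>` is a single argv element, so the membership test does not see `--model` and the call is not rerouted.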