support swift app (#2792)
Jintao-Huang authored Dec 29, 2024
1 parent d811972 commit 1b132f6
Showing 58 changed files with 527 additions and 251 deletions.
11 changes: 10 additions & 1 deletion README.md
@@ -12,7 +12,7 @@
</p>

<p align="center">
<img src="https://img.shields.io/badge/python-%E2%89%A53.8-5be.svg">
<img src="https://img.shields.io/badge/python-3.10-5be.svg">
<img src="https://img.shields.io/badge/pytorch-%E2%89%A52.0-orange.svg">
<a href="https://github.com/modelscope/modelscope/"><img src="https://img.shields.io/badge/modelscope-%E2%89%A51.19-5D91D4.svg"></a>
<a href="https://pypi.org/project/ms-swift/"><img src="https://badge.fury.io/py/ms-swift.svg"></a>
@@ -279,6 +279,15 @@ CUDA_VISIBLE_DEVICES=0 swift infer \
--max_new_tokens 2048
```

### Interface Inference
```shell
CUDA_VISIBLE_DEVICES=0 swift app \
--model Qwen/Qwen2.5-7B-Instruct \
--stream true \
--infer_backend pt \
--max_new_tokens 2048
```
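
For scripted launches, the command above can also be assembled in Python. The sketch below only builds the argument list and leaves execution to the caller; `build_swift_app_argv` is a hypothetical helper written for illustration, not part of ms-swift.

```python
from typing import Dict, List, Union

ArgValue = Union[str, int, bool]


def build_swift_app_argv(options: Dict[str, ArgValue]) -> List[str]:
    """Assemble `swift app --key value ...` from a dict (hypothetical helper)."""
    argv = ['swift', 'app']
    for key, value in options.items():
        # The README passes booleans as lowercase strings, e.g. `--stream true`.
        # bool is checked first because bool is a subclass of int in Python.
        if isinstance(value, bool):
            value = 'true' if value else 'false'
        argv += [f'--{key}', str(value)]
    return argv


argv = build_swift_app_argv({
    'model': 'Qwen/Qwen2.5-7B-Instruct',
    'stream': True,
    'infer_backend': 'pt',
    'max_new_tokens': 2048,
})
print(' '.join(argv))

# To actually launch (requires ms-swift installed and a GPU), something like:
# import os, subprocess
# subprocess.run(argv, env=dict(os.environ, CUDA_VISIBLE_DEVICES='0'), check=True)
```

Keeping the options in a dict makes it easy to add flags such as `lang` without editing a long shell line.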

### Deployment
```shell
CUDA_VISIBLE_DEVICES=0 swift deploy \
12 changes: 11 additions & 1 deletion README_CN.md
@@ -13,7 +13,7 @@


<p align="center">
<img src="https://img.shields.io/badge/python-%E2%89%A53.8-5be.svg">
<img src="https://img.shields.io/badge/python-3.10-5be.svg">
<img src="https://img.shields.io/badge/pytorch-%E2%89%A52.0-orange.svg">
<a href="https://github.com/modelscope/modelscope/"><img src="https://img.shields.io/badge/modelscope-%E2%89%A51.19-5D91D4.svg"></a>
<a href="https://pypi.org/project/ms-swift/"><img src="https://badge.fury.io/py/ms-swift.svg"></a>
@@ -271,6 +271,16 @@ CUDA_VISIBLE_DEVICES=0 swift infer \
--max_new_tokens 2048
```

### Interface Inference
```shell
CUDA_VISIBLE_DEVICES=0 swift app \
--model Qwen/Qwen2.5-7B-Instruct \
--stream true \
--infer_backend pt \
--max_new_tokens 2048 \
--lang zh
```

### Deployment
```shell
CUDA_VISIBLE_DEVICES=0 swift deploy \
8 changes: 4 additions & 4 deletions docs/source/GetStarted/Web-UI.md
@@ -3,16 +3,16 @@
SWIFT currently supports interface-based training and inference, with the same parameter support as script training. After installing SWIFT, use the following command:

```shell
swift web-ui --host 0.0.0.0 --port 7860 --lang zh/en
swift web-ui --lang zh/en
```

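to start the interface for training and inference.

Additionally, the web-ui now supports app-ui mode (i.e., Space deployment):
Additionally, ms-swift supports interface inference mode (i.e., Space deployment):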

```shell
swift web-ui --model '<model>' --studio_title My-Awesome-Space
swift app --model '<model>' --studio_title My-Awesome-Space --stream true
# or
swift web-ui --model '<model>' --adapters '<adapter>' --studio_title My-Awesome-Space
swift app --model '<model>' --adapters '<adapter>' --stream true
```
This launches an application with only the inference page; the application deploys the model at startup and serves it for subsequent use.
2 changes: 1 addition & 1 deletion docs/source/Instruction/ReleaseNote3.0.md
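@@ -76,7 +76,7 @@
2. Removed the 2.x examples directory entirely and added new examples organized by functionality
3. The dataset format is now fully compatible with the messages format; the query/response/history format is no longer supported
4. The storage directory for merge_lora can be specified with `--output_dir`; merge_lora and quantization cannot be executed in the same command and require at least two commands
5. The app-ui interface has been removed and replaced by `swift web-ui --model xxx`, which also supports multimodal interface deployment
5. Use `swift app --model xxx` to launch the app-ui interface, which supports multimodal interface inference
6. Removed the AIGC dependencies along with the corresponding examples and training code

## To Do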
20 changes: 19 additions & 1 deletion docs/source/Instruction/命令行参数.md
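@@ -99,6 +99,7 @@
- remove_unused_columns: Default is False.
- logging_first_step: Whether to log the first step, default is True.
- logging_steps: Logging interval in steps, default is 5.
- average_tokens_across_devices: Whether to average the token count across devices. If set to True, `num_tokens_in_batch` is synchronized with all_reduce for accurate loss computation. Default is None, which resolves to True for distributed training and False otherwise.
- metric_for_best_model: Default is None: 'loss' when `predict_with_generate` is False, otherwise 'rouge-l'.
- greater_is_better: Default is None: False when `metric_for_best_model` contains 'loss', otherwise True.

@@ -328,6 +329,7 @@ RLHF arguments inherit from the [training arguments](#训练参数)

- 🔥infer_backend: Inference backend; supports the 'pt', 'vllm', and 'lmdeploy' inference frameworks, default is 'pt'.
- 🔥max_batch_size: Batch size for the pt backend, default is 1.
- ddp_backend: Distributed backend used by the pt backend for multi-GPU inference, default is None. Multi-GPU inference examples can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/infer/pt).
- result_path: Path for storing inference results (jsonl), default is None; results are saved in the checkpoint directory or the './result' directory.
- val_dataset_sample: Number of samples drawn from the inference dataset, default is None.

@@ -346,6 +348,22 @@ RLHF arguments inherit from the [training arguments](#训练参数)
- log_interval: Interval for printing tokens/s statistics, default is 20 seconds. Set to -1 to disable printing.
- max_logprobs: Maximum number of logprobs to return, default is 20.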


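### Web-UI Arguments
- server_name: Host for the web-ui, default is '0.0.0.0'.
- server_port: Port for the web-ui, default is 7860.
- share: Default is False.
- lang: Language for the web-ui, either 'zh' or 'en'. Default is 'zh'.


### App Arguments

App arguments inherit from the [deployment arguments](#部署参数) and the [Web-UI arguments](#Web-UI参数).
- base_url: Base URL of the model deployment, for example `http://localhost:8000/v1`. Default is `None`.
- studio_title: Title of the studio. Default is None, in which case the model name is used.
- is_multimodal: Whether to launch the multimodal version of the app. Default is None, in which case it is determined automatically from the model; if it cannot be determined, it is set to False.
- lang: Overrides the Web-UI argument of the same name; default is 'en'.

### Evaluation Arguments

Evaluation arguments inherit from the [deployment arguments](#部署参数)
@@ -356,7 +374,7 @@ RLHF arguments inherit from the [training arguments](#训练参数)
- temperature: Default is 0.
- verbose: Passed to DeployArguments during local evaluation, default is `False`.
- max_batch_size: Maximum batch size, default is 256 for text evaluation and 16 for multimodal.
- 🔥eval_url: Evaluation URL. Default is None, in which case a local deployment is used for evaluation.
- 🔥eval_url: Evaluation URL, for example `http://localhost:8000/v1`. Default is None, in which case a local deployment is used for evaluation. Examples can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/eval/eval_url).

### Export Arguments
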
2 changes: 1 addition & 1 deletion docs/source/Instruction/推理和部署.md
@@ -4,7 +4,7 @@ SWIFT supports inference and deployment through the command line, Python code, and the interface:
- Use `engine.infer` or `engine.infer_async` for Python-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py).
- Use `swift infer` for command-line inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh).
- Use `swift deploy` for service deployment, then run inference through the OpenAI API or `client.infer`. For the server side see [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server); for the client side see [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
- Use `swift web-ui` to deploy the model for interface inference; see [here](../GetStarted/界面使用.md)
- Use `swift app` to deploy the model for interface inference; see [here](../GetStarted/界面使用.md)


## Command Line Inference
8 changes: 4 additions & 4 deletions docs/source_en/GetStarted/Web-UI.md
@@ -3,16 +3,16 @@
Currently, SWIFT supports interface-based training and inference, with parameter support similar to script training. After installing SWIFT, use the following command:

```shell
swift web-ui --host 0.0.0.0 --port 7860 --lang zh/en
swift web-ui --lang zh/en
```

to start the interface for training and inference.

Additionally, the web-ui now supports app-ui mode (i.e., Space deployment):
Additionally, ms-swift supports interface inference mode (i.e., Space deployment):

```shell
swift web-ui --model '<model>' --studio_title My-Awesome-Space
swift app --model '<model>' --studio_title My-Awesome-Space --stream true
# or
swift web-ui --model '<model>' --adapters '<adapter>' --studio_title My-Awesome-Space
swift app --model '<model>' --adapters '<adapter>' --studio_title My-Awesome-Space --stream true
```
This will launch an application with only the inference page, which will deploy the model upon startup and provide it for subsequent use.
19 changes: 18 additions & 1 deletion docs/source_en/Instruction/Command-line-parameters.md
@@ -100,6 +100,7 @@ This parameter list inherits from transformers `Seq2SeqTrainingArguments`, with
- remove_unused_columns: Default is False.
- logging_first_step: Whether to log the first step, default is True.
- logging_steps: Logging interval in steps, default is 5.
- average_tokens_across_devices: Whether to average the token count across devices. If set to True, `num_tokens_in_batch` is synchronized with all_reduce for accurate loss computation. Default is None, which resolves to True for distributed training and False otherwise.
- metric_for_best_model: Default is None. When `predict_with_generate` is set to False, it is 'loss'; otherwise, it is 'rouge-l'.
- greater_is_better: Default is None. When `metric_for_best_model` contains 'loss', set to False; otherwise, set to True.
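
Several of the arguments above use None as a "decide for me" default. The sketch below shows one way such defaults could be resolved, following only the rules stated in the list; the function names are hypothetical, and reading `WORLD_SIZE` is an assumption about how a distributed run is detected, not a description of the ms-swift implementation.

```python
import os
from typing import Optional


def resolve_average_tokens(value: Optional[bool], world_size: int) -> bool:
    """None resolves to True only when more than one process participates."""
    if value is not None:
        return value
    return world_size > 1


def resolve_metric(metric: Optional[str], predict_with_generate: bool) -> str:
    """None resolves to 'rouge-l' when generating predictions, else 'loss'."""
    if metric is not None:
        return metric
    return 'rouge-l' if predict_with_generate else 'loss'


def resolve_greater_is_better(value: Optional[bool], metric: str) -> bool:
    """None resolves to False for loss-like metrics, else True."""
    if value is not None:
        return value
    return 'loss' not in metric


# Assumption: launchers such as torchrun export WORLD_SIZE; default to 1 otherwise.
world_size = int(os.environ.get('WORLD_SIZE', '1'))
metric = resolve_metric(None, predict_with_generate=False)
print(resolve_average_tokens(None, world_size), metric, resolve_greater_is_better(None, metric))
```

An explicitly passed value always wins; the fallback applies only when the argument is left at None.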

@@ -331,6 +332,7 @@ Inference arguments include the [base arguments](#base-arguments), [merge argume

- 🔥infer_backend: Inference backend, supports 'pt', 'vllm', 'lmdeploy', default is 'pt'.
- 🔥max_batch_size: Batch size for pt backend, default is 1.
- ddp_backend: The distributed backend for multi-GPU inference with the pt backend, default is None. Multi-GPU inference examples can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/infer/pt).
- result_path: Path to store inference results (jsonl), default is None, saved in the checkpoint directory or './result' directory.
- val_dataset_sample: Number of samples from the inference dataset, default is None.

@@ -347,6 +349,21 @@ Deployment Arguments inherit from the [inference arguments](#inference-arguments
- log_interval: Interval for printing tokens/s statistics, default is 20 seconds. If set to -1, it will not be printed.
- max_logprobs: Maximum number of logprobs to return, default is 20.

### Web-UI Arguments
- server_name: Host for the web UI, default is '0.0.0.0'.
- server_port: Port for the web UI, default is 7860.
- share: Default is False.
- lang: Language for the web UI, options are 'zh', 'en'. Default is 'zh'.


### App Arguments
App parameters inherit from [deployment arguments](#deployment-arguments) and [Web-UI Arguments](#web-ui-arguments).

- base_url: Base URL for the model deployment, for example, `http://localhost:8000/v1`. Default is None.
- studio_title: Title of the studio. Default is None, in which case the model name is used.
- is_multimodal: Whether to launch the multimodal version of the app. Default is None, in which case it is determined automatically from the model; if it cannot be determined, it is set to False.
- lang: Overrides the Web-UI argument of the same name; default is 'en'.
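
The inheritance and the `lang` override described above can be pictured with plain dataclasses. This is a minimal sketch under stated assumptions: `WebUIArgs` and `AppArgs` are hypothetical stand-ins rather than the real ms-swift classes, `model` stands in for the inherited deployment arguments, and the multimodal auto-detection is replaced by its documented fallback.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WebUIArgs:
    # Defaults copied from the Web-UI Arguments list above.
    server_name: str = '0.0.0.0'
    server_port: int = 7860
    share: bool = False
    lang: str = 'zh'


@dataclass
class AppArgs(WebUIArgs):
    # Redeclaring a field in the subclass overrides the inherited default.
    lang: str = 'en'
    base_url: Optional[str] = None
    studio_title: Optional[str] = None
    is_multimodal: Optional[bool] = None
    model: Optional[str] = None  # stand-in for the inherited deployment arguments

    def __post_init__(self) -> None:
        # A None title falls back to the model name.
        if self.studio_title is None:
            self.studio_title = self.model
        # No real detection here: use the documented fallback of False.
        if self.is_multimodal is None:
            self.is_multimodal = False


args = AppArgs(model='Qwen/Qwen2.5-7B-Instruct')
print(args.lang, args.server_port, args.studio_title, args.is_multimodal)
```

The subclass keeps `server_name`, `server_port`, and `share` from the parent and changes only the default for `lang`.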

### Evaluation Arguments

Evaluation Arguments inherit from the [deployment arguments](#deployment-arguments).
@@ -357,7 +374,7 @@ Evaluation Arguments inherit from the [deployment arguments](#deployment-argumen
- temperature: Default is 0.
- verbose: This parameter is passed to DeployArguments during local evaluation, default is `False`.
- max_batch_size: Maximum batch size, default is 256 for text evaluation, 16 for multimodal.
- 🔥eval_url: Evaluation URL. Default is None, uses local deployment for evaluation.
- 🔥eval_url: Evaluation URL, for example `http://localhost:8000/v1`. Default is None, uses local deployment for evaluation. You can view the examples [here](https://github.com/modelscope/ms-swift/tree/main/examples/eval/eval_url).
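
The example value is a base URL in the OpenAI-compatible style rather than a full endpoint path. A client written against such a service would typically append the route itself. The helper below is a hypothetical illustration of that convention and makes no claim about how ms-swift handles `eval_url` internally.

```python
def chat_completions_endpoint(base_url: str) -> str:
    """Return the chat-completions route for an OpenAI-style base URL."""
    suffix = '/chat/completions'
    base = base_url.rstrip('/')
    # Tolerate callers that already pass the full endpoint.
    if base.endswith(suffix):
        return base
    return base + suffix


print(chat_completions_endpoint('http://localhost:8000/v1'))
```

Accepting both forms keeps older configurations that pass the full endpoint working alongside the shorter base URL.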

### Export Arguments

2 changes: 1 addition & 1 deletion docs/source_en/Instruction/Inference-and-deployment.md
@@ -4,7 +4,7 @@ SWIFT supports inference and deployment through command line, Python code, and i
- Use `engine.infer` or `engine.infer_async` for Python-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo.py) for reference.
- Use `swift infer` for command-line-based inference. See [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/cli_demo.sh) for reference.
- Use `swift deploy` for service deployment and perform inference using the OpenAI API or `client.infer`. Refer to the server guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/server) and the client guidelines [here](https://github.com/modelscope/ms-swift/tree/main/examples/deploy/client).
- Deploy the model with `swift web-ui` for web-based inference. You can check [here](../GetStarted/Interface-usage.md) for details.
- Deploy the model with `swift app` for web-based inference. You can check [here](../GetStarted/Interface-usage.md) for details.


## Command Line Inference
2 changes: 1 addition & 1 deletion docs/source_en/Instruction/ReleaseNote3.0.md
@@ -88,7 +88,7 @@ The parameters marked as compatible in version 2.0 have been entirely removed.

4. The storage directory for merge_lora can be specified using `--output_dir`, and merge_lora and quantization cannot be executed in the same command; at least two commands are required.

5. The app-ui interface has been removed, replaced by `swift web-ui --model xxx`, and multi-modal interface deployment is supported.
5. Use `swift app --model xxx` to launch the app-ui interface, which supports multimodal interface inference.

6. Removed dependencies for AIGC along with corresponding examples and training code.

13 changes: 13 additions & 0 deletions examples/app/base_url/demo.py
@@ -0,0 +1,13 @@
# Copyright (c) Alibaba, Inc. and its affiliates.
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

if __name__ == '__main__':
    from swift.llm import AppArguments, app_main, DeployArguments, run_deploy
    # A runnable demo is provided here.
    # In a real scenario, you can simply remove the deployment context.
    with run_deploy(
            DeployArguments(model='Qwen/Qwen2.5-1.5B-Instruct', verbose=False, log_interval=-1, infer_backend='vllm'),
            return_url=True) as url:
        app_main(AppArguments(model='Qwen2.5-1.5B-Instruct', base_url=url, stream=True, max_new_tokens=2048))
7 changes: 7 additions & 0 deletions examples/app/base_url/demo.sh
@@ -0,0 +1,7 @@
# You need to have a deployed model or api service first
CUDA_VISIBLE_DEVICES=0 swift app \
--model '<model_name>' \
--base_url http://127.0.0.1:8000/v1 \
--stream true \
--max_new_tokens 2048 \
--lang zh
6 changes: 6 additions & 0 deletions examples/app/llm.sh
@@ -0,0 +1,6 @@
CUDA_VISIBLE_DEVICES=0 swift app \
--model Qwen/Qwen2.5-7B-Instruct \
--stream true \
--infer_backend pt \
--max_new_tokens 2048 \
--lang zh
6 changes: 6 additions & 0 deletions examples/app/mllm.sh
@@ -0,0 +1,6 @@
CUDA_VISIBLE_DEVICES=0 swift app \
--model Qwen/Qwen2-VL-7B-Instruct \
--stream true \
--infer_backend pt \
--max_new_tokens 2048 \
--lang zh
1 change: 0 additions & 1 deletion examples/eval/custom/README.md

This file was deleted.

5 changes: 0 additions & 5 deletions examples/eval/custom/custom_ceval/default_dev.csv

This file was deleted.

4 changes: 0 additions & 4 deletions examples/eval/custom/custom_ceval/default_val.csv

This file was deleted.

14 changes: 0 additions & 14 deletions examples/eval/custom/custom_config.json

This file was deleted.

3 changes: 0 additions & 3 deletions examples/eval/custom/custom_general_qa/default.jsonl

This file was deleted.

3 changes: 0 additions & 3 deletions examples/eval/custom/eval.sh

This file was deleted.

2 changes: 1 addition & 1 deletion examples/eval/eval_url/eval.sh
@@ -1,6 +1,6 @@
# You need to have a deployed model or api service first
swift eval \
--model '<model_name>' \
--eval_url http://127.0.0.1:8000/v1/chat/completions \
--eval_url http://127.0.0.1:8000/v1 \
--eval_limit 100 \
--eval_dataset gsm8k
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
"\n",
"Are you ready? Let's begin the journey...\n",
"\n",
"Chinese version: https://modelscope.cn/notebook/share/ipynb/4340fdeb/self-cognition-sft.ipynb"
"Chinese version: https://modelscope.cn/notebook/share/ipynb/313f6116/self-cognition-sft.ipynb"
]
},
{
4 changes: 4 additions & 0 deletions swift/cli/app.py
@@ -0,0 +1,4 @@
from swift.llm import app_main

if __name__ == '__main__':
    app_main()
16 changes: 15 additions & 1 deletion swift/cli/main.py
@@ -5,6 +5,10 @@
import sys
from typing import Dict, List, Optional

from swift.utils import get_logger

logger = get_logger()

ROUTE_MAPPING: Dict[str, str] = {
    'pt': 'swift.cli.pt',
    'sft': 'swift.cli.sft',
@@ -14,7 +18,8 @@
    'deploy': 'swift.cli.deploy',
    'rlhf': 'swift.cli.rlhf',
    'export': 'swift.cli.export',
    'eval': 'swift.cli.eval'
    'eval': 'swift.cli.eval',
    'app': 'swift.cli.app',
}

ROUTE_MAPPING.update({k.replace('-', '_'): v for k, v in ROUTE_MAPPING.items()})
@@ -40,8 +45,17 @@ def get_torchrun_args() -> Optional[List[str]]:
    return torchrun_args


def _compat_web_ui(argv):
    # [compat] `swift web-ui` with a model, adapter, or checkpoint argument is routed to `swift app`.
    method_name = argv[0]
    if method_name in {'web-ui', 'web_ui'} and ('--model' in argv or '--adapters' in argv or '--ckpt_dir' in argv):
        argv[0] = 'app'
        logger.warning('Please use `swift app`.')


def cli_main() -> None:
    argv = sys.argv[1:]
    _compat_web_ui(argv)
    method_name = argv[0]
    argv = argv[1:]
    file_path = importlib.util.find_spec(ROUTE_MAPPING[method_name]).origin
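
The effect of the compatibility shim on routing can be checked in isolation. The following is a self-contained sketch, not the module itself: `ROUTES` is a reduced table, the `swift.cli.web_ui` entry is an assumed module path (that part of the mapping is not visible in this hunk), and `route` is a hypothetical helper that returns the target module name instead of running it.

```python
from typing import Dict, List

ROUTES: Dict[str, str] = {
    'web-ui': 'swift.cli.web_ui',  # assumed path, for illustration only
    'app': 'swift.cli.app',
}
# Accept underscore spellings as well, mirroring the update on ROUTE_MAPPING.
ROUTES.update({k.replace('-', '_'): v for k, v in ROUTES.items()})

MODEL_FLAGS = ('--model', '--adapters', '--ckpt_dir')


def route(argv: List[str]) -> str:
    """Return the module a command line would be dispatched to."""
    argv = list(argv)  # work on a copy so the caller's list is untouched
    if argv[0] in {'web-ui', 'web_ui'} and any(flag in argv for flag in MODEL_FLAGS):
        # Legacy form: `swift web-ui --model ...` now means `swift app`.
        argv[0] = 'app'
    return ROUTES[argv[0]]


print(route(['web-ui', '--model', 'Qwen/Qwen2.5-7B-Instruct']))
print(route(['web-ui', '--lang', 'en']))
```

A plain `swift web-ui` without any model flag keeps its original route, so only the legacy inference-only usage is redirected.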