Merged
Changes from all commits (34 commits)
bb65d89  update (zhulinJulia24, Oct 22, 2025)
60c9455  update (zhulinJulia24, Oct 23, 2025)
5c1a856  update (zhulinJulia24, Oct 23, 2025)
bf52f0c  Update Docker tag from cuda12.4 to cuda12.8 (zhulinJulia24, Oct 23, 2025)
a777dd5  update (zhulinJulia24, Oct 24, 2025)
ba4a2c0  merge main (zhulinJulia24, Oct 24, 2025)
084c554  update (zhulinJulia24, Oct 24, 2025)
88acab4  update (zhulinJulia24, Oct 24, 2025)
d699c3a  update (zhulinJulia24, Oct 28, 2025)
537292d  update (zhulinJulia24, Oct 28, 2025)
6738ad6  Merge branch 'InternLM:main' into update_failcase (zhulinJulia24, Oct 28, 2025)
488be8a  add longtext benchmark into workflow (zhulinJulia24, Oct 28, 2025)
064960c  update (zhulinJulia24, Oct 28, 2025)
b1c049e  update (zhulinJulia24, Oct 29, 2025)
06bfd12  update (zhulinJulia24, Oct 30, 2025)
013d3ce  update (zhulinJulia24, Oct 30, 2025)
32bd0e7  update (zhulinJulia24, Oct 30, 2025)
a1678c4  fix (zhulinJulia24, Oct 30, 2025)
f275fae  update (zhulinJulia24, Oct 30, 2025)
4dbc839  update (zhulinJulia24, Oct 30, 2025)
a8e69e7  update (zhulinJulia24, Oct 30, 2025)
aaeda1f  update (zhulinJulia24, Oct 30, 2025)
2848f0c  update (zhulinJulia24, Oct 30, 2025)
5bf0d03  update (zhulinJulia24, Oct 30, 2025)
13e7a69  add ascend config (zhulinJulia24, Oct 31, 2025)
59a9830  update (zhulinJulia24, Oct 31, 2025)
ab4da9d  update (zhulinJulia24, Oct 31, 2025)
4eec89d  Merge branch 'InternLM:main' into update_failcase (zhulinJulia24, Nov 3, 2025)
e056eba  update (zhulinJulia24, Nov 3, 2025)
06b4a5e  Merge branch 'update_failcase' of https://github.com/zhulinJulia24/lm… (zhulinJulia24, Nov 3, 2025)
bf14670  update timeout and ascend config (littlegy, Nov 5, 2025)
4e714f1  update (zhulinJulia24, Nov 5, 2025)
183d5f6  merge main (zhulinJulia24, Nov 5, 2025)
15e34c9  update (zhulinJulia24, Nov 6, 2025)
2 changes: 1 addition & 1 deletion .github/scripts/action_tools.py
@@ -229,7 +229,7 @@ def generate_benchmark_report(report_path: str):
     for f in csv_files:
         df = pd.read_csv(f)
         merged_df = pd.concat([merged_df, df], ignore_index=True)
-    if 'throughput' in backend_subfolder:
+    if 'throughput' in backend_subfolder or 'longtext' in backend_subfolder:
         merged_df = merged_df.sort_values(by=merged_df.columns[1])

     grouped_df = merged_df.groupby(merged_df.columns[1])
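For context, the one-line change above only widens the sort condition so that longtext results are ordered the same way throughput results already are. A minimal, self-contained sketch of that merge step (the helper name `merge_benchmark_csvs` and the CSV layout are illustrative assumptions; the real `generate_benchmark_report` goes on to group rows and render the report):

```python
import pandas as pd


def merge_benchmark_csvs(csv_files, backend_subfolder):
    """Concatenate per-run benchmark CSVs into one DataFrame.

    For 'throughput' and (after this PR) 'longtext' result folders,
    rows are sorted by the second column (e.g. batch size) so the
    later groupby produces a stable, readable ordering.
    """
    merged_df = pd.DataFrame()
    for f in csv_files:
        df = pd.read_csv(f)
        merged_df = pd.concat([merged_df, df], ignore_index=True)
    if 'throughput' in backend_subfolder or 'longtext' in backend_subfolder:
        merged_df = merged_df.sort_values(by=merged_df.columns[1])
    return merged_df
```

Folders that match neither keyword (e.g. `api_server`) keep their original row order.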
20 changes: 9 additions & 11 deletions .github/workflows/api_eval.yml
@@ -37,9 +37,8 @@ on:
 env:
   HOST_PIP_CACHE_DIR: /nvme/github-actions/pip-cache
   HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai
-  OUTPUT_FOLDER: cuda11.8_dist_${{ github.run_id }}
   ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
-  REPORT_DIR: /nvme/qa_test_models/test-reports/${{ github.run_id }}
+  REPORT_DIR: /nvme/qa_test_models/evaluation-reports/allure_report/${{ github.run_id }}
   COV_PARAM: --cov /opt/py3/lib/python3.10/site-packages/lmdeploy
   FAIL_CONFIG: '--lf'
   TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ github.run_id }}
@@ -62,8 +61,8 @@ jobs:
     env:
       PYTHON_VERSION: ${{ matrix.pyver }}
       PLAT_NAME: manylinux2014_x86_64
-      DOCKER_TAG: cuda12.4
-      OUTPUT_FOLDER: cuda12.4_dist_${{ github.run_id }}
+      DOCKER_TAG: cuda12.8
+      OUTPUT_FOLDER: cuda12.8_dist_${{ github.run_id }}
     steps:
       - name: Checkout repository
         uses: actions/checkout@v3
@@ -98,18 +97,17 @@ jobs:
       matrix:
         backend: ${{ fromJSON(inputs.backend || '["turbomind", "pytorch"]')}}
     container:
-      image: openmmlab/lmdeploy:latest-cu12
+      image: openmmlab/lmdeploy:latest-cu12.8
       options: "--gpus=all --ipc=host --user root -e PIP_CACHE_DIR=/root/.cache/pip -e NVIDIA_DISABLE_REQUIRE=1 --pull never"
       volumes:
         - /nvme/github-actions/pip-cache:/root/.cache/pip
         - /nvme/github-actions/packages:/root/packages
         - /nvme/github-actions/resources:/root/resources
         - /nvme/qa_test_models/evaluation-reports:/root/evaluation-reports
         - /nvme/qa_test_models:/nvme/qa_test_models
         - /mnt/shared:/mnt/shared
         - /mnt/bigdisk:/mnt/bigdisk
         - /nvme/huggingface_hub:/nvme/huggingface_hub
         - /mnt/121:/mnt/121
         - /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro
         - /mnt/187:/mnt/187
     steps:
       - name: Create and change to _wk directory
         run: |
@@ -138,9 +136,8 @@ jobs:
         run: |
           python3 -m pip list
           lmdeploy check_env
-          rm -rf allure-results
-          mkdir -p ${{ env.REPORT_DIR }}/.pytest_cache
-          ln -s ${{ env.REPORT_DIR }}/.pytest_cache autotest
+          mkdir ${{env.REPORT_DIR}} -p
+          echo "starttime=$(date +%s)" > ${{env.REPORT_DIR}}/status.txt
       - name: Setup paths for evaluation
         if: (matrix.backend == 'pytorch' || matrix.backend == 'turbomind')
         run: |
@@ -161,5 +158,6 @@ jobs:
       - name: Clear workspace
         if: always()
         run: |
+          echo "status=done" >> ${{env.REPORT_DIR}}/status.txt
           export workdir=$(pwd)
           rm -rf $workdir/*
8 changes: 4 additions & 4 deletions .github/workflows/api_eval_h800.yml
@@ -39,7 +39,7 @@ env:
   HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai
   OUTPUT_FOLDER: cuda12.8_dist_${{ github.run_id }}
   ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
-  REPORT_DIR: /nvme/qa_test_models/test-reports/${{ github.run_id }}
+  REPORT_DIR: /nvme/qa_test_models/evaluation-reports/allure_report/${{ github.run_id }}
   COV_PARAM: --cov /opt/py3/lib/python3.10/site-packages/lmdeploy
   FAIL_CONFIG: '--lf'
   TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ github.run_id }}
@@ -142,9 +142,8 @@ jobs:
         run: |
           python3 -m pip list
           lmdeploy check_env
-          rm -rf allure-results
-          mkdir -p ${{ env.REPORT_DIR }}/.pytest_cache
-          ln -s ${{ env.REPORT_DIR }}/.pytest_cache autotest
+          mkdir ${{env.REPORT_DIR}} -p
+          echo "starttime=$(date +%s)" > ${{env.REPORT_DIR}}/status.txt
       - name: Setup paths for evaluation
         if: (matrix.backend == 'pytorch' || matrix.backend == 'turbomind')
         run: |
@@ -165,5 +164,6 @@ jobs:
       - name: Clear workspace
         if: always()
         run: |
+          echo "status=done" >> ${{env.REPORT_DIR}}/status.txt
           export workdir=$(pwd)
           rm -rf $workdir/*
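The `starttime=...` and `status=done` lines these workflows write to `status.txt` form a tiny start/finish protocol: the key is written at job start and the done flag is appended by the `always()` cleanup step even on failure. A sketch of how a downstream script might read it back (`read_job_status` is a hypothetical helper for illustration, not part of this PR):

```python
def read_job_status(path):
    """Parse a workflow status.txt.

    The job writes 'starttime=<epoch seconds>' when it begins and the
    always() cleanup step appends 'status=done' when it finishes.
    Returns (start_epoch or None, done flag).
    """
    start, done = None, False
    with open(path) as fh:
        for line in fh:
            key, sep, value = line.strip().partition('=')
            if key == 'starttime' and sep:
                start = int(value)
            elif key == 'status' and value == 'done':
                done = True
    return start, done
```

A file with a start time but no `status=done` line would indicate a run that is still in flight (or was killed before cleanup).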
94 changes: 64 additions & 30 deletions .github/workflows/benchmark.yml
@@ -15,9 +15,9 @@ on:
         default: 'main'
       benchmark_type:
         required: true
-        description: 'Set benchmark type. Default is "["generation", "throughput", "api_server"]"'
+        description: 'Set benchmark type. Default is "["longtext", "throughput", "api_server"]"'
         type: string
-        default: "['apiserver', 'throughput']"
+        default: "['apiserver', 'throughput', 'longtext']"
       offline_mode:
         required: true
         description: 'Whether start a offline mode, if true, you should prepare code and whl package by yourself'
@@ -27,10 +27,12 @@ on:
 env:
   HOST_PIP_CACHE_DIR: /nvme/github-actions/pip-cache
   HOST_LOCALTIME: /usr/share/zoneinfo/Asia/Shanghai
-  OUTPUT_FOLDER: cuda12.4_dist_${{ github.run_id }}
+  OUTPUT_FOLDER: cuda12.8_dist_${{ github.run_id }}
   REPORT_DIR: /nvme/qa_test_models/benchmark-reports/${{ github.run_id }}
+  ALLURE_REPORT_DIR: /nvme/qa_test_models/benchmark-reports/allure_report/${{ github.run_id }}
   TEST_CODE_PATH: /nvme/qa_test_models/test_pkg/lmdeploy/${{ github.run_id }}
+  OFFLINE_CODE_PATH: /nvme/qa_test_models/offline_pkg/lmdeploy
   ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
   FAIL_CONFIG: ${{ github.run_attempt != 1 && '--lf --lfnf none' || '--lf'}}

 jobs:
   linux-build:
@@ -42,7 +44,7 @@ jobs:
     env:
       PYTHON_VERSION: ${{ matrix.pyver }}
       PLAT_NAME: manylinux2014_x86_64
-      DOCKER_TAG: cuda12.4
+      DOCKER_TAG: cuda12.8
     steps:
       - name: Checkout repository
        uses: actions/checkout@v3
@@ -67,25 +69,16 @@ jobs:
         retention-days: 1
         name: my-artifact-${{ github.run_id }}-${{ matrix.pyver }}


-  benchmark:
+  download_pkgs:
     needs: linux-build
-    if: ${{github.event_name == 'schedule' || !cancelled()}}
+    if: ${{!cancelled()}}
     runs-on: [self-hosted, linux-a100]
-    strategy:
-      fail-fast: false
-      matrix:
-        benchmark_type: ${{fromJSON(github.event.inputs.benchmark_type)}}
-    timeout-minutes: 480
+    timeout-minutes: 50
     container:
       image: openmmlab/lmdeploy:latest-cu12.8
       options: "--gpus=all --ipc=host --user root -e PIP_CACHE_DIR=/root/.cache/pip -e NVIDIA_DISABLE_REQUIRE=1 --pull never"
       volumes:
         - /nvme/github-actions/pip-cache:/root/.cache/pip
         - /nvme/github-actions/packages:/root/packages
         - /nvme/qa_test_models:/nvme/qa_test_models
         - /mnt/shared:/mnt/shared
         - /mnt/bigdisk:/mnt/bigdisk
         - /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro
     steps:
       - name: Clone repository
@@ -94,42 +87,82 @@ jobs:
with:
repository: ${{ github.event.inputs.repo_org || 'InternLM/lmdeploy' }}
ref: ${{github.event.inputs.repo_ref || 'main'}}
- name: Copy repository
if: ${{github.event_name == 'schedule' || !inputs.offline_mode}}
run: rm -rf ${{env.TEST_CODE_PATH}} && mkdir ${{env.TEST_CODE_PATH}} && cp -r . ${{env.TEST_CODE_PATH}}
- name: Copy repository - offline
if: ${{inputs.offline_mode}}
run: cp -r /nvme/qa_test_models/offline_pkg/lmdeploy/. .
run: rm -rf ${{env.TEST_CODE_PATH}} && mkdir ${{env.TEST_CODE_PATH}} && cp -r ${{env.OFFLINE_CODE_PATH}}/. ${{env.TEST_CODE_PATH}}
- name: Download Artifacts
if: ${{github.event_name == 'schedule' || !inputs.offline_mode}}
uses: actions/download-artifact@v4
with:
name: my-artifact-${{ github.run_id }}-py310
- name: Copy Artifacts
if: ${{github.event_name == 'schedule' || !inputs.offline_mode}}
run: rm ${{env.TEST_CODE_PATH}}/lmdeploy-*.whl -f && cp lmdeploy-*.whl ${{env.TEST_CODE_PATH}}
- name: Copy Artifacts - offline
if: ${{inputs.offline_mode}}
run: rm ${{env.TEST_CODE_PATH}}/lmdeploy-*.whl -f && cp ${{env.OFFLINE_CODE_PATH}}/lmdeploy-*.whl ${{env.TEST_CODE_PATH}}
- name: Mark as start
run: |
mkdir ${{env.REPORT_DIR}} -p
echo "starttime=$(date +%s)" > ${{env.REPORT_DIR}}/status.txt

benchmark:
needs: download_pkgs
if: ${{github.event_name == 'schedule' || !cancelled()}}
runs-on: [self-hosted, linux-a100]
strategy:
fail-fast: false
matrix:
benchmark_type: ${{fromJSON(github.event.inputs.benchmark_type)}}
gpu_num: ['gpu_num_1', 'gpu_num_2', 'gpu_num_4', 'gpu_num_8']
include:
- n: 8
gpu_num: gpu_num_1
- n: 4
gpu_num: gpu_num_2
- n: 2
gpu_num: gpu_num_4
- n: 1
gpu_num: gpu_num_8
timeout-minutes: 480
container:
image: openmmlab/lmdeploy:latest-cu12
options: "--gpus=all --ipc=host --user root -e PIP_CACHE_DIR=/root/.cache/pip -e NVIDIA_DISABLE_REQUIRE=1 --pull never"
volumes:
- /nvme/github-actions/pip-cache:/root/.cache/pip
- /nvme/github-actions/packages:/root/packages
- /nvme/qa_test_models:/nvme/qa_test_models
- /nvme/huggingface_hub:/nvme/huggingface_hub
- /mnt/121:/mnt/121
- /mnt/bigdisk:/mnt/bigdisk
- /usr/share/zoneinfo/Asia/Shanghai:/etc/localtime:ro
steps:
- name: Copy repository and Artifacts
run: |
cp -r ${{env.TEST_CODE_PATH}}/. .
mkdir ${{env.REPORT_DIR}} -p
echo "starttime=$(date +%s)" > ${{env.REPORT_DIR}}/status.txt
- name: Install lmdeploy - dependency
run: |
python3 -m pip install -r /nvme/qa_test_models/offline_pkg/requirements.txt
- name: Install lmdeploy
if: ${{github.event_name == 'schedule' || !inputs.offline_mode}}
run: |
python3 -m pip uninstall lmdeploy -y && python3 -m pip install lmdeploy-*.whl --no-deps
python3 -m pip install -r requirements/test.txt
- name: Install lmdeploy - offline
if: ${{inputs.offline_mode}}
run: |
python3 -m pip install /nvme/qa_test_models/offline_pkg/py310/lmdeploy-*.whl --no-deps
python3 -m pip install -r requirements/test.txt
- name: Check env
run: |
python3 -m pip list
lmdeploy check_env
mkdir ${{env.REPORT_DIR}}/allure-results/.pytest_cache -p
ln -s ${{env.REPORT_DIR}}/allure-results/.pytest_cache autotest
      - name: Run other benchmark
        run: |
-          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n 8 --run_id ${{ github.run_id }} -m 'gpu_num_1 and not pr_test' ${{env.FAIL_CONFIG}} --alluredir=${{env.REPORT_DIR}}/allure-results || true
-          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n 4 --run_id ${{ github.run_id }} -m 'gpu_num_2 and not pr_test' ${{env.FAIL_CONFIG}} --alluredir=${{env.REPORT_DIR}}/allure-results || true
-          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n 2 --run_id ${{ github.run_id }} -m 'gpu_num_4 and not pr_test' ${{env.FAIL_CONFIG}} --alluredir=${{env.REPORT_DIR}}/allure-results || true
-          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py --run_id ${{ github.run_id }} -m 'gpu_num_8 and not pr_test' ${{env.FAIL_CONFIG}} --alluredir=${{env.REPORT_DIR}}/allure-results
+          pytest autotest/benchmark/test_${{matrix.benchmark_type}}_performance.py -n ${{matrix.n}} --run_id ${{ github.run_id }} -m '${{matrix.gpu_num}} and not pr_test' --alluredir=${{env.ALLURE_REPORT_DIR}}
- name: Clear workfile
if: always()
run: |
echo "status=done" >> ${{env.REPORT_DIR}}/status.txt
chmod -R 777 $REPORT_DIR
export workdir=$(pwd)
cd ..
@@ -157,5 +190,6 @@ jobs:
          ref: ${{github.event.inputs.repo_ref || 'main'}}
      - name: Get overview
        run: |
+          echo "status=done" >> ${{env.REPORT_DIR}}/status.txt
          pip install pandas fire mmengine
          python3 .github/scripts/action_tools.py generate_benchmark_report $REPORT_DIR
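The new matrix in benchmark.yml pairs each `gpu_num` marker with an xdist worker count `n` so that workers times GPUs-per-case always fills an 8-GPU node (8x1, 4x2, 2x4, 1x8), replacing the four sequential pytest invocations. That invariant can be sketched as follows (`xdist_workers_for` is a hypothetical helper for illustration, assuming an 8-GPU node):

```python
def xdist_workers_for(gpu_num_marker, total_gpus=8):
    """Map a gpu_num_N pytest marker to the pytest-xdist -n value.

    Running total_gpus // N workers in parallel means concurrent test
    cases together use at most total_gpus GPUs, never oversubscribing
    the node.
    """
    gpus_per_case = int(gpu_num_marker.rsplit('_', 1)[-1])
    return max(total_gpus // gpus_per_case, 1)
```

This mirrors the matrix `include` entries: one parallel worker per free slice of the node, so a single workflow job covers all four GPU-count tiers.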