[WIP][Platform] Add support for Ascend 310P #914

Status: Open. Wants to merge 3 commits into base: main.
Conversation


@farawayboat farawayboat commented May 21, 2025

What this PR does / why we need it?

  • This PR introduces changes to adapt the pos_encoding_kernels.cpp, utils.h, attention.py, layernorm.py, platform.py, and utils.py files to support Ascend 310P devices.
  • Specifically, it adjusts the loadSize constant based on the Ascend AI Core version and adds conditional compilation directives for bfloat16_t support.
  • It also includes modifications to handle specific behaviors required for the Ascend 310P, such as tensor alignment and format casting.
  • The purpose of these changes is to ensure compatibility and optimal performance on Ascend 310P devices.
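To illustrate the tensor-alignment point above (a minimal sketch, not code from this PR): the NZ data layout used on Ascend devices generally requires tensor dimensions to be padded to a hardware block size before format casting. The helper names `align_up` and `padded_shape`, and the block size of 16 elements, are assumptions for illustration only.

```python
def align_up(n: int, alignment: int = 16) -> int:
    """Round n up to the nearest multiple of alignment.

    Hypothetical helper: tensors cast to the NZ format typically need
    their trailing dimensions padded to a hardware block size (assumed
    here to be 16 elements).
    """
    return ((n + alignment - 1) // alignment) * alignment


def padded_shape(shape: tuple, alignment: int = 16) -> tuple:
    """Pad the last two dimensions of a shape to the block size."""
    if len(shape) < 2:
        return shape
    *head, h, w = shape
    return (*head, align_up(h, alignment), align_up(w, alignment))


print(align_up(100))               # 112
print(padded_shape((4, 100, 50)))  # (4, 112, 64)
```

The actual padding and format casting in the PR is performed on device tensors; this sketch only shows the round-up arithmetic involved.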

Does this PR introduce any user-facing change?

  • Yes, this PR introduces changes that affect the behavior of the library when running on Ascend 310P devices.
  • Users of Ascend 310P will see improved performance and compatibility due to the added support and optimizations.

How was this patch tested?

The patch has been tested locally on Ascend 310P hardware to ensure that the changes do not break existing functionality and that the new features work as intended.

ENV information

```
npu-smi info
+--------------------------------------------------------------------------------------------------------+
| npu-smi 24.1.0.1                                 Version: 24.1.0.1                                     |
+-------------------------------+-----------------+------------------------------------------------------+
| NPU     Name                  | Health          | Power(W)     Temp(C)           Hugepages-Usage(page) |
| Chip    Device                | Bus-Id          | AICore(%)    Memory-Usage(MB)                        |
+===============================+=================+======================================================+
| 1536    310P3                 | OK              | NA           65                0     / 0             |
| 0       0                     | 0000:06:00.0    | 0            1524 / 44280                            |
+-------------------------------+-----------------+------------------------------------------------------+
| 1536    310P3                 | OK              | NA           64                17452 / 17452         |
| 1       1                     | 0000:06:00.0    | 0            36314 / 43693                           |
+===============================+=================+======================================================+
| 1792    310P3                 | OK              | NA           72                17636 / 17636         |
| 0       2                     | 0000:07:00.0    | 95           36982 / 44280                           |
+-------------------------------+-----------------+------------------------------------------------------+
| 1792    310P3                 | OK              | NA           68                0     / 0             |
| 1       3                     | 0000:07:00.0    | 0            1216 / 43693                            |
+===============================+=================+======================================================+
| 2048    310P3                 | OK              | NA           60                0     / 0             |
| 0       4                     | 0000:08:00.0    | 0            1411 / 44280                            |
+-------------------------------+-----------------+------------------------------------------------------+
| 2048    310P3                 | OK              | NA           57                0     / 0             |
| 1       5                     | 0000:08:00.0    | 0            1494 / 43693                            |
+===============================+=================+======================================================+
| 2304    310P3                 | OK              | NA           53                18200 / 18200         |
| 0       6                     | 0000:09:00.0    | 0            37812 / 44280                           |
+-------------------------------+-----------------+------------------------------------------------------+
| 2304    310P3                 | OK              | NA           49                18194 / 18194         |
| 1       7                     | 0000:09:00.0    | 0            37958 / 43693                           |
+===============================+=================+======================================================+
```

CANN, NNAL version: 8.1.RC1

Important

Because the current PTA 2.5.1 version cannot pass parameters in the NZ format as required when calling NNAL operators on 310P, we used a temporary debugging version provided by the PTA team for testing.

Code example

Build vllm-ascend from source code
```shell
# download source code as vllm-ascend
cd vllm-ascend
export SOC_VERSION=Ascend310P3
pip install -v -e .
cd ..
```
Run offline inference
```python
from vllm import LLM, SamplingParams

# Chinese prompts; in English: "Is the boiling point of water 100 degrees
# Celsius? Answer yes or no." and "If the armpit temperature is 38 degrees
# Celsius, does this person have a fever? Answer yes or no."
prompts = ["水的沸点是100摄氏度吗?请回答是或者否。", "若腋下体温为38摄氏度,请问这人是否发烧?请回答是或者否。",
           "水的沸点是100摄氏度吗?请回答是或者否。", "若腋下体温为38摄氏度,请问这人是否发烧?请回答是或者否。"]

# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.0, top_p=0.95, max_tokens=10)

# Create an LLM.
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",
    max_model_len=4096,
    max_num_seqs=4,
    dtype="float16",  # IMPORTANT: some ATB ops do not support bf16 on 310P
    disable_custom_all_reduce=True,
    trust_remote_code=True,
    tensor_parallel_size=2,
    compilation_config={"custom_ops": ["none", "+rms_norm", "+rotary_embedding"]},
)

# Generate texts from the prompts.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```

Signed-off-by: Vincent Yuan <[email protected]>

A collaborator left a review comment on the new `communication_adaptation_310p()` helper:

> move patch func to vllm_ascend/patch module
