[SYCL-TLA] Integrate FlashAttention fwd/bwd kernels #2341
base: main
Conversation
EikanWang left a comment:
TBH, I cannot quite understand the detailed implementation. I need to take more time to understand the logic.
  file(GLOB xpu_cpp "xpu/*.cpp")
- file(GLOB xpu_native_cpp "native/xpu/*.cpp" "native/sparse/*.cpp" "native/sparse/xpu/*.cpp" "native/nested/*.cpp" "native/nested/xpu/*.cpp" "native/transformers/*.cpp" "native/quantized/*.cpp")
+ file(GLOB xpu_native_cpp "native/xpu/*.cpp" "native/sparse/*.cpp" "native/sparse/xpu/*.cpp" "native/nested/*.cpp" "native/nested/xpu/*.cpp" "native/transformers/*.cpp" "native/quantized/*.cpp" "native/transformers/xpu/flash_attn/*.cpp")
Nit: I think we should also install the header files under flash_attn into PyTorch, as is done at line 42.
May I know what the purpose of installing the header files is?
It gives users a chance to use them in a C++ extension.
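For context on what "use them in a C++ extension" could mean in practice, here is a minimal, hypothetical sketch of an out-of-tree extension that would consume the installed headers. The commented-out include path and the sycltla entry point are assumptions for illustration only, not the actual header or API added by this PR; the call to at::scaled_dot_product_attention is just a stand-in so the sketch compiles.

```cpp
// Hypothetical out-of-tree C++ extension (illustrative sketch only).
#include <torch/extension.h>
// Assumed install location of the flash_attn headers; the real path/name may differ:
// #include <ATen/native/transformers/xpu/flash_attn/flash_api.h>

at::Tensor flash_fwd_wrapper(const at::Tensor& q,
                             const at::Tensor& k,
                             const at::Tensor& v) {
  // With the headers installed, an extension could call the kernel directly, e.g.
  //   return sycltla::flash_attention_forward(q, k, v, /*...*/);  // hypothetical signature
  // Stand-in so this sketch compiles: dispatch through the public SDPA entry point.
  return at::scaled_dot_product_attention(q, k, v);
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("flash_fwd", &flash_fwd_wrapper, "FlashAttention forward wrapper (sketch)");
}
```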
I see.
Done.
@guangyey, I think PyTorch does not expose flash_attn because it is the underlying logic of SDPA, which is exposed as a backend. Meanwhile, I don't believe users would invoke PyTorch's flash_attn directly, because Dao-AILab/flash-attention is the better choice.
Meanwhile, the namespace of these functions is sycltla. It is weird to let users invoke sycl-tla-specific functions.
The branch was force-pushed from 770035a to 442c445.
    out = at::empty({batch_size, numhead_qo, seqlen_qo, headsize_vo}, opts);
  } else if (layout == ATTN_TENSOR_LAYOUT::BSHD) {
    out = at::empty({batch_size, seqlen_qo, numhead_qo, headsize_vo}, opts)
              .permute({0, 2, 1, 3});
Why do we need to permute here?
The output is initialized as BSHD-contiguous, but SDPA expects the shape to be BHSD, so the seqlen and numhead dimensions need to be permuted.
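To make the layout point concrete, here is a small standalone ATen snippet (an illustrative sketch, not code from this PR) showing that the permute only changes the logical view: the sizes become BHSD while the strides still describe the BSHD-contiguous storage, with no data copy.

```cpp
#include <ATen/ATen.h>
#include <iostream>

int main() {
  const int64_t B = 2, S = 8, H = 4, D = 16;  // batch, seqlen, numhead, headsize
  // Allocate BSHD-contiguous storage, then view it as BHSD via permute (no copy).
  at::Tensor out = at::empty({B, S, H, D}, at::kFloat).permute({0, 2, 1, 3});
  std::cout << out.sizes() << "\n";          // [2, 4, 8, 16] -> logical BHSD shape
  std::cout << out.strides() << "\n";        // [512, 16, 64, 1] -> strides of the BSHD storage
  std::cout << out.is_contiguous() << "\n";  // 0: not contiguous in the BHSD order
  return 0;
}
```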
This PR moves the sycltla kernels from pytorch/pytorch#167056 into torch-xpu-ops.
This PR is based on #2030; once that build PR merges, I will rebase this PR.