[Examples] [Xe] Improve performance for some upconversion cases in xe_gemm #605

petercad · 2025-11-03T18:08:23Z

Using smaller B loads reduces the amount of upconversion/reorder work, which can improve performance when that work is expensive.

This PR updates the xe_gemm example to illustrate this, using 64x32 subgroup tiles instead of 32x64 subgroup tiles for such cases.
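As a rough illustration of why the tile shape matters, the sketch below counts how many B elements a subgroup would have to upconvert per multiply-accumulate. It assumes the tile sizes are written as M x N, that each subgroup converts its own K x N slice of B once per k-step, and that `tile_k` is an arbitrary illustrative depth. None of these names come from the xe_gemm source; they are hypothetical.

```python
def b_upconversions_per_mac(tile_m: int, tile_n: int, tile_k: int) -> float:
    """B elements upconverted per multiply-accumulate, for one subgroup tile.

    Assumption: the subgroup converts one K x N slice of B per k-step and
    reuses it across all tile_m rows of its A tile.
    """
    b_elements = tile_k * tile_n       # B elements loaded and converted per k-step
    macs = tile_m * tile_n * tile_k    # multiply-accumulates done with that slice
    return b_elements / macs           # algebraically equal to 1 / tile_m


if __name__ == "__main__":
    # Compare the two subgroup tile shapes mentioned in the PR description.
    for tile_m, tile_n in [(32, 64), (64, 32)]:
        ratio = b_upconversions_per_mac(tile_m, tile_n, tile_k=32)
        print(f"{tile_m}x{tile_n}: {ratio:.6f} B conversions per MAC")
```

Under these assumptions, the 64x32 tile converts half as many B elements per unit of compute as the 32x64 tile, which is consistent with the motivation stated above. The actual kernel may distribute the conversion work differently.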

Antonyvance · 2025-11-04T20:29:08Z

@petercad Can you quote the observed performance improvement as well?

petercad · 2025-11-05T00:46:04Z

> @petercad Can you quote the observed performance improvement as well?

Here are some improved cases (BMG 160EU @ 2.85GHz, m = 2560, n = k = 4096):

| data types | layouts | TF/s before | TF/s after |
| --- | --- | --- | --- |
| u4 x u4 | RxR | 207 | 274 |
| f16 x e4m3 | RxR | 70.5 | 94.5 |
| f16 x u8 | RxR | 96.5 | 102.3 |
| bf16 x s4 | RxC | 98.8 | 101 |
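For readers who want relative numbers, the following sketch derives speedup ratios from the figures quoted above. The throughput values are copied from the table. The `gemm_flops` helper assumes the usual convention of counting 2·m·n·k floating-point operations per GEMM, which the PR does not state explicitly.

```python
# (TF/s before, TF/s after), copied from the table in this comment.
RESULTS = {
    "u4 x u4": (207.0, 274.0),
    "f16 x e4m3": (70.5, 94.5),
    "f16 x u8": (96.5, 102.3),
    "bf16 x s4": (98.8, 101.0),
}


def speedup(before: float, after: float) -> float:
    """Ratio of throughput after the change to throughput before it."""
    return after / before


def gemm_flops(m: int, n: int, k: int) -> int:
    """FLOP count for one GEMM, assuming the common 2*m*n*k convention."""
    return 2 * m * n * k


speedups = {case: speedup(b, a) for case, (b, a) in RESULTS.items()}

if __name__ == "__main__":
    for case, ratio in speedups.items():
        print(f"{case}: {ratio:.2f}x")
    print(f"FLOPs per GEMM (m=2560, n=k=4096): {gemm_flops(2560, 4096, 4096)}")
```

By this calculation the gains range from roughly 2% (bf16 x s4) to roughly 34% (f16 x e4m3).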

tdeng5 approved these changes Nov 4, 2025

tdeng5 added the release label Nov 4, 2025

petercad force-pushed the petercad/xe_gemm_4x8 branch from 3e2c654 to 2d46ba1 Compare November 5, 2025 00:40

[Examples] [Xe] Improve performance for some upconversion cases in xe_gemm

27993fa

petercad force-pushed the petercad/xe_gemm_4x8 branch from 2d46ba1 to 27993fa Compare November 5, 2025 00:49

Antonyvance approved these changes Nov 5, 2025
