
sycl: Fix and disable more configurations of mul_mat #15151


Open
wants to merge 1 commit into
base: master


Rbiessy
Collaborator

@Rbiessy Rbiessy commented Aug 7, 2025

Follow-up to PR #15092.

The previous PR did not fix all of the affected configurations.
This PR fixes one case with oneDNN but disables another (type_a=f16,type_b=f32,m=1056,n=1,k=129,bs=[8,3],nr=[4,1],per=[0,2,1,3],v=0). It should be possible to support this case with the current oneDNN logic, but I have not been able to find a solution in a reasonable amount of time. I don't expect to be able to work on it myself, so feel free to continue this fix.

This should be enough to make the SYCL CI green again. The PR also fixes some compilation warnings.
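For readers unfamiliar with the test parameter string, here is a minimal sketch of how the disabled configuration could map to tensor shapes. It assumes the usual ggml test conventions (`bs` are the batch dimensions of the first operand, `nr` are the broadcast repeat factors applied to the second operand, and the first shape entry is the contiguous dimension); `MulMatCase`, `shape_a`, `shape_b` and `shape_out` are hypothetical names for illustration, and the `per` permutation is deliberately left out.

```cpp
#include <array>
#include <cstdint>

// Hypothetical mirror of the parameters printed in the test name.
struct MulMatCase {
    std::int64_t m, n, k;
    std::array<std::int64_t, 2> bs; // batch sizes of operand A (dims 2 and 3)
    std::array<std::int64_t, 2> nr; // repeat factors: B has bs * nr batches
};

using Shape = std::array<std::int64_t, 4>;

// Operand A: k values per row, m rows, bs batches (assumed layout).
Shape shape_a(const MulMatCase & c) {
    return {c.k, c.m, c.bs[0], c.bs[1]};
}

// Operand B: same k, n columns, batches multiplied by the repeat factors,
// so A is broadcast across nr[0] x nr[1] batches of B.
Shape shape_b(const MulMatCase & c) {
    return {c.k, c.n, c.bs[0] * c.nr[0], c.bs[1] * c.nr[1]};
}

// Result: m x n per batch, with the batch dimensions of B.
Shape shape_out(const MulMatCase & c) {
    return {c.m, c.n, c.bs[0] * c.nr[0], c.bs[1] * c.nr[1]};
}
```

Under these assumptions, the disabled case `m=1056,n=1,k=129,bs=[8,3],nr=[4,1]` multiplies an A of shape [129, 1056, 8, 3] with a B of shape [129, 1, 32, 3], so each batch of A is reused for four batches of B.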

@Rbiessy Rbiessy requested review from Alcpz and qnixsynapse August 7, 2025 13:38
@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels Aug 7, 2025
@qnixsynapse
Collaborator

Still failing this case:

[MUL_MAT] NMSE = 0.636762990 > 0.000500000   MUL_MAT(type_a=f16,type_b=f32,m=1057,n=1,k=129,bs=[1,3],nr=[4,1],per=[0,2,1,3],v=0): FAIL
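To put the reported number in context, the following is a rough sketch of a normalized mean squared error check. It assumes the common definition (squared error divided by the squared magnitude of the reference output) and takes the 0.0005 threshold from the log line above; `nmse` and `within_tolerance` are illustrative names, not necessarily the test suite's own functions.

```cpp
#include <cstddef>
#include <vector>

// Normalized mean squared error of `actual` against `reference`:
//   sum_i (reference[i] - actual[i])^2 / sum_i reference[i]^2
// Assumes both vectors have the same length and `reference` is not all zeros.
double nmse(const std::vector<float> & reference, const std::vector<float> & actual) {
    double err_energy = 0.0;
    double ref_energy = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const double r = reference[i];
        const double d = r - static_cast<double>(actual[i]);
        err_energy += d * d;
        ref_energy += r * r;
    }
    return err_energy / ref_energy;
}

// Hypothetical helper: the default threshold is the one shown in the log.
bool within_tolerance(double value, double max_nmse = 5e-4) {
    return value <= max_nmse;
}
```

With this definition, a value near 0.64 means the error carries roughly 64% of the reference output's energy. That is far more than accumulated f16 rounding would normally produce, so it is more consistent with wrong data being read or written (for example a stride or batch-indexing problem) than with a precision issue, although the log alone cannot confirm that.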

@Rbiessy
Collaborator Author

Rbiessy commented Aug 7, 2025

That's weird, I can definitely see this test passing on a Lunar Lake iGPU and on PVC. Would you be able to take a closer look on your side?

@qnixsynapse
Collaborator

Yeah double checked. Failing exactly here:

[MUL_MAT] NMSE = 0.667358089 > 0.000500000   MUL_MAT(type_a=f16,type_b=f32,m=1057,n=1,k=129,bs=[1,3],nr=[4,1],per=[0,2,1,3],v=0): FAIL

Unfortunately, I do not have enough time to debug this. And thank you for taking a look into it.

@Rbiessy
Collaborator Author

Rbiessy commented Aug 8, 2025

Are you using oneDNN? The other path may still be broken. The next step should be to remove oneMKL and oneMath, so if that is why it is failing for you, we may want to start disabling more configurations when oneDNN is not used.

@qnixsynapse
Collaborator

Yes, it's using oneDNN, which is the default.
