Skip to content

Conversation

IMbackK
Copy link
Collaborator

@IMbackK IMbackK commented Sep 28, 2025

No description provided.

@IMbackK IMbackK requested a review from CISC as a code owner September 28, 2025 13:22
@github-actions github-actions bot added the devops improvements to build systems and github actions label Sep 28, 2025
@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

We require rocwmma to be properly installed for #16221 see #16221 (comment)

Unfortunately i dont have a windows environment to work in, and have been having a hard time getting the rocwmma buildsystem to work on the windows ci, see cb59db2 for the attempt.
For this reason i have disabled rocwmma on the windows builds for now.

@slaren
Copy link
Member

slaren commented Sep 28, 2025

rocwmma is also used in the binary releases in release.yml and rocm.Dockerfile. However, if I understand correctly, this would reduce performance, and is only necessary to workaround an issue with ROCM 7, which is still in preview, and is not used in the binary releases either.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

rocm 7.0 has been released. disabeling rocwmma on windows will reduce performance yes, but the way rocwmma was used on windows previously was just plain wrong, you cant use the library from its source directory without installing it, even though it is header implemented.

@slaren
Copy link
Member

slaren commented Sep 28, 2025

Ok, so I guess the question for the binary releases is whether using ROCM 7 or rocwmma is more important.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

ideally someone with a windows machine would come and fix the install - but as i say i dont have such a machine.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

Ok, so I guess the question for the binary releases is whether using ROCM 7 or rocwmma is more important.

the changes in the pr #16221 make llamacpp build against rocm 7 at all, and break the windows ci (beacuse the rocwmma installation there is incomplete). So this is not just about the binary releases.

@slaren
Copy link
Member

slaren commented Sep 28, 2025

Ok, but we still cannot break or degrade the performance of the windows binaries just for the sake of supporting ROCM 7. This needs to be fixed.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

I find this a bit difficult i explicitly merged rocwmma support with it disabled by default, which it still is, as i knew that supporting it across architectures and os's is quite difficult as its very unstable - which has proven true often, with rdna4 being broken on rocwmma < 2 and > 1.4 and CDNA being broken on rocwmma 2.0 and rocwmma having no official windows support at all with it being excluded from windows distributions of rocm.

Others then decided to enable rocwmma on ci, i would not have done so.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

I will give it another try, but ultimately this should not block #16221

@slaren
Copy link
Member

slaren commented Sep 28, 2025

I have been giving it a try, but it seems more trouble than it is worth. Considering that this is a header-only library and the only thing that the cmake script does (for us) is generate rocwmma-version.hpp, maybe there is some simpler workaround to determine the version.

The rocmma version seems to be solely stored in CMakeLists.txt, so parsing this file as such should do it:

$env:ROCWMMA_VERSION = if ((Get-Content -Path "CMakeLists.txt" -Raw) -match 'set \( VERSION_STRING\s+"?([0-9.]+)[^0-9.]+') { $matches[1] }

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 28, 2025

I have been giving it a try, but it seems more trouble than it is worth. Considering that this is a header-only library and the only thing that the cmake script does (for us) is generate rocwmma-version.hpp, maybe there is some simpler workaround to determine the version.

The rocmma version seems to be solely stored in CMakeLists.txt, so parsing this file as such should do it:

$env:ROCWMMA_VERSION = if ((Get-Content -Path "CMakeLists.txt" -Raw) -match 'set \( VERSION_STRING\s+"?([0-9.]+)[^0-9.]+') { $matches[1] }

I dont really like this hack much at all, even though this solves the intimidate problem, ultimately what we are doing here on ci is just plain wrong and there is no guarantee that problem remains confined to the missing rocwmma-version.hpp generation as rocwmma evolves.

IMO we should just accept that rocwmma is not supported on windows by amd atm, disable it for now and then petition amd to support rocwmma on windows and add it to the rocm distribution.

@slaren
Copy link
Member

slaren commented Sep 28, 2025

IMO we cannot make suboptimal releases for Windows just because it is not convenient, because otherwise the result will be that people will be forced to look for the releases elsewhere. It would fracture the community and damage the reputation of llama.cpp. I don't think it is too concerning to add this hack to the HIP backend, and in any case my understanding is that @JohannesGaessler plans to replace WMMA entirely with mma, so at worst it would be temporary.

@JohannesGaessler
Copy link
Collaborator

JohannesGaessler commented Sep 28, 2025

I have already purchased and installed an RX 9060 XT, I've also ordered a MI100 that is set to arrive in a few weeks. I intend to add support for the AMD "WMMA" instructions to the kernel in fattn-mma-f16.cuh. To be clear, the naming is confusing since AMD has "WMMA" instructions and NVIDIA has a "WMMA" interface (which is bad and I therefore want to phase it out). The end result will likely be that rocWMMA (which enables the use of NVIDIA's WMMA interface on AMD) will be obsolete on RDNA.

On CDNA (MI100) there are "MFMA" instructions which work a bit differently but the end result should be the same as with the AMD WMMA instructions.

So yes, if you give me a few weeks I will likely be able to make it so that rocWMMA can be removed entirely. The WMMA kernel will still be needed for V100s, I'm in the process of purchasing one of those for development. I don't think it would make sense to maintain the WMMA kernel only for MUSA.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 30, 2025

@JohannesGaessler neat that you also acquired a mi100, i look forward to collaborating on that :)

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 30, 2025

@slaren Ok how about this (see latest change): lets just unpack the ubuntu .deb pacakge on windows, that way we get a pre-built rocwmma with correct version header on the windows ci. Thats still a hack, but at least this way the hack is contained to just the ci.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 30, 2025

I will still petition amd to support rocwmma on windows

@slaren
Copy link
Member

slaren commented Sep 30, 2025

The ubuntu packages are not likely to be updated. The reason we are using the develop branch in the releases is that it is necessary to support strix halo, as none of the tagged versions support it yet (or at least that was true a couple of weeks ago).

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 30, 2025

@slaren that is incorrect, the version previously used on windows on ci was git clone https://github.com/rocm/rocwmma --branch rocm-${{ env.ROCM_VERSION }} with env.ROCM_VERSION == 6.4.2 this version of rocwmma did not support strix halo nor gfx12.
The debian pacakge is exactly equvialent in version.

We can use rocwmma from rocm 7.0 which is released and which dose include support for strix halo and gfx12, but which is broken on cdna.

@slaren
Copy link
Member

slaren commented Sep 30, 2025

That is the CI, but the releases are built separately in release.yml.

@slaren
Copy link
Member

slaren commented Sep 30, 2025

We can use rocwmma from rocm 7.0

If that's available for Windows that could work, but I don't see it in https://www.amd.com/en/developer/resources/rocm-hub/hip-sdk.html

@IMbackK
Copy link
Collaborator Author

IMbackK commented Sep 30, 2025

no version of rocwmma is available for windows its not supported on windows at all, however we can use the ubuntu 7.0 package.
Since we dont build the release for cdna this will work.

on windows we now windows install rocwmma from ubuntu pacakges
@IMbackK IMbackK requested a review from slaren as a code owner October 1, 2025 09:20
@IMbackK
Copy link
Collaborator Author

IMbackK commented Oct 1, 2025

@slaren i believe its in a state ready to be merged now.
Im unsure if we have a method to test the release workflow - sofar this has not been done.

@CISC
Copy link
Collaborator

CISC commented Oct 1, 2025

@slaren i believe its in a state ready to be merged now. Im unsure if we have a method to test the release workflow - sofar this has not been done.

You have to test on your own fork's master branch.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Oct 1, 2025

That worked, so The releas workflow is also tested now.

@slaren
Copy link
Member

slaren commented Oct 1, 2025

rocwmma is also used in the docker container

RUN git clone https://github.com/rocm/rocwmma --branch develop --depth 1

@IMbackK
Copy link
Collaborator Author

IMbackK commented Oct 1, 2025

OK, this should be sovable by simply updateing the rocm we build the docker image against to Version 7.0.

In the future, if wie are completely unwilling to accept regressions in released versions we can not fly this close to the sun.

You where building for an archtecture not supported by the rocm version used (6.4+gfx1152) using the development branch of a notoriously unstable library (not even pinned to a commit) that is fairly tightly coupled to the underlieing platform on an os that this library is not supported on.

That's just way to close to the sun to then insist on stability.

@slaren
Copy link
Member

slaren commented Oct 1, 2025

Did you mean gfx1151? As far as I can tell, that is officially supported by ROCm on Windows.

Generally, it is the responsibility of whoever wants to merge a change to fix anything broken by that change. I know it can be frustrating to have to fix things that you don't care about, but we can't just leave things knowingly in a broken state.

@IMbackK
Copy link
Collaborator Author

IMbackK commented Oct 1, 2025

it is not for the linux docker build where we also build for it.

Its not about fixing things that i don't care about, its about maintaining unsupported combinations of dependencies. On cuda this would not even be a question. If cuda dosent support X on Y we would not go around building development branches of things (since its closed source) to make it work. While we can do that sort of thing with rocm we absolutely should not if the goal is being stable and regression free.

Its also about the understanding that the rocwmma code be merged under the condition that i accept maintaining it which i in turn i accepted only with it disabled by default and then others going around that decision by enabeling it on CI. Which is fine as long as we keep the understanding that rocwmma working or not shal not be a release blocker.

@IMbackK IMbackK requested a review from ngxson as a code owner October 1, 2025 17:36
@IMbackK
Copy link
Collaborator Author

IMbackK commented Oct 1, 2025

The docker build works fine when built against 7.0. Note that i disabled the cdna targets for now as we have a chicken and egg problem here with the other pr. After the other pr gets merged i will open another pr to re-enable the cdna targets.

@slaren slaren merged commit 1fe4e38 into ggml-org:master Oct 1, 2025
54 of 57 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
devops improvements to build systems and github actions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants