
llama cpp build update to correct HIP backend #206

Open
karthikbabuks opened this issue Jan 31, 2025 · 6 comments

@karthikbabuks

llama_cpp has changed its HIP build flag from "-DGGML_HIPBLAS=1" to "-DGGML_HIP=ON". With the old flag, llama_cpp ends up built with the CPU backend only and does not leverage the ROCm GPU backend. The build info for llama_cpp needs to be updated accordingly.
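For reference, a minimal configure/build sketch with the new flag might look like the lines below; the HIPCXX path and the gfx1030 target are assumptions and need to be adjusted to the local ROCm install and GPU:

HIPCXX=/opt/rocm/llvm/bin/clang++ cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -- -j$(nproc)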

@lamikr
Owner

lamikr commented Jan 31, 2025

Thanks for noticing this. I am fixing it now and also updating to the latest llama.cpp while I am at it.
I was planning to try llama-cli over the weekend with DeepSeek R1.

Another thing that would probably be good to do is to check whether the selected GPU list contains only APUs and no discrete GPUs.
In that case the llama.cpp docs recommend enabling -DGGML_HIP_UMA=ON.

lamikr closed this as completed in 1c1f4af Jan 31, 2025
@lamikr
Owner

lamikr commented Jan 31, 2025

Should be fixed now; I checked that the code now takes the HIP-specific if blocks in the CUDA files.
Can you verify?
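A quick verification sketch, assuming llama-cli is on the PATH; the exact log wording is an assumption, the idea is just to confirm HIP/ROCm libraries are linked and a ROCm device shows up when layers are offloaded:

ldd "$(command -v llama-cli)" | grep -iE 'hip|roc'                 # should list HIP/ROCm shared libraries
llama-cli -m model.gguf -ngl 99 -p "hello" 2>&1 | grep -i rocm     # look for ROCm device lines in the startup log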

lamikr reopened this Jan 31, 2025
@karthikbabuks
Author

Yes, it is fine now, thanks. Regarding UMA, yes, it would definitely be good to do that too; let me know and I can take a crack at it. Again, this is an awesome builder, many thanks. I have two other suggestions which could be features in their own right:

  • Just an env addition to support shared-lib linking of llama_cpp with llama_cpp_python (see the sketch after this list):
    export LLAMA_CPP_LIB_PATH=${ROCM_HOME}/lib64

  • Support https://github.com/instructlab as one of the extras. It is more or less there already, and it should be easy to get it added. Let me know your thoughts. This issue can be closed.
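A minimal sketch of the env addition from the first point, assuming the ROCm-built libllama.so lands under ${ROCM_HOME}/lib64; LLAMA_CPP_LIB_PATH is the variable proposed here, not an upstream default:

export ROCM_HOME=/opt/rocm                       # assumed install prefix, adjust to your setup
export LLAMA_CPP_LIB_PATH=${ROCM_HOME}/lib64
ls "${LLAMA_CPP_LIB_PATH}"/libllama.so           # sanity check that the shared library is present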

@lamikr
Owner

lamikr commented Jan 31, 2025

  • LLAMA_CPP_LIB_PATH
    I have not played with llama_cpp_python myself yet. Should I put LLAMA_CPP_LIB_PATH into env_rocm.sh so that it is enabled at runtime, and/or into binfo/envsetup.sh, which babs.sh uses, so that it is enabled for all apps at build time?

  • instructlab
    I can take a look at that.

  • UMA support
    If you have time to check that, it would be nice. As a pointer, I have implemented some kind of GPU-filter code in
    038_aotriton.binfo, since at some point it was only possible to build it for a couple of GPUs. Maybe that could be of some help in checking whether to set the -DGGML_HIP_UMA flag or not (see the sketch below the APU list).

I think the current APU GPU list is: gfx1135, gfx1036, gfx1103, gfx1150 and gfx1151.
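A hypothetical sketch of that check, not actual babs.sh/binfo code; APU_LIST and SELECTED_GPU_TARGETS are assumed names, and the APU list is the guess above:

APU_LIST="gfx1135 gfx1036 gfx1103 gfx1150 gfx1151"   # APU targets as listed above, adjust as needed
SELECTED_GPU_TARGETS="gfx1103 gfx1151"               # example selection
UMA_FLAG=""
all_apus=1
for gpu in ${SELECTED_GPU_TARGETS}; do
    case " ${APU_LIST} " in
        *" ${gpu} "*) ;;                             # target is an APU, keep checking
        *) all_apus=0; break ;;                      # found a discrete GPU, skip UMA
    esac
done
[ "${all_apus}" = "1" ] && UMA_FLAG="-DGGML_HIP_UMA=ON"
echo "extra cmake flag: ${UMA_FLAG}"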

  • Do you have any opinion on ollama? I have not tried to integrate it into the extras.

@lamikr
Owner

lamikr commented Feb 3, 2025

@karthikbabuks I integrated llama-cpp-python; you should be able to get it now with:

./babs.sh -up
./babs.sh -b binfo/extra/ai_tools.blist
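After that, a quick sanity check (a sketch; assumes the Python environment set up by the builder is active and the module is installed as llama_cpp):

python -c "import llama_cpp; print(llama_cpp.__version__)"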

I created a separate issue for instructlab with manual build instructions in #207.

@karthikbabuks
Author

karthikbabuks commented Feb 3, 2025

@lamikr, thank you very much. I will check on this. LLAMA_CPP_LIB_PATH is used only at runtime and is not required during the build. Thanks for considering instructlab; let me also check it and share my experience there. For UMA, I will look into that too.

Regarding Ollama, I have only used it briefly, but I know quite a few people who use it. It is easy to get up and running, so at the moment I would say it is good to have but not a priority unless someone is looking for it.
