
llama cpp build update to correct HIP backend #206

Open
karthikbabuks opened this issue Jan 31, 2025 · 6 comments

@karthikbabuks

llama_cpp has changed its HIP build flag from "-DGGML_HIPBLAS=1" to "-DGGML_HIP=ON". With the old flag, llama_cpp ends up built with the CPU backend only and does not leverage the ROCm GPU backend. The build info for llama_cpp needs to be updated accordingly.
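For reference, a minimal configure/build sketch with the new flag might look like the lines below; the HIPCXX path and the gfx1030 target are assumptions and need to be adjusted to the local ROCm install and GPU:

HIPCXX=/opt/rocm/llvm/bin/clang++ cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release -- -j$(nproc)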

@lamikr
Owner

lamikr commented Jan 31, 2025

Thanks for noticing this. I am fixing it now and also updating to the latest llama.cpp while I am at it.
I was planning to try llama-cli over the weekend with DeepSeek R1.

Another thing that would probably be good to do is to check whether the selected GPU list contains only APUs and no discrete GPUs.
In that case the llama.cpp docs recommend enabling -DGGML_HIP_UMA=ON.

lamikr closed this as completed in 1c1f4af Jan 31, 2025
@lamikr
Owner

lamikr commented Jan 31, 2025

Should be fixed now; I checked that the code now takes the HIP-specific if blocks in the CUDA files.
Can you verify?
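A quick verification sketch, assuming llama-cli is on the PATH; the exact log wording is an assumption, the idea is just to confirm HIP/ROCm libraries are linked and a ROCm device shows up when layers are offloaded:

ldd "$(command -v llama-cli)" | grep -iE 'hip|roc'                 # should list HIP/ROCm shared libraries
llama-cli -m model.gguf -ngl 99 -p "hello" 2>&1 | grep -i rocm     # look for ROCm device lines in the startup log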

lamikr reopened this Jan 31, 2025
@karthikbabuks
Author

Yes, it is fine now, thanks. Regarding UMA, yes, it would definitely be good to do that too; let me know and I can take a crack at it. Again, this is an awesome builder, many thanks. I have two other suggestions which could be features in their own right:

  • Just an env addition to support shared-lib linking of llama_cpp with llama_cpp_python (see the sketch after this list):
    export LLAMA_CPP_LIB_PATH=${ROCM_HOME}/lib64

  • Support https://github.com/instructlab as one of the extras. It is more or less there already, and it should be easy to get it added. Let me know your thoughts. This issue can be closed.
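A minimal sketch of the env addition from the first point, assuming the ROCm-built libllama.so lands under ${ROCM_HOME}/lib64; LLAMA_CPP_LIB_PATH is the variable proposed here, not an upstream default:

export ROCM_HOME=/opt/rocm                       # assumed install prefix, adjust to your setup
export LLAMA_CPP_LIB_PATH=${ROCM_HOME}/lib64
ls "${LLAMA_CPP_LIB_PATH}"/libllama.so           # sanity check that the shared library is present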

@lamikr
Owner

lamikr commented Jan 31, 2025

  • LLAMA_CPP_LIB_PATH
    I have not played with llama_cpp_python myself yet. Should I put LLAMA_CPP_LIB_PATH into env_rocm.sh so that it is enabled at runtime, and/or into binfo/envsetup.sh, which babs.sh uses, so that it is enabled for all apps at build time?

  • instructlab
    I can take a look at that.

  • UMA support
    If you have time to check that, it would be nice. As a pointer, I have implemented some kind of GPU-filter code in
    038_aotriton.binfo, since at some point it was only possible to build it for a couple of GPUs. Maybe that could be of some help in checking whether to set the -DGGML_HIP_UMA flag or not (see the sketch below the APU list).

I think the current APU GPU list is: gfx1135, gfx1036, gfx1103, gfx1150 and gfx1151.
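A hypothetical sketch of that check, not actual babs.sh/binfo code; APU_LIST and SELECTED_GPU_TARGETS are assumed names, and the APU list is the guess above:

APU_LIST="gfx1135 gfx1036 gfx1103 gfx1150 gfx1151"   # APU targets as listed above, adjust as needed
SELECTED_GPU_TARGETS="gfx1103 gfx1151"               # example selection
UMA_FLAG=""
all_apus=1
for gpu in ${SELECTED_GPU_TARGETS}; do
    case " ${APU_LIST} " in
        *" ${gpu} "*) ;;                             # target is an APU, keep checking
        *) all_apus=0; break ;;                      # found a discrete GPU, skip UMA
    esac
done
[ "${all_apus}" = "1" ] && UMA_FLAG="-DGGML_HIP_UMA=ON"
echo "extra cmake flag: ${UMA_FLAG}"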

  • Do you have any opinion on ollama? I have not tried to integrate it into the extras.

@lamikr
Owner

lamikr commented Feb 3, 2025

@karthikbabuks I integrated llama-cpp-python; you should be able to get it now with:

./babs.sh -up
./babs.sh -b binfo/extra/ai_tools.blist
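After that, a quick sanity check (a sketch; assumes the Python environment set up by the builder is active and the module is installed as llama_cpp):

python -c "import llama_cpp; print(llama_cpp.__version__)"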

I created a separate issue for instructlab with manual build instructions in #207.

@karthikbabuks
Author

karthikbabuks commented Feb 3, 2025

@lamikr, thank you very much. I will check on this. LLAMA_CPP_LIB_PATH is used only at runtime and is not required during the build. Thanks for considering instructlab; let me also check it and share my experience there. For UMA, I will look into that too.

Regarding Ollama, I have only used it briefly, but I know quite a few people who use it. It is easy to get up and running, so at the moment I would say it is good to have but not a priority unless someone is looking for it.
