-
Notifications
You must be signed in to change notification settings - Fork 364
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime detection MacOS #258
Conversation
SignalRT
commented
Nov 6, 2023
•
edited
Loading
edited
- Add MacOS x86_64 in the build.
- Copy different runtimes to the output folder
- Dynamically load the right MacOS library
- Tested on MacOS Arm64
- Tested on MacOS x86_64
Create different paths to different MacOS platforms. Dynamically load the right library
…harp into RuntimeDetection
I guess the https://github.com/SciSharp/LLamaSharp/blob/6334f25627d4a5903e14345fb50fd4cf03389cef/LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec# probably need updating to reference this new binary? Also, does the MacMetal nuspec still need to exist? I think we dropped that as a specific backend a while ago? |
@martindevans, this is related with Issue #251 I reorganize the binaries in directories (only the MacOS binaries), Change the dynamic detection, change the build to include Intel binaries, update MacOS binaries to make a test in my branch (https://github.com/SignalRT/LLamaSharp/actions/runs/6777594696) where the Mac test on Intel platform are included. There is no work on Windows / Linux binaries. |
What I was meaning about the nuspec is at the moment it does this: <file src="runtimes/libllama.dylib" target="runtimes\osx-x64\native\libllama.dylib" />
<file src="runtimes/ggml-metal.metal" target="runtimes\osx-x64\native\ggml-metal.metal" />
<file src="runtimes/libllama.dylib" target="runtimes\osx-arm64\native\libllama.dylib" />
<file src="runtimes/ggml-metal.metal" target="runtimes\osx-arm64\native\ggml-metal.metal" /> i.e. the same files on ARM64 and x64. Should that be changed to something like this: <file src="runtimes/macos-x86_64/libllama.dylib" target="runtimes\macos-x86_64\native\libllama.dylib" />
<file src="runtimes/macos-arm64/libllama.dylib" target="runtimes\macos-arm64\native\libllama.dylib" />
<file src="runtimes/macos-arm64/ggml-metal.metal" target="runtimes\macos-arm64\native\ggml-metal.metal" /> Note: I haven't worked with nuspec before, so I'm not sure that's correct! |
@martindevans, @AsakusaRinne I did not change any nuget package because there are several possible directions with these packages. With the dynamic library load it could be only "one" nuget package, as the most extreme solution, a nuget package by OS version, etc, etc. Talking about the Mac use case only, to limit the options, the intel build only supports memory, the arm build supports with the same binary memory and GPU with the configuration of GPU layers so it could have sense to introduce all this package in only ONE nuget package. But this package is not CPU only and not METAL. It could be just MacOS package, it will use the best possible option. Would we keep the current packages?, Would we think about another nuget package distribution (by OS for example)?. Once I have clear if you want to keep the current nuget packages or to group the packages with another layout I can make the changes. |
@SignalRT Thank you a lot for this great work! Each of these options has its own advantages and disadvantages, and we need to make a choice among them. In my opinion, the following options are ranked from high to low priority:
However, my view me be limited. Therefore I explain the reasons of mine and I'm open for suggestions and different voices. The binaries are not of large size, the sum of all binaries is less than 20MB, so that it won't be a burden in most of the cases. Besides, the There's an even more aggressive but not bad option: keep all binaries along with our main library in Splitting by OS is a good compromise. To be honest I'm quite hesitant with the first two options. The risk of it comes from user -- I think the gap of changing package by computation ability to package by OS is larger than that of changing to one package. The last one is related with the risk of option 1 and 2. Changing the package distribution has so many effects, especially on other libraries depending on us, that I'm thinking if we should update the major version to cc @martindevans @saddam213 @xbotter @Oceania2018 @sagilio I'd also like to hear from you, please. |
One option we could consider would be the extreme version SignalRT mentioned where each package has exactly one DLL (e.g. My concern with this approach is some users might try to isntall various different specific packages to debug DLL dependency issues and end up with a bit of a mess. For this immediate release I would vote for keeping it simple - add the MacOS binaries for both architectures into the |
@martindevans When I talk about extreme solution is to have only ONE package (the option 1 that refers @AsakusaRinne ). It's OK for me to change my changes in the Backend.CPU package. In that case I think that the Metal package should dissapear. I will try to make the changes today. |
Oh sorry, I misunderstood what you meant. When I was talking about "extreme" I meant we would have:
So we would be shipping a lot of packages. Users would have the ability to pick one exactly specific backend if they want (by just installing one single package). But they could also just install the I'm not actually sure I if I like this option to be honest, I'm just presenting it as another option to consider! |
Delete Backend.Metal because is not needed anymore. Do not include .metal in x86_64 binaries
…harp into RuntimeDetection
Eliminated Backend.Metal |
I'll agree with that if the number of packages is not so large. However, as you can see, the combinations could reach 20, which may confuse the users who are not familiar with this scope. Besides, if there're other options added to llama.cpp in the future, we'll have to release new packages, which seems to be inconvenience. Anyway, I really thanks for your suggestion. @SignalRT @martindevans Can the feature detection select the best package now? I think stacking binaries in one package is the best if it can. Otherwise we should keep the current distribution. But I'm still open for other voices. |
We can put everything in one package except CUDA, we don't currently have a runtime way to detect that. In theory it could be done in the future, and then we could truly have just one package with all backends. |
@martindevans @SignalRT I think I found a method to detect cuda and avx support. I've pushed my changes in #268, please take a review of it. Since it's already too late for me, it's only a template now and I haven't verified if it works. If it does work, we could have one more choice which is to cherry-pick my commit and support feature detection based on it. The file structure I listed in that PR is only for experiment, please feel easy to change it. BTW, whether we detect cuda and avx or not, we should consider how to provide a convenient way for user to use LLamaSharp with self-compiled DLL. I haven't had a good idea yet. Do you have any idea about it? |
Reverted #268 |
What's left to do before we merge this PR? |
@martindevans To test the nuget package. I will do it only if there is no layout changes before release. If the plan is to release changes on nuget packages better wait to test the final configuration. |
It's up to @AsakusaRinne but I think the plan is to do a release as soon as this is merged? |
Yes, I think we'll not change the distribution of nuget packages and include #244 in the next release. It seems that we didn't reach an agreement and I'll open a vote after investigating the possibility of cuda feature detection. To be honest I'm also not sure about which approach the users like. 😶🌫️ |
@AsakusaRinne I think it is ok. I build the nuget package an test the CPU package on ARM. The test pass OK. I changed the prepare_release.sh script to be able to execute this on osx (there are some pending changes), I hope not to break nothing on linux execution but osx bash seems to have some incompatibilities. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, it's a good work. :) However we'll delay the next release for somewhile for the two reasons:
- I have to test on my fork to check if packages could be generated correctly by the workflow.
- Noticed that this PR removed MacMetal backend, which is a break change. Therefore we'll publish a minor release instead, and include feat: cuda feature detection. #275 if possible.
One more question, why does the MAC ci so slow 🤣 It won't block the merging of this PR though |
It's been waiting for an entire hour on this step!?
|
Not sure if it's because the limited resources of github ci cluster. But anyway, one hour waiting is too long. Is MAC server very rare? 😹 |
@AsakusaRinne ....Occam's razor.... It was a mismatch renaming macOS to osx. |