Describe the bug
Hi!
I tried to run an LLM locally using `openllm`, and `phi3:3.8b-ggml-q4` happens to be the only model that I am able to run locally according to OpenLLM, so I ran `openllm run phi3:3.8b-ggml-q4`, which failed (logs are attached).

The failure happens during a CMake build step, which is unable to find `FOUNDATION_LIBRARY` (line 116 in the logs). I looked into what this library provides (it is Apple's Foundation framework, which the Metal backend links against) and ended up in ggerganov/llama.cpp.

However, I assume `GGML_METAL` should not be set, as I'm running on a Linux system (well, Windows 10 + WSL2) with an NVIDIA GPU. My attempts at disabling the Metal build were not successful; the command failed with the same error as before.
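For reference, this is roughly the kind of thing I tried. It assumes the failing build is `llama-cpp-python` being compiled from source and that it forwards `CMAKE_ARGS` to llama.cpp's CMake configure step; I have not confirmed that this is the code path OpenLLM actually takes:

```sh
# Assumption: the failing cmake build is llama-cpp-python compiling llama.cpp,
# which honors CMAKE_ARGS during a source build.
# Force the Metal backend off, rebuild, then retry the model:
CMAKE_ARGS="-DGGML_METAL=OFF" pip install --force-reinstall --no-cache-dir llama-cpp-python
openllm run phi3:3.8b-ggml-q4
```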
On the same machine, but on Windows, `openllm run phi3:3.8b-ggml-q4` fails on the same CMake build with the same error. Curiously, `openllm hello` does not recognize this model as locally runnable there, as it did on WSL, but I did not investigate why.

Please let me know if I should raise this issue in ggerganov/llama.cpp instead.
Thanks!
To reproduce
openllm run phi3:3.8b-ggml-q4
Logs
gist
Environment
bentoml env:
transformers-cli env:

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.

Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.

- `transformers` version: 4.44.0
- Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.5
- Safetensors version: 0.4.4
- Accelerate version: not installed
- Accelerate config: not found
- PyTorch version (GPU?): not installed (NA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: <fill in>
openllm -v:
System information (Optional)
memory: 32GB
platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
architecture: x86-64
cpu: Intel Core i7-11850H @ 2.50 GHz
gpu: NVIDIA RTX A2000 Laptop GPU 4GB
Hi. Thanks for contributing. For now, llama.cpp models in OpenLLM can only be deployed on macOS; there are some hard-coded, platform-specific parameters. I suggest using vLLM on Linux, and even on WSL2, since it is faster and more mature for production use.

We are adding a new feature to make it easier to tweak parameters. Maybe you can help us find a configuration that runs vLLM models on a 4 GB GPU once it is released.
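For what it's worth, a rough sketch of the knobs that usually matter when squeezing a model under a small VRAM budget, using vLLM's standard server arguments. These are plain vLLM flags, not OpenLLM-specific options, and the model id and values below are illustrative rather than a verified 4 GB configuration:

```sh
# Illustrative only: standard vLLM server flags that control memory footprint.
# The model id and values are placeholders, not a tested 4 GB setup.
python -m vllm.entrypoints.openai.api_server \
  --model microsoft/Phi-3-mini-4k-instruct \
  --max-model-len 2048 \
  --gpu-memory-utilization 0.90 \
  --dtype half \
  --enforce-eager
```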