llama.cpp for Rpi 4 inference

slabstech · Apr 5, 2024 · 48fd1a6
1 parent 0eecb41

Showing 2 changed files with 57 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -12,6 +12,8 @@ Usage of LLM for Everyday use
         - Setup + Documentation at [docs/2024/agent-code.md](https://github.com/slabstech/llm-recipes/blob/main/docs/2024/agent-code.md) 
         - Code examples at [src/autogen](https://github.com/slabstech/llm-recipes/tree/main/src/autogen)
         - Output from examples at [docs/2024/agent-example-output.md](https://github.com/slabstech/llm-recipes/blob/main/docs/2024/agent-example-output.md)
+    - llama.cpp + Raspi 4
+        - [Docs](https://github.com/slabstech/llm-recipes/blob/main/docs/llama-cpp.md) for setup of Raspi 4 inference. 
 - v0
     - ChatUI  : ollama + open-webui + mistral-7B + docker
         - Setup + Documentation at [docs/ollama-open-webui.md](https://github.com/slabstech/llm-recipes/blob/main/docs/ollama-open-webui.md)

diff --git a/docs/llama-cpp.md b/docs/llama-cpp.md
@@ -0,0 +1,55 @@
+Raspi Module
+
+
+Installation steps 
+- sudo apt update && sudo apt install git
+- sudo apt-get install git-lfs
+- git lfs install
+- mkdir piRun
+- cd piRun
+- python3 -m venv env
+- source env/bin/activate
+
+- python3 -m pip install torch numpy sentencepiece
+- sudo apt install g++ build-essential
+
+- wget https://github.com/ggerganov/llama.cpp/archive/refs/heads/gg/phi-2.zip
+- unzip phi-2.zip 
+- rm phi-2.zip
+
+
+- cd llama.cpp-gg-phi-2/
+- make 
+
+- mkdir phi-2-gguf/
+- pip install -U huggingface_hub
+- huggingface-cli download TheBloke/phi-2-GGUF --local-dir phi-2-gguf/  
+
+./main -m phi-2-gguf/phi-2.Q4_K_M.gguf -p "Question: Write a python function to print the first n numbers in the fibonacci series"
+
+
+
+
+Alternate steps: build the GGUF model from source
+
+
+- Download the model from https://huggingface.co/microsoft/phi-2
+- pip install -U huggingface_hub
+- huggingface-cli download microsoft/phi-2 --local-dir phi-2/
+
+
+- python convert-hf-to-gguf.py phi-2
+
+./main -m phi-2/ggml-model-f16.gguf -p "Question: Write a python function to print the first n numbers in the fibonacci series"
+
+Model deployment
+
+./main -m models/phi-2.Q4_0.gguf -p "Question: Write a python function to print the first n numbers in the fibonacci series"
+
+
+Reference
+- https://www.dfrobot.com/blog-13498.html
+
+- Docker on Raspi - https://docs.docker.com/engine/install/debian/
+- https://huggingface.co/TheBloke/phi-2-GGUF
+- https://ubuntu.com/blog/deploying-open-language-models-on-ubuntu
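
Note on docs/llama-cpp.md: the "Installation steps" list could also be collected into a single script. The following is a minimal sketch under stated assumptions, not part of this commit. The names `DRY_RUN`, `run`, and `setup_pi` are hypothetical, the `-y` and `-p` flags are added for non-interactive use, and the download is narrowed to the single `phi-2.Q4_K_M.gguf` file that the run command uses, on the assumption that the other quantizations are not needed on the Pi. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: the "Installation steps" from docs/llama-cpp.md as one function.
# DRY_RUN, run, and setup_pi are hypothetical names, not part of the commit.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print each command, 0 = execute it

# Print the command in dry-run mode, otherwise run it in the current shell.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Each step must succeed before the next one starts.
setup_pi() {
  run sudo apt update &&
  run sudo apt install -y git git-lfs g++ build-essential &&
  run git lfs install &&
  run mkdir -p piRun &&
  run cd piRun &&
  run python3 -m venv env &&
  run . env/bin/activate &&
  run python3 -m pip install torch numpy sentencepiece &&
  run wget https://github.com/ggerganov/llama.cpp/archive/refs/heads/gg/phi-2.zip &&
  run unzip phi-2.zip &&
  run rm phi-2.zip &&
  run cd llama.cpp-gg-phi-2/ &&
  run make &&
  run mkdir -p phi-2-gguf/ &&
  run pip install -U huggingface_hub &&
  # Assumption: fetch only the quantization that the run command below uses.
  run huggingface-cli download TheBloke/phi-2-GGUF phi-2.Q4_K_M.gguf --local-dir phi-2-gguf/ &&
  run ./main -m phi-2-gguf/phi-2.Q4_K_M.gguf -p "Question: Write a python function to print the first n numbers in the fibonacci series"
}

setup_pi
```

With `DRY_RUN=0` set in the environment, the same steps are executed instead of printed, and the chain stops at the first step that fails.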