## MiniCPM-Llama3-V 2.5

### Usage

Download the [MiniCPM-Llama3-V-2_5](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5) PyTorch model from Hugging Face into a `MiniCPM-Llama3-V-2_5` folder.
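
One way to fetch the weights is via git (a sketch, assuming `git-lfs` is installed; any Hugging Face download method works):

```bash
# clones the model repo, pulling the weight files via git-lfs
git lfs install
git clone https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5
```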

Clone llama.cpp and check out the `minicpm-v2.5` branch:
```bash
git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp.git
cd llama.cpp
```

Convert the PyTorch model to GGUF files (you can also download the converted [gguf files](https://huggingface.co/openbmb/MiniCPM-Llama3-V-2_5-gguf) we provide):

```bash
python ./examples/minicpmv/minicpmv-surgery.py -m ../MiniCPM-Llama3-V-2_5
python ./convert.py ../MiniCPM-Llama3-V-2_5/model --outtype f16 --vocab-type bpe

# quantize int4 version
./quantize ../MiniCPM-Llama3-V-2_5/model/model-8B-F16.gguf ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf Q4_K_M
```
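
After these steps, both GGUF files should sit next to the original weights (paths as used in the commands above):

```bash
# expected outputs of the convert and quantize steps
ls -lh ../MiniCPM-Llama3-V-2_5/model/model-8B-F16.gguf \
       ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf
```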

Build on Linux or macOS:

```bash
make
make minicpmv-cli
```
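
Compilation can be parallelized with make's `-j` flag (optional; assumes GNU make and coreutils' `nproc`):

```bash
# same two steps, one job per CPU core
make -j"$(nproc)"
make -j"$(nproc)" minicpmv-cli
```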

Run inference on Linux or macOS:

```bash
# run f16 version
26
34
./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/model/model-8B-F16.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

# run quantized int4 version
./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"

# or run in interactive mode
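# (a sketch: assumes the standard llama.cpp `-i` flag switches to interactive chat)
./minicpmv-cli -m ../MiniCPM-Llama3-V-2_5/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-Llama3-V-2_5/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -i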
```

### Android

#### Build on an Android device using Termux
We found that building on the Android device itself gives better runtime performance, so we recommend building on-device.

[Termux](https://github.com/termux/termux-app#installation) is a terminal app for Android (no root required).

Install the build tools in Termux:
```bash
apt update && apt upgrade -y
apt install git make cmake
```
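
The build itself follows the same pattern as the Linux build above; a minimal sketch of the assumed on-device flow (the `~/bin` and `~/model` layout is inferred from the paths in the chat commands below):

```bash
# clone the same branch and build the CLI inside Termux
git clone -b minicpm-v2.5 https://github.com/OpenBMB/llama.cpp.git
cd llama.cpp
make minicpmv-cli

# assumed layout: binary in ~/bin, gguf files (transferred to the device) in ~/model
mkdir -p ~/bin ~/model
cp minicpmv-cli ~/bin/
```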

Now, you can start chatting:
```bash
cd /data/data/com.termux/files/home/bin
./minicpmv-cli -m ../model/ggml-model-Q4_K_M.gguf --mmproj ../model/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"
```
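
`xx.jpg` is a placeholder for an image already on the device. One way to get a photo into this folder (assuming you grant storage access via Termux's standard `termux-setup-storage` command; the source file name is hypothetical):

```bash
termux-setup-storage                       # one-time: exposes shared storage under ~/storage
cp ~/storage/downloads/photo.jpg ./xx.jpg  # hypothetical file name
```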