Releases: ngxson/llama.cpp
b2251
server : add KV cache quantization options (#5684)
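The KV cache quantization options are exposed as server command-line flags. A minimal illustrative invocation, assuming the `--cache-type-k`/`--cache-type-v` flags and `q4_0` type names (model path and context size are placeholders; check `./server --help` on your build):

```shell
# Quantize both the K and V halves of the KV cache to q4_0
# to reduce memory use at some quality cost.
# Flags and paths are illustrative, not a verified example.
./server -m models/my-model.gguf \
  --cache-type-k q4_0 \
  --cache-type-v q4_0 \
  -c 4096
```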
b2239
minor : fix trailing whitespace (#5638)
b2220
server : support llava 1.6 (#5553)
* server: init working 1.6
* move clip_image to header
* remove commented code
* remove c++ style from header
* remove todo
* expose llava_image_embed_make_with_clip_img
* fix zig build
b2203
examples : support minItems/maxItems in JSON grammar converter (#5039)
* support minLength and maxLength in JSON schema grammar converter
* Update examples/json-schema-to-grammar.py

Co-authored-by: Georgi Gerganov <[email protected]>
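Converters of this kind typically unroll `minItems`/`maxItems` bounds into explicit repetitions in the generated GBNF grammar. A self-contained sketch of that unrolling idea, assuming a single pre-built `item` rule (this is an illustration, not the actual logic of examples/json-schema-to-grammar.py):

```python
def repeat_rule(item_rule: str, min_items: int, max_items: int) -> str:
    """Unroll minItems/maxItems into an explicit GBNF-style repetition.

    Illustrative sketch only; the real converter in
    examples/json-schema-to-grammar.py handles many more cases.
    """
    if max_items == 0:
        return '""'  # empty array body
    # Required part: max(min_items, 1) comma-separated occurrences.
    n_required = max(min_items, 1)
    parts = [item_rule] + [f'"," {item_rule}'] * (n_required - 1)
    # Optional part: up to (max_items - n_required) extra occurrences,
    # each carrying its own leading comma so any prefix is well-formed.
    parts += [f'("," {item_rule})?'] * (max_items - n_required)
    rule = " ".join(parts)
    if min_items == 0:
        rule = f"({rule})?"  # the whole sequence may be absent
    return rule
```

For example, bounds of 2..4 expand to one mandatory item, one mandatory comma-prefixed item, and two optional comma-prefixed items.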
b2168
gitignore : update for CLion IDE (#5544)
b2104
llama : do not print "offloading layers" message in CPU-only builds (…
b2038
llama : support InternLM2 (#5184)
* support InternLM2 inference
* add add_space_prefix KV pair
b2034
metal : add im2col F32 dst support (#5132)
b2026
Revert "server : change deps.sh xxd files to string literals (#5221)"
This reverts commit 4003be0e5feef320f3707786f22722b73cff9356.
b2002
py : improve BPE tokenizer support (#5189)