Releases: ngxson/llama.cpp
b2251
server : add KV cache quantization options (#5684)
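The KV cache quantization options are exposed as server command-line flags. A minimal illustrative invocation, assuming the `--cache-type-k`/`--cache-type-v` flags and `q4_0` type names (model path and context size are placeholders; check `./server --help` on your build):

```shell
# Quantize both the K and V halves of the KV cache to q4_0
# to reduce memory use at some quality cost.
# Flags and paths are illustrative, not a verified example.
./server -m models/my-model.gguf \
  --cache-type-k q4_0 \
  --cache-type-v q4_0 \
  -c 4096
```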
b2239
minor : fix trailing whitespace (#5638)
b2220
server : support llava 1.6 (#5553)
* server: init working 1.6
* move clip_image to header
* remove commented code
* remove c++ style from header
* remove todo
* expose llava_image_embed_make_with_clip_img
* fix zig build
b2203
examples : support minItems/maxItems in JSON grammar converter (#5039)
* support minLength and maxLength in JSON schema grammar converter
* Update examples/json-schema-to-grammar.py

Co-authored-by: Georgi Gerganov <[email protected]>
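Converters of this kind typically unroll `minItems`/`maxItems` bounds into explicit repetitions in the generated GBNF grammar. A self-contained sketch of that unrolling idea, assuming a single pre-built `item` rule (this is an illustration, not the actual logic of examples/json-schema-to-grammar.py):

```python
def repeat_rule(item_rule: str, min_items: int, max_items: int) -> str:
    """Unroll minItems/maxItems into an explicit GBNF-style repetition.

    Illustrative sketch only; the real converter in
    examples/json-schema-to-grammar.py handles many more cases.
    """
    if max_items == 0:
        return '""'  # empty array body
    # Required part: max(min_items, 1) comma-separated occurrences.
    n_required = max(min_items, 1)
    parts = [item_rule] + [f'"," {item_rule}'] * (n_required - 1)
    # Optional part: up to (max_items - n_required) extra occurrences,
    # each carrying its own leading comma so any prefix is well-formed.
    parts += [f'("," {item_rule})?'] * (max_items - n_required)
    rule = " ".join(parts)
    if min_items == 0:
        rule = f"({rule})?"  # the whole sequence may be absent
    return rule
```

For example, bounds of 2..4 expand to one mandatory item, one mandatory comma-prefixed item, and two optional comma-prefixed items.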
b2168
gitignore : update for CLion IDE (#5544)
b2104
llama : do not print "offloading layers" message in CPU-only builds (…
b2038
llama : support InternLM2 (#5184)
* support InternLM2 inference
* add add_space_prefix KV pair
b2034
metal : add im2col F32 dst support (#5132)
b2026
Revert "server : change deps.sh xxd files to string literals (#5221)"
This reverts commit 4003be0e5feef320f3707786f22722b73cff9356.
b2002
py : improve BPE tokenizer support (#5189)