Releases · ngxson/llama.cpp
b4424
llama : update llama_model API names (#11063)
* llama : deprecate llama_free_model, add llama_model_free
* llama : change `llama_load_model_from_file` -> `llama_model_load_from_file`
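The rename touches the two entry points most loaders call. A minimal migration sketch, assuming a local `model.gguf` path (placeholder) and only the standard `llama.h` entry points:

```c
#include "llama.h"

int main(void) {
    llama_backend_init();

    struct llama_model_params mparams = llama_model_default_params();

    // new name; llama_load_model_from_file still compiles but is deprecated
    struct llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (model == NULL) {
        llama_backend_free();
        return 1;
    }

    // new name; replaces the deprecated llama_free_model
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```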
b4422
mmap : fix fileno macro clash (#11076)
* mmap : fix fileno macro clash
* cont
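For context on why a clash like this breaks builds: some C libraries define `fileno` as a macro in `<stdio.h>`, so any identifier with that name can be rewritten by the preprocessor before the compiler sees it. A sketch of the failure mode and the usual fix (the struct and function names here are hypothetical, not the repo's):

```c
#include <stdio.h>

// Where fileno is a macro, declaring your own identifier with that name,
// e.g.
//
//     int fileno(struct my_file * f);   // mangled by the macro on some libcs
//
// expands into invalid code. The usual fix is to rename the identifier:
struct my_file { FILE * fp; };

static int my_file_fd(struct my_file * f) {
    return fileno(f->fp);  // the real stdio fileno, macro or function
}
```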
b4419
CUDA: add BF16 support (#11093)
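BF16 is fp32 with the low 16 mantissa bits dropped, which is what makes a fast GPU path attractive. A host-side sketch of the conversions, illustrating the format rather than the PR's kernels:

```c
#include <stdint.h>
#include <string.h>

// fp32 -> bf16: keep the high 16 bits of the IEEE-754 float,
// rounding the dropped half to nearest, ties to even.
static uint16_t fp32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFF + ((bits >> 16) & 1);
    return (uint16_t)(bits >> 16);
}

// bf16 -> fp32: shift back into the high half; low mantissa bits are zero.
static float bf16_to_fp32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```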
b4418
Vulkan: Add device-specific blacklist for coopmat for the AMD proprie…
b4417
llama : add support for DeepSeek V3 (#11049)
* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and the FFN_EXP_PROBS_B tensor type
* vocab : add DeepSeek V3 pre-tokenizer regexes
* unicode : handle ACCENT_MARK and SYMBOL categories in regex
* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

Co-authored-by: Stanisław Szymczyk <[email protected]>
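The EXPERT_GATING_FUNC parameter matters because DeepSeek V3 routes tokens with sigmoid scores rather than the softmax used by earlier DeepSeek MoE models, and EXPERT_WEIGHTS_NORM renormalizes the weights of the selected experts. A simplified sketch of that routing step (function and variable names are mine, not the repo's):

```c
#include <math.h>

// Score each selected (top-k) expert with a sigmoid, then normalize
// the weights so they sum to 1 across the selected experts.
static void deepseek_v3_gate(const float * logits, float * weights, int n_selected) {
    float sum = 0.0f;
    for (int i = 0; i < n_selected; i++) {
        weights[i] = 1.0f / (1.0f + expf(-logits[i]));  // sigmoid gating
        sum += weights[i];
    }
    for (int i = 0; i < n_selected; i++) {
        weights[i] /= sum;  // expert-weights normalization
    }
}
```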
b4416
[GGML][RPC] Support for models with non-512-aligned tensors over RPC.…
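Supporting tensors whose sizes are not multiples of the transport's alignment typically comes down to rounding allocation sizes up to the boundary while tracking the real payload size separately. A generic round-up helper of the kind such a fix relies on (illustrative only, not the RPC backend's code):

```c
#include <stddef.h>

// Round n up to the next multiple of align (align must be a power of two),
// e.g. align_up(1000, 512) == 1024.
static size_t align_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}
```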
b4415
llama : add support for the cohere2 model architecture (#10900)
b4414
sync : ggml
b4411
fix: Vulkan shader gen binary path (#11037)
b4410
common : disable KV cache shifting automatically for unsupported mode…