Releases · ngxson/llama.cpp
b4424
llama : update llama_model API names (#11063)
* llama : deprecate llama_free_model, add llama_model_free
* llama : change `llama_load_model_from_file` -> `llama_model_load_from_file`
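The rename touches the two entry points most loaders call. A minimal migration sketch, assuming a local `model.gguf` path (placeholder) and only the standard `llama.h` entry points:

```c
#include "llama.h"

int main(void) {
    llama_backend_init();

    struct llama_model_params mparams = llama_model_default_params();

    // new name; llama_load_model_from_file still compiles but is deprecated
    struct llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (model == NULL) {
        llama_backend_free();
        return 1;
    }

    // new name; replaces the deprecated llama_free_model
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```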
b4422
mmap : fix fileno macro clash (#11076)
* mmap : fix fileno macro clash
* cont
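For context on why a clash like this breaks builds: some C libraries define `fileno` as a macro in `<stdio.h>`, so any identifier with that name can be rewritten by the preprocessor before the compiler sees it. A sketch of the failure mode and the usual fix (the struct and function names here are hypothetical, not the repo's):

```c
#include <stdio.h>

// Where fileno is a macro, declaring your own identifier with that name,
// e.g.
//
//     int fileno(struct my_file * f);   // mangled by the macro on some libcs
//
// expands into invalid code. The usual fix is to rename the identifier:
struct my_file { FILE * fp; };

static int my_file_fd(struct my_file * f) {
    return fileno(f->fp);  // the real stdio fileno, macro or function
}
```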
b4419
CUDA: add BF16 support (#11093)
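BF16 is fp32 with the low 16 mantissa bits dropped, which is what makes a fast GPU path attractive. A host-side sketch of the conversions, illustrating the format rather than the PR's kernels:

```c
#include <stdint.h>
#include <string.h>

// fp32 -> bf16: keep the high 16 bits of the IEEE-754 float,
// rounding the dropped half to nearest, ties to even.
static uint16_t fp32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    bits += 0x7FFF + ((bits >> 16) & 1);
    return (uint16_t)(bits >> 16);
}

// bf16 -> fp32: shift back into the high half; low mantissa bits are zero.
static float bf16_to_fp32(uint16_t h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```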
b4418
Vulkan: Add device-specific blacklist for coopmat for the AMD proprie…
b4417
llama : add support for DeepSeek V3 (#11049)
* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and the FFN_EXP_PROBS_B tensor type
* vocab : add DeepSeek V3 pre-tokenizer regexes
* unicode : handle ACCENT_MARK and SYMBOL categories in regex
* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

Co-authored-by: Stanisław Szymczyk <[email protected]>
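The EXPERT_GATING_FUNC parameter matters because DeepSeek V3 routes tokens with sigmoid scores rather than the softmax used by earlier DeepSeek MoE models, and EXPERT_WEIGHTS_NORM renormalizes the weights of the selected experts. A simplified sketch of that routing step (function and variable names are mine, not the repo's):

```c
#include <math.h>

// Score each selected (top-k) expert with a sigmoid, then normalize
// the weights so they sum to 1 across the selected experts.
static void deepseek_v3_gate(const float * logits, float * weights, int n_selected) {
    float sum = 0.0f;
    for (int i = 0; i < n_selected; i++) {
        weights[i] = 1.0f / (1.0f + expf(-logits[i]));  // sigmoid gating
        sum += weights[i];
    }
    for (int i = 0; i < n_selected; i++) {
        weights[i] /= sum;  // expert-weights normalization
    }
}
```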
b4416
[GGML][RPC] Support for models with non-512-aligned tensors over RPC.…
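Supporting tensors whose sizes are not multiples of the transport's alignment typically comes down to rounding allocation sizes up to the boundary while tracking the real payload size separately. A generic round-up helper of the kind such a fix relies on (illustrative only, not the RPC backend's code):

```c
#include <stddef.h>

// Round n up to the next multiple of align (align must be a power of two),
// e.g. align_up(1000, 512) == 1024.
static size_t align_up(size_t n, size_t align) {
    return (n + align - 1) & ~(align - 1);
}
```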
b4415
llama : add support for the cohere2 model architecture (#10900)
b4414
sync : ggml
b4411
fix: Vulkan shader gen binary path (#11037)
b4410
common : disable KV cache shifting automatically for unsupported mode…