Releases: ngxson/llama.cpp

b4573

28 Jan 15:02
d7d1ecc
docker: allow installing pip packages system-wide (#11437)

Signed-off-by: rare-magma <[email protected]>

b4570

28 Jan 10:37
6e84b0a
SYCL : SOFTMAX F16 mask support and other fixes (#11261)

Implemented F16 src1 (mask) support in ggml_sycl_op_soft_max(), for which a pragma deprecation warning was added in #5021.
To do this, it had to be decoupled from ggml_sycl_op_flatten, which always assumed src1 to be of type fp32 (many OP functions depend on it).

* SYCL: SOFTMAX F16 mask support and other fixes

* test-backend-ops: Add F16 mask test cases

b4569

28 Jan 09:16
2b8525d
Handle missing model in CLI parameters for llama-run (#11399)

The HTTP client in llama-run only prints an error in case the download of
a resource failed. If the model name in the CLI parameter list is missing,
this causes the application to crash.
In order to prevent this, a check for the required model parameter has been
added and errors for resource downloads get propagated to the caller.

Signed-off-by: Michael Engel <[email protected]>
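
The fix described above can be illustrated with a minimal sketch. It is not the code from #11399: `downloader_fn`, `find_model_arg` and `run_sketch` are hypothetical names, and the argument handling is deliberately simplified (any argument starting with `-` is treated as a flag without a value). The point is only that a missing model and a failed download both become a non-zero return code that the caller can act on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

// Hypothetical downloader signature: returns 0 on success, non-zero on failure.
using downloader_fn = std::function<int(const std::string & model)>;

// Returns the first non-flag argument after the program name, or "" if none exists.
// Simplification: flags are assumed to take no value.
static std::string find_model_arg(const std::vector<std::string> & args) {
    for (std::size_t i = 1; i < args.size(); ++i) {
        if (!args[i].empty() && args[i][0] != '-') {
            return args[i];
        }
    }
    return "";
}

// Returns an exit code instead of continuing with an empty model name.
static int run_sketch(const std::vector<std::string> & args, const downloader_fn & download) {
    const std::string model = find_model_arg(args);
    if (model.empty()) {
        std::fprintf(stderr, "error: no model specified\n");
        return 1;
    }
    // Propagate a download failure to the caller rather than only printing it.
    if (download(model) != 0) {
        std::fprintf(stderr, "error: failed to download '%s'\n", model.c_str());
        return 1;
    }
    return 0;
}
```

A caller would pass its argument list and a download function, then return the result from `main` as the process exit code.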

b4568

27 Jan 19:26
a4417dd
Add new hf protocol for ollama (#11449)

https://huggingface.co/docs/hub/en/ollama

Signed-off-by: Eric Curtin <[email protected]>

b4567

27 Jan 14:43
d6d24cd
AMD: parse the architecture as supplied by gcnArchName (#11244)

For AMD, the value provided by minor doesn't include the stepping; parse the value returned by gcnArchName instead to retrieve an accurate ID.
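
As an illustration of the idea, here is a minimal sketch of parsing a name such as `gfx90a:sramecc+:xnack-`. It is not the code from #11244: `gcn_arch` and `parse_gcn_arch_name` are hypothetical names, and the sketch assumes the LLVM naming convention in which the last two characters after `gfx` are the minor version and the stepping as hexadecimal digits, with the remaining leading characters being the decimal major version.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct gcn_arch {
    int arch_major;
    int arch_minor;
    int arch_stepping;
};

// Converts one lowercase hexadecimal digit to its value, or returns -1.
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Parses e.g. "gfx90a:sramecc+:xnack-" into {9, 0, 10}. Returns false on malformed input.
static bool parse_gcn_arch_name(const std::string & name, gcn_arch & out) {
    const std::string prefix = "gfx";
    if (name.compare(0, prefix.size(), prefix) != 0) {
        return false;
    }
    // Drop optional feature flags that follow the first ':'.
    const std::size_t colon = name.find(':');
    const std::size_t len = colon == std::string::npos ? std::string::npos : colon - prefix.size();
    const std::string id = name.substr(prefix.size(), len);
    if (id.size() < 3) {
        return false;
    }
    const int minor_v    = hex_digit(id[id.size() - 2]);
    const int stepping_v = hex_digit(id[id.size() - 1]);
    if (minor_v < 0 || stepping_v < 0) {
        return false;
    }
    int major_v = 0;
    for (std::size_t i = 0; i + 2 < id.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(id[i]))) {
            return false;
        }
        major_v = major_v * 10 + (id[i] - '0');
    }
    out = { major_v, minor_v, stepping_v };
    return true;
}
```

Unlike a major/minor pair alone, the parsed name distinguishes, for example, `gfx906` from `gfx90a`, which differ only in the stepping.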

b4566

27 Jan 14:29
a5203b4
llama : minor fixes to speed up llama model loading (#11448)

* impl::load: change the bpe_ranks map to an unordered map, reducing the time of impl::load by 30%

* llama_model_loader::init_mappings: replace new llama_mmap with std::make_unique<llama_mmap> for cleaner code and to halve the running time of init_mappings

* Update src/llama-vocab.cpp

---------

Co-authored-by: lexasub <[email protected]>
Co-authored-by: Diego Devesa <[email protected]>
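
The first change above, moving the BPE rank table from an ordered map to an unordered one, can be sketched as follows. This is an illustration rather than the code from #11448: it assumes the table is keyed by a pair of token strings, and `pair_hash`, `bpe_rank_map` and the hash-combining formula are choices made for this sketch. The standard library provides no hash for `std::pair`, so an unordered map with such a key needs a custom hasher.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical hasher for a pair of token strings; the combining formula is an assumption.
struct pair_hash {
    std::size_t operator()(const std::pair<std::string, std::string> & p) const {
        const std::size_t h1 = std::hash<std::string>{}(p.first);
        const std::size_t h2 = std::hash<std::string>{}(p.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// Hash table keyed by (left token, right token): average O(1) insert and lookup,
// compared with O(log n) string-pair comparisons for std::map.
using bpe_rank_map = std::unordered_map<std::pair<std::string, std::string>, int, pair_hash>;

// Returns the merge rank of the pair, or -1 if the pair is not in the table.
static int find_bpe_rank(const bpe_rank_map & ranks, const std::string & left, const std::string & right) {
    const auto it = ranks.find(std::make_pair(left, right));
    return it == ranks.end() ? -1 : it->second;
}
```

The trade-off is that iteration order is no longer sorted, which only matters if some code relies on walking the table in key order.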

b4565

27 Jan 11:45
df984e0
llama: refactor llama_decode_impl (#11381)

b4564

27 Jan 08:28
acd38ef
metal: Handle null returned from MTLCreateSystemDefaultDevice() (#11441)

This fixes a segmentation fault when running tests with no Metal
devices available (for example, when not linked with the Core Graphics
framework).

b4560

26 Jan 16:52
19f6518
cmake: add ggml find package (#11369)

* Add initial ggml cmake package

* Add build numbers to ggml find-package

* Expand variables with GGML_ prefix

* Guard against adding to cache variable twice

* Add git to msys2 workflow

* Handle ggml-cpu-* variants

* Link ggml/ggml-base libraries to their targets

* Replace main-cmake-pkg with simple-cmake-pkg

* Interface features require c_std_90

* Fix typo

* Removed unnecessary bracket from status message

* Update examples/simple-cmake-pkg/README.md

Co-authored-by: Georgi Gerganov <[email protected]>

* Update examples/simple-cmake-pkg/README.md

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>

b4559

26 Jan 16:04
1d8ee06
rpc: fix register position (#11424)

Signed-off-by: thxCode <[email protected]>