Releases · ngxson/llama.cpp

28 Jan 15:02

d7d1ecc

b4573

docker: allow installing pip packages system-wide (#11437)

Signed-off-by: rare-magma <[email protected]>

28 Jan 10:37

github-actions

b4570

6e84b0a

SYCL : SOFTMAX F16 mask support and other fixes (#11261)

Implemented F16 src1 (mask) support in ggml_sycl_op_soft_max(), for which a pragma deprecation warning was added in #5021.
This required decoupling it from ggml_sycl_op_flatten, which always assumed src1 to be of FP32 type (many op functions depend on it).

* SYCL: SOFTMAX F16 mask support and other fixes

* test-backend-ops: Add F16 mask test cases

28 Jan 09:16

github-actions

b4569

2b8525d

Handle missing model in CLI parameters for llama-run (#11399)

The HTTP client in llama-run only prints an error in case the download of
a resource failed. If the model name in the CLI parameter list is missing,
this causes the application to crash.
In order to prevent this, a check for the required model parameter has been
added and errors for resource downloads get propagated to the caller.

Signed-off-by: Michael Engel <[email protected]>
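
The fix described above can be pictured with a minimal sketch. This is not llama-run's actual code: `RunOptions`, `parse_args`, `download`, `resolve_model`, and `run` are hypothetical names used only to illustrate the two ideas in the commit message, namely validating the required model parameter up front and returning download errors to the caller instead of only printing them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical option struct for illustration; not llama-run's actual type.
struct RunOptions {
    std::string model;   // required positional argument
    std::string prompt;  // optional; remaining arguments joined by spaces
};

// Parses positional arguments. Returns 0 on success, 1 if the model is missing.
int parse_args(const std::vector<std::string> & args, RunOptions & opts) {
    for (std::size_t i = 1; i < args.size(); ++i) {  // args[0] is the program name
        if (opts.model.empty()) {
            opts.model = args[i];
        } else {
            if (!opts.prompt.empty()) {
                opts.prompt += ' ';
            }
            opts.prompt += args[i];
        }
    }
    if (opts.model.empty()) {
        std::fprintf(stderr, "error: missing required model parameter\n");
        return 1;
    }
    return 0;
}

// Stand-in for an HTTP download (assumption: no real transfer happens here).
// It reports failure through its return value rather than only printing.
int download(const std::string & url) {
    if (url.empty()) {
        std::fprintf(stderr, "error: download failed: empty URL\n");
        return 1;
    }
    return 0;  // a real client would perform the transfer here
}

// Propagates a download failure to the caller.
int resolve_model(const RunOptions & opts) {
    assert(!opts.model.empty());  // guaranteed by a successful parse_args
    if (download(opts.model) != 0) {
        return 1;
    }
    return 0;
}

// Entry point of the sketch: stops early when the model parameter is missing.
int run(const std::vector<std::string> & args) {
    RunOptions opts;
    if (parse_args(args, opts) != 0) {
        return 1;
    }
    return resolve_model(opts);
}
```

With this shape, calling `run` with only the program name returns an error code instead of reaching the download step with an empty model name.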

27 Jan 19:26

github-actions

b4568

a4417dd

Add new hf protocol for ollama (#11449)

https://huggingface.co/docs/hub/en/ollama

Signed-off-by: Eric Curtin <[email protected]>

27 Jan 14:43

github-actions

b4567

d6d24cd

AMD: parse the architecture as supplied by gcnArchName (#11244)

For AMD, the value provided by minor doesn't include stepping; parse the value returned by gcnArchName instead to retrieve an accurate ID.
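
As a rough illustration of parsing such a name, the sketch below assumes the string has the shape `gfx` + decimal major version + one hex digit for minor + one hex digit for stepping, optionally followed by `:`-separated feature flags (for example `gfx906:sramecc+:xnack-`). The names `GfxArch`, `parse_gcn_arch_name`, and `pack_arch_id`, as well as the packed ID layout, are hypothetical and are not taken from the actual change.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type: the three components of an AMD target name.
struct GfxArch {
    int major;
    int minor;
    int stepping;
};

// Converts one lowercase hex digit to its value, or -1 if it is not one.
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Parses names such as "gfx906:sramecc+:xnack-", "gfx90a" or "gfx1030".
// Assumed layout: "gfx" + decimal major + hex minor + hex stepping.
std::optional<GfxArch> parse_gcn_arch_name(const std::string & name) {
    // Drop any ":feature" suffixes; find() returns npos when there are none.
    const std::string target = name.substr(0, name.find(':'));
    if (target.rfind("gfx", 0) != 0) {
        return std::nullopt;  // does not start with "gfx"
    }
    const std::string digits = target.substr(3);
    if (digits.size() < 3) {
        return std::nullopt;  // need at least major, minor and stepping
    }
    const int stepping = hex_digit(digits[digits.size() - 1]);
    const int minor    = hex_digit(digits[digits.size() - 2]);
    if (stepping < 0 || minor < 0) {
        return std::nullopt;
    }
    int major = 0;
    for (std::size_t i = 0; i + 2 < digits.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(digits[i]))) {
            return std::nullopt;
        }
        major = major * 10 + (digits[i] - '0');
    }
    return GfxArch{major, minor, stepping};
}

// Packs the components into one integer (illustrative layout only).
int pack_arch_id(const GfxArch & a) {
    assert(a.minor >= 0 && a.minor < 16 && a.stepping >= 0 && a.stepping < 16);
    return a.major * 0x100 + a.minor * 0x10 + a.stepping;
}
```

Under these assumptions, `gfx906` and `gfx90a` yield different steppings (6 and 10), which is the distinction a plain minor number cannot carry.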

27 Jan 14:29

github-actions

b4566

a5203b4

llama : minor fixes to speed up llama model loading (#11448)

* impl::load: change the bpe_ranks map to an unordered map, reducing the time of impl::load by 30%

* llama_model_loader::init_mapping: replace new llama_mmap with std::make_unique<llama_mmap> for cleaner code and to halve (/2) the running time of init_mappings

* Update src/llama-vocab.cpp

---------

Co-authored-by: lexasub <[email protected]>
Co-authored-by: Diego Devesa <[email protected]>
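
The two changes listed above follow common C++ patterns, sketched here under stated assumptions. `PairHash`, `BpeRanks`, `build_ranks`, `find_rank`, `FileMapping`, and `make_mappings` are hypothetical stand-ins, not the names or types used in llama.cpp. The sketch only shows why an unordered map keyed by a string pair needs a custom hash, and how `std::make_unique` replaces a raw `new`; it makes no claim about the speedups reported in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// std::pair has no std::hash specialization, so an unordered_map keyed by a
// pair of strings needs a hash functor. This combiner is one common choice.
struct PairHash {
    std::size_t operator()(const std::pair<std::string, std::string> & p) const {
        const std::size_t h1 = std::hash<std::string>{}(p.first);
        const std::size_t h2 = std::hash<std::string>{}(p.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// Hash-based lookup (average O(1)) instead of an ordered tree (O(log n)).
using BpeRanks = std::unordered_map<std::pair<std::string, std::string>, int, PairHash>;

// Assigns each merge pair its position in the list as its rank.
BpeRanks build_ranks(const std::vector<std::pair<std::string, std::string>> & merges) {
    BpeRanks ranks;
    ranks.reserve(merges.size());
    for (std::size_t i = 0; i < merges.size(); ++i) {
        // emplace keeps the first rank if the same pair appears again
        ranks.emplace(merges[i], static_cast<int>(i));
    }
    return ranks;
}

// Returns the rank of (left, right), or -1 if the pair is unknown.
int find_rank(const BpeRanks & ranks, const std::string & left, const std::string & right) {
    const auto it = ranks.find({left, right});
    return it == ranks.end() ? -1 : it->second;
}

// Hypothetical stand-in for a memory-mapping object.
struct FileMapping {
    explicit FileMapping(std::size_t size) : size(size) {}
    std::size_t size;
};

// Builds owning pointers with std::make_unique rather than a raw `new`.
std::vector<std::unique_ptr<FileMapping>> make_mappings(const std::vector<std::size_t> & sizes) {
    std::vector<std::unique_ptr<FileMapping>> mappings;
    mappings.reserve(sizes.size());
    for (const std::size_t size : sizes) {
        mappings.emplace_back(std::make_unique<FileMapping>(size));
    }
    assert(mappings.size() == sizes.size());
    return mappings;
}
```

The trade-off of the unordered container is that iteration order is no longer sorted, which is acceptable when the map is only used for point lookups.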

27 Jan 11:45

github-actions

b4565

df984e0

llama: refactor llama_decode_impl (#11381)

27 Jan 08:28

github-actions

b4564

acd38ef

metal: Handle null returned from MTLCreateSystemDefaultDevice() (#11441)

This fixes a segmentation fault when running tests while no Metal
devices are available (for example, when not linked with the Core Graphics
framework or otherwise).

26 Jan 16:52

github-actions

b4560

19f6518

cmake: add ggml find package (#11369)

* Add initial ggml cmake package

* Add build numbers to ggml find-package

* Expand variables with GGML_ prefix

* Guard against adding to cache variable twice

* Add git to msys2 workflow

* Handle ggml-cpu-* variants

* Link ggml/ggml-base libraries to their targets

* Replace main-cmake-pkg with simple-cmake-pkg

* Interface features require c_std_90

* Fix typo

* Removed unnecessary bracket from status message

* Update examples/simple-cmake-pkg/README.md

Co-authored-by: Georgi Gerganov <[email protected]>

* Update examples/simple-cmake-pkg/README.md

Co-authored-by: Georgi Gerganov <[email protected]>

---------

Co-authored-by: Georgi Gerganov <[email protected]>

26 Jan 16:04

github-actions

b4559

1d8ee06

rpc: fix register position (#11424)

Signed-off-by: thxCode <[email protected]>
