
b2651 #107

Merged: 14 commits merged into Nexesenex:sidestream on Apr 11, 2024
Conversation

Nexesenex (Owner)

No description provided.

ggerganov and others added 14 commits April 9, 2024 20:29

Key changes:
* BERT conversion: fix abuse of LlamaHfVocab, do not set BOS or EOS
* Nomic Embed conversion: pad vocab instead of slicing embedding tensor
* llama_tokenize: handle added special tokens like HF does (see the sketch below)
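
A minimal sketch of what the llama_tokenize change means in practice, assuming the C API of this era (add_special / parse_special flags); the helper below is illustrative and not code from this PR:

```cpp
// Illustrative helper only (not code from this PR): tokenize `text` so that added /
// user-defined special tokens in the raw text are parsed the way HF tokenizers parse
// them. Assumes the llama.h C API of this period:
//   int32_t llama_tokenize(const llama_model *, const char * text, int32_t text_len,
//                          llama_token * tokens, int32_t n_tokens_max,
//                          bool add_special, bool parse_special);
#include <string>
#include <vector>
#include "llama.h"

static std::vector<llama_token> tokenize_like_hf(const llama_model * model, const std::string & text) {
    // rough upper bound: one token per byte, plus room for BOS/EOS
    std::vector<llama_token> tokens(text.size() + 2);
    int32_t n = llama_tokenize(model, text.c_str(), (int32_t) text.size(),
                               tokens.data(), (int32_t) tokens.size(),
                               /*add_special  =*/ true,   // let the model's metadata decide on BOS/EOS
                               /*parse_special=*/ true);  // recognize added special tokens inside `text`
    if (n < 0) {
        tokens.resize((size_t) -n);            // negative return = required buffer size
        n = llama_tokenize(model, text.c_str(), (int32_t) text.size(),
                           tokens.data(), (int32_t) tokens.size(), true, true);
    }
    tokens.resize(n > 0 ? (size_t) n : 0);
    return tokens;
}
```

Passing parse_special = true is what makes added special tokens in the input behave as they do under Hugging Face tokenizers, rather than being tokenized as plain text.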
* docs: how to add a model

* docs: model: typo and docs

* docs: model: add prevision on RoPE

* docs: model: rephrasing README.md

* docs: model: rephrasing README.md

* docs: model: README.md fix trailing spaces

* docs : some fixes

* Update README.md

---------

Co-authored-by: Georgi Gerganov <[email protected]>

* minor layout improvements

* added missing file, run deps.sh locally

This commit adds an option to the gguf example to not check the tensor
data.

The motivation for this is that it can be useful to use the gguf tool to
read .gguf files that were not created by the gguf tool itself.

Signed-off-by: Daniel Bevenius <[email protected]>
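
For context, a minimal sketch of the idea behind skipping the tensor-data check, built on the ggml gguf C API rather than the examples/gguf program itself (header location and exact integer widths are assumptions):

```cpp
// Illustration only: parse a .gguf file's metadata with no_alloc set, so the tensor
// data is never read or validated. (The gguf_* declarations lived in ggml.h at the time.)
#include <cstdio>
#include "ggml.h"

int main(int argc, char ** argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <file.gguf>\n", argv[0]);
        return 1;
    }
    struct gguf_init_params params = {
        /*.no_alloc =*/ true,      // metadata only - do not allocate / load tensor data
        /*.ctx      =*/ nullptr,   // no ggml context needed when the tensors are skipped
    };
    struct gguf_context * gctx = gguf_init_from_file(argv[1], params);
    if (gctx == nullptr) {
        fprintf(stderr, "failed to read %s\n", argv[1]);
        return 1;
    }
    const int n_kv      = gguf_get_n_kv(gctx);
    const int n_tensors = gguf_get_n_tensors(gctx);
    printf("%s: %d key/value pairs, %d tensors\n", argv[1], n_kv, n_tensors);
    for (int i = 0; i < n_kv; ++i) {
        printf("  kv[%d]: %s\n", i, gguf_get_key(gctx, i));
    }
    for (int i = 0; i < n_tensors; ++i) {
        printf("  tensor[%d]: %s\n", i, gguf_get_tensor_name(gctx, i));
    }
    gguf_free(gctx);
    return 0;
}
```
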
* gguf-debug: example of how to use the ggml callback for debugging

* gguf-debug: no mutex, verify type, fix stride.

* llama: cb eval: move the cb_eval field into common gpt_params

* ggml_debug: use common gpt_params to pass cb_eval.
Fix random SIGSEGV when getting the tensor.

* ggml_debug: ci: add tests

* ggml_debug: EOL in CMakeLists.txt

* ggml_debug: Remove unused param n_batch, no batching here

* ggml_debug: fix trailing spaces

* ggml_debug: fix trailing spaces

* common: fix cb_eval and user data not initialized

* ci: build revert label

* ggml_debug: add main test label

* doc: add a model: add a link to ggml-debug

* ggml-debug: add to make toolchain

* ggml-debug: tests add the main label

* ggml-debug: ci add test curl label

* common: allow the warmup to be disabled in llama_init_from_gpt_params

* ci: add curl test

* ggml-debug: better tensor type support

* gitignore : ggml-debug

* ggml-debug: printing also the sum of each tensor

* ggml-debug: remove block size

* eval-callback: renamed from ggml-debug

* eval-callback: fix make toolchain

---------

Co-authored-by: slaren <[email protected]>
Co-authored-by: Georgi Gerganov <[email protected]>
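
The mechanism these commits describe, sketched under the assumption that gpt_params gained cb_eval / cb_eval_user_data fields of type ggml_backend_sched_eval_callback (the ggml-backend.h callback that is asked about, and then shown, every node of the graph):

```cpp
// Rough sketch of the mechanism behind eval-callback / ggml-debug: a scheduler eval
// callback is first asked whether it wants to observe a node, then called again once
// that node has been computed. The callback body here is illustrative only.
#include <cstdio>
#include "ggml.h"
#include "ggml-backend.h"

static bool debug_cb(struct ggml_tensor * t, bool ask, void * user_data) {
    (void) user_data;
    if (ask) {
        return true;   // yes, observe every node
    }
    // second call: the tensor has been computed - print its name, type and shape
    printf("%-32s %-8s [%lld, %lld, %lld, %lld]\n",
           t->name, ggml_type_name(t->type),
           (long long) t->ne[0], (long long) t->ne[1],
           (long long) t->ne[2], (long long) t->ne[3]);
    return true;       // true = keep computing the rest of the graph
}

// Assumed wiring through common (field names taken from the commit messages above,
// except `warmup`, which is assumed):
//   gpt_params params;
//   params.cb_eval           = debug_cb;
//   params.cb_eval_user_data = nullptr;
//   params.warmup            = false;  // the "disable warmup" option added here, so the
//                                      // callback does not fire on the warmup run
//   ... then llama_init_from_gpt_params(params) as usual.
```
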
* scripts : add --outdir option to hf.sh

This commit adds an option to the hf.sh script that allows the user to
specify an output directory for the downloaded file.

The motivation for this change is that examples that use the hf.sh
script to download models from Hugging Face can now specify the output
directory, for example the `models` directory, to keep downloads in one
place and avoid cluttering the repository root.

Signed-off-by: Daniel Bevenius <[email protected]>

* squash! scripts : add --outdir option to hf.sh

Fix format of the --outdir option in the usage message.

Signed-off-by: Daniel Bevenius <[email protected]>

---------

Signed-off-by: Daniel Bevenius <[email protected]>
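
In practice an example can then download straight into the models directory with something like `./scripts/hf.sh --url https://huggingface.co/<repo>/resolve/main/<file>.gguf --outdir models`; only `--outdir` comes from this commit, the rest of the invocation is an assumed form.
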
When the download-artifact action was updated to v4, the default download path changed.
This fixes binaries not being uploaded to releases.

…/ reuses) (#6609)

* grammars: reserve rejects & next candidates

* grammars: reuse new_stacks

* grammars: fix missing sig change in llama.h

* grammars: fix test (api changed)

* grammars: update gbnf-validator.cpp

* grammars: simpler syntax (no swap)
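
The gist of the reserve/reuse changes above, shown on a generic sketch rather than the actual grammar sampling code:

```cpp
// Generic illustration (not the actual llama.cpp grammar code) of the two optimizations
// named above: reserve vector capacity up front, and clear-and-reuse a scratch container
// across iterations instead of reallocating it every time.
#include <vector>

struct candidate { int token; float score; };

void sample_many(const std::vector<candidate> & candidates, int n_iters) {
    std::vector<candidate>        rejects;      // "reserve rejects & next candidates"
    std::vector<std::vector<int>> new_stacks;   // "reuse new_stacks"

    rejects.reserve(candidates.size());         // pay for the allocation once

    for (int it = 0; it < n_iters; ++it) {
        rejects.clear();                        // keeps capacity, drops contents
        new_stacks.clear();                     // reused instead of a fresh vector per call

        for (const candidate & c : candidates) {
            if (c.score < 0.0f) {
                rejects.push_back(c);           // no reallocation thanks to reserve()
            }
        }
        // ... advance the grammar stacks into new_stacks and consume them here ...
    }
}
```
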
Nexesenex merged commit a8dd6b3 into Nexesenex:sidestream on Apr 11, 2024.
15 of 26 checks passed.
Nexesenex pushed a commit that referenced this pull request on Dec 22, 2024:

* iq1_bn: faster Metal dot product

82 t/s -> 87.9 t/s

* iq1_bn(Metal): 87.9 -> 89.0 t/s for TG-128

* iq1_bn(Metal): 89.0 -> 94.7 t/s for TG-128

So, total improvement is ~15%. Not bad.

* iq1_bn(Metal): 686 -> 702 t/s for PP-512

* iq2_bn(Metal): 710 -> 714 t/s for PP-512

---------

Co-authored-by: Iwan Kawrakow <[email protected]>