Releases · dataelement/llama.cpp

18 Jun 05:03

a94e6ff

b3173 Latest

update: support Qwen2-57B-A14B (#7835)

* update: convert-hf-to-gguf.py to support Qwen2-57B-A14B

* fix: QWEN2MOE support for expert_feed_forward_length

previously, expert ff was taken from n_ff (intermediate size) but it is now properly taken from LLM_KV_EXPERT_FEED_FORWARD_LENGTH

n_ff_exp and n_ff_shexp are now properly calculated

* update: convert-hf-to-gguf.py cleanup for Qwen2MoeForCausalLM
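
The lookup change described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the metadata key strings, the `qwen2moe` prefix, and the helper name `resolve_expert_ff` are assumptions, and the fallback to n_ff simply mirrors the "previous" behaviour mentioned in the commit message (the real loader may fall back differently).

```python
from typing import Mapping, Tuple

# Assumed architecture prefix for the metadata keys (hypothetical here).
ARCH = "qwen2moe"


def resolve_expert_ff(metadata: Mapping[str, int]) -> Tuple[int, int]:
    """Return (n_ff_exp, n_ff_shexp) from a GGUF-style metadata mapping.

    Dedicated expert keys are preferred; if one is missing, the value
    falls back to the dense intermediate size n_ff (assumed fallback).
    """
    n_ff = metadata[f"{ARCH}.feed_forward_length"]
    n_ff_exp = metadata.get(f"{ARCH}.expert_feed_forward_length", n_ff)
    n_ff_shexp = metadata.get(f"{ARCH}.expert_shared_feed_forward_length", n_ff)
    return n_ff_exp, n_ff_shexp


# Made-up sizes, chosen only to show that the three values can differ.
example = {
    "qwen2moe.feed_forward_length": 100,
    "qwen2moe.expert_feed_forward_length": 10,
    "qwen2moe.expert_shared_feed_forward_length": 40,
}
n_ff_exp, n_ff_shexp = resolve_expert_ff(example)
```

With the dedicated keys present, the expert sizes no longer have to equal the dense intermediate size, which is the situation the fix addresses.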

Assets 20

cudart-llama-bin-win-cu11.7.1-x64.zip        293 MB    2024-06-18T05:03:05Z
cudart-llama-bin-win-cu12.2.0-x64.zip        413 MB    2024-06-18T05:03:14Z
llama-b3173-bin-macos-arm64.zip              43.4 MB   2024-06-18T05:03:27Z
llama-b3173-bin-macos-x64.zip                45.8 MB   2024-06-18T05:03:29Z
llama-b3173-bin-ubuntu-x64.zip               48.2 MB   2024-06-18T05:03:31Z
llama-b3173-bin-win-avx-x64.zip              7.59 MB   2024-06-18T05:03:32Z
llama-b3173-bin-win-avx2-x64.zip             7.57 MB   2024-06-18T05:03:33Z
llama-b3173-bin-win-avx512-x64.zip           7.57 MB   2024-06-18T05:03:34Z
llama-b3173-bin-win-cuda-cu11.7.1-x64.zip    136 MB    2024-06-18T05:03:36Z
llama-b3173-bin-win-cuda-cu12.2.0-x64.zip    132 MB    2024-06-18T05:03:41Z
Source code (zip)                                      2024-06-17T19:08:46Z
Source code (tar.gz)                                   2024-06-17T19:08:46Z
