b3569 #288

Merged
merged 12 commits into from
Aug 11, 2024


Nexesenex
Owner

No description provided.

ggerganov and others added 12 commits August 9, 2024 18:23
* llama : better replace_all (cont)

ggml-ci

* code : deduplicate replace_all

ggml-ci
* gguf-py : add T5ENCODER model architecture

* common : call llama_decode() during warmup only if the model has decoder

* convert-hf : add T5EncoderModel

* llama : add llama_model_has_decoder() API function

* llama : split build_t5() into build_t5_encoder() and build_t5_decoder()

* llama : add support for LLM_ARCH_T5ENCODER

* llama-embedding : add support for LLAMA_POOLING_TYPE_NONE

* llama-embedding : add support for encoder-only models

---------

Co-authored-by: Stanisław Szymczyk <[email protected]>
* default n_swa for phi-3

* fix

* double check swa
…ronization overhead. (#8943)

* Optimize Vulkan backend for better CPU performance and less GPU synchronization overhead.

- Allocation overhead for the temporary std::vectors was easily detectable with a sampling profiler and simple to remove.
- ggml_vk_sync_buffer introduces a full pipeline sync, which has a significant cost on the GPU side, sometimes larger than the actual kernel execution. Adding only barriers for shader reads/writes and transfers seems to be sufficient, judging from the code, which either launches compute kernels or copies tensors.
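
The commit message does not say how the temporary `std::vector` allocations were removed. One common way to avoid that kind of per-call overhead is to reuse a caller-owned scratch buffer, as in the minimal sketch below. `Region`, `make_regions_alloc` and `make_regions_reuse` are hypothetical names invented for illustration; they are not taken from the Vulkan backend.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a small per-dispatch record (illustration only).
struct Region {
    std::size_t offset;
    std::size_t size;
};

// Allocating variant: every call constructs a new vector, so every call
// pays for a heap allocation and the matching deallocation.
std::vector<Region> make_regions_alloc(std::size_t n, std::size_t stride) {
    assert(stride != 0);
    std::vector<Region> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back({i * stride, stride});
    }
    return out;
}

// Reusing variant: the caller keeps `scratch` alive across calls.
// clear() destroys the elements but leaves the capacity unchanged, so once
// the buffer has grown large enough, later calls do not allocate.
void make_regions_reuse(std::vector<Region>& scratch, std::size_t n, std::size_t stride) {
    assert(stride != 0);
    scratch.clear();
    for (std::size_t i = 0; i < n; ++i) {
        scratch.push_back({i * stride, stride});
    }
}
```

In a hot path that runs once per graph node, the second form trades a little statefulness for an allocation-free steady state. Whether this matches what the commit actually did would need to be checked against its diff.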

* Fix small typo

---------

Co-authored-by: 0cc4m <[email protected]>
Co-authored-by: Neo Zhang <>
@github-actions github-actions bot added documentation Improvements or additions to documentation examples python ggml SYCL Vulkan labels Aug 11, 2024
@Nexesenex Nexesenex merged commit 6131e6a into Nexesenex:spacestream Aug 11, 2024
46 of 59 checks passed
@Nexesenex Nexesenex changed the title b3568 b3569 Aug 11, 2024
8 participants