b3276

@github-actions released this 01 Jul 20:17
cb5fad4
CUDA: refactor and optimize IQ MMVQ (#8215)

* CUDA: refactor and optimize IQ MMVQ

* uint -> uint32_t

* __dp4a -> ggml_cuda_dp4a

* remove MIN_CC_DP4A checks

* change default

* try CI fix
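
For context on the `__dp4a -> ggml_cuda_dp4a` item: CUDA's `__dp4a` intrinsic treats each 32-bit operand as four packed 8-bit integers, multiplies them pairwise, and adds the four products to an accumulator. The sketch below is a plain C++ host-side reference for that arithmetic only, assuming the signed 8-bit variant. The names `dp4a_reference` and `pack_int8x4` are hypothetical and are not taken from the commit; how `ggml_cuda_dp4a` is actually implemented is not shown in these release notes.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the commit): pack four signed bytes into one
// 32-bit int, the operand layout a dp4a-style instruction works on.
static int pack_int8x4(int8_t x0, int8_t x1, int8_t x2, int8_t x3) {
    const int8_t bytes[4] = {x0, x1, x2, x3};
    int packed;
    std::memcpy(&packed, bytes, sizeof(packed));
    return packed;
}

// Hypothetical host-side reference (not from the commit): four-way signed
// int8 dot product of a and b, added to the accumulator c.
static int dp4a_reference(int a, int b, int c) {
    int8_t a8[4];
    int8_t b8[4];
    std::memcpy(a8, &a, sizeof(a8));
    std::memcpy(b8, &b, sizeof(b8));
    return c + a8[0] * b8[0] + a8[1] * b8[1] + a8[2] * b8[2] + a8[3] * b8[3];
}
```

Because both operands are packed and unpacked the same way, the result does not depend on host byte order. A wrapper function of this shape is one common way to give kernels a single call site while keeping a scalar fallback for hardware that lacks the instruction; whether the commit does exactly that is an assumption here.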