Pull requests: ggml-org/llama.cpp
Pull requests list
CUDA: fix typo in FlashAttention code
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#13926
opened May 30, 2025 by
JohannesGaessler
llama : deprecate explicit kv_self defrag/update calls
examples
#13921
opened May 30, 2025 by
ggerganov
kv-cache : split implementation in separate sources
#13920
opened May 30, 2025 by
ggerganov
mtmd : drop _shared from libmtmd name, merge helpers into libmtmd (⚠️ breaking change)
examples
server
#13917
opened May 30, 2025 by
ngxson
[CANN]Support Acl Graph
ggml
changes relating to the ggml tensor library for machine learning
#13915
opened May 30, 2025 by
noemotiovon
Draft
[Ascend NPU] Enable labeler
devops
improvements to build systems and github actions
#13914
opened May 30, 2025 by
shink
remove WIP since PR has been merged
documentation
Improvements or additions to documentation
#13912
opened May 30, 2025 by
pepijndevos
convert: add eagle2 draft arch
python
python script changes
#13908
opened May 30, 2025 by
pockers21
ci(intel): venv for python & pip installation for intel docker
devops
improvements to build systems and github actions
#13898
opened May 29, 2025 by
Thammachart
CUDA: add a prop in ggml_cuda_device_info to distinguish iGPU or dGPU in cuda (#13856)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#13895
opened May 29, 2025 by
Yangxiaoz
ggml-cpu : split arch-specific implementations
ggml
changes relating to the ggml tensor library for machine learning
musa: extract ggml_cuda_mul_mat_batched_cublas_gemm_batched_ex
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#13887
opened May 29, 2025 by
yeahdongcn
3 tasks done
sycl: Add reorder to Q6_K mmvq implementation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#13885
opened May 29, 2025 by
s-Nick
finetune.cpp command-line arg
build
Compilation issues
examples
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#13873
opened May 28, 2025 by
graehl
docs : add "Quick start" section for new users
documentation
Improvements or additions to documentation
#13862
opened May 28, 2025 by
ngxson
tests : add test-tokenizers-remote
testing
Everything test related
#13846
opened May 28, 2025 by
CISC
musa: enable fp16 mma (all) and cublas on qy2
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#13842
opened May 28, 2025 by
yeahdongcn
3 tasks done
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat
ggml
changes relating to the ggml tensor library for machine learning
#13840
opened May 28, 2025 by
rmatif
kv-cache : avoid modifying recurrent cells when setting inputs
#13834
opened May 27, 2025 by
compilade
llama : use n_swa + n_ubatch cells for SWA cache
examples
server
#13833
opened May 27, 2025 by
ggerganov
3 tasks done
convert: add support for Japanese Bert model
python
python script changes
#13830
opened May 27, 2025 by
huydt84