Pull requests: ggml-org/llama.cpp

Pull requests list

CUDA: fix typo in FlashAttention code ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs
#13926 opened May 30, 2025 by JohannesGaessler
kv-cache : split implementation in separate sources
#13920 opened May 30, 2025 by ggerganov
[CANN]Support Acl Graph ggml changes relating to the ggml tensor library for machine learning
#13915 opened May 30, 2025 by noemotiovon Draft
[Ascend NPU] Enable labeler devops improvements to build systems and github actions
#13914 opened May 30, 2025 by shink
remove WIP since PR has been merged documentation Improvements or additions to documentation
#13912 opened May 30, 2025 by pepijndevos
convert: add eagle2 draft arch python python script changes
#13908 opened May 30, 2025 by pockers21
Hybrid recurrent cache
#13904 opened May 29, 2025 by gabe-l-hart Draft
1 task
ci(intel): venv for python & pip installation for intel docker devops improvements to build systems and github actions
#13898 opened May 29, 2025 by Thammachart
CUDA: add a prop in ggml_cuda_device_infor for distinguish iGPU or dGPU in cuda (#13856) ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs
#13895 opened May 29, 2025 by Yangxiaoz
Need to undefine "hz" on AIX examples
#13894 opened May 29, 2025 by mehendarkarprajwal
ggml-cpu : split arch-specific implementations ggml changes relating to the ggml tensor library for machine learning
#13892 opened May 29, 2025 by xctan Draft
[WIP] model: add new model minimax-text-01 python python script changes
#13889 opened May 29, 2025 by qscqesze Draft
musa: extract ggml_cuda_mul_mat_batched_cublas_gemm_batched_ex ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs
#13887 opened May 29, 2025 by yeahdongcn
3 tasks done
sycl: Add reorder to Q6_K mmvq implementation ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language
#13885 opened May 29, 2025 by s-Nick
finetune.cpp command-line arg build Compilation issues examples ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs testing Everything test related
#13873 opened May 28, 2025 by graehl
docs : add "Quick start" section for new users documentation Improvements or additions to documentation
#13862 opened May 28, 2025 by ngxson
tests : add test-tokenizers-remote testing Everything test related
#13846 opened May 28, 2025 by CISC
llama : auto-batch examples server
#13845 opened May 28, 2025 by ggerganov
1 task
musa: enable fp16 mma (all) and cublas on qy2 ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs
#13842 opened May 28, 2025 by yeahdongcn
3 tasks done
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat ggml changes relating to the ggml tensor library for machine learning
#13840 opened May 28, 2025 by rmatif
llama : use n_swa + n_ubatch cells for SWA cache examples server
#13833 opened May 27, 2025 by ggerganov
3 tasks done
convert: add support for Japanese Bert model python python script changes
#13830 opened May 27, 2025 by huydt84