Skip to content

Conversation

reeselevine
Copy link
Collaborator

This PR adds support for SOFT_MAX, and moves RMS_NORM to use a similar split-row approach for efficiency.

  • Also added new ggml_soft_max_ext_inplace function so that test-backend-ops can run inplace tests.

@github-actions github-actions bot added testing Everything test related ggml changes relating to the ggml tensor library for machine learning labels Sep 30, 2025
Co-authored-by: Georgi Gerganov <[email protected]>
@reeselevine reeselevine merged commit ef07a40 into ggml-org:master Oct 2, 2025
64 of 68 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Oct 3, 2025
* origin/master: (124 commits)
metal : fix loop bound in ggml_mem_ranges (ggml-org#16412)
llama : fix shapes for bert/mpt q/k norm (ggml-org#16409)
ggml : fix graph reallocation with multiple chunks (ggml-org#16396)
Fix missing messages on sibling navigation (ggml-org#16408)
vulkan: Replace uses of maxMemoryAllocationSize and VK_WHOLE_SIZE (ggml-org#16354)
vulkan: Fix FA coopmat1 invalid array indexing (ggml-org#16365)
ci : change macos-13 to macos-15-intel (ggml-org#16401)
Capture model name only after first token (streaming) or completed request (ggml-org#16405)
vulkan: in flash attention, bounds check against nem1 (don't rely on GGML_KQ_MASK_PAD) (ggml-org#16316)
webui : Fix messages payload sent to chat completions (ggml-org#16402)
fix: track viewportHeight via window.innerHeight to avoid unwanted scrolling (ggml-org#16356)
test-barrier : do not use more threads than physically available (ggml-org#16389)
ggml webgpu: add support for soft_max, optimize rms_norm (ggml-org#16357)
model : Apertus model implementation (ggml-org#15852)
musa: update compile flags (ggml-org#16265)
ci : fix ubuntu-latest-cmake-rpc (disable ccache) (ggml-org#16388)
ci: update vulkan ci (ggml-org#16294)
ci : fix clean-up of old logs (ggml-org#16381)
SYCL: Update to oneAPI 2025.2 (ggml-org#16371)
HIP: add IMbackK to codeowner (ggml-org#16375)
...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning testing Everything test related
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants