safranowith
Contributor

This PR adds CUDA backend support for the following unary operators in ggml:

* GGML_OP_FLOOR
* GGML_OP_CEIL
* GGML_OP_ROUND
* GGML_OP_TRUNC

Changes:

* Implemented the CUDA kernel logic in ggml-cuda/unary.cu.
* Registered the new operators in the backend dispatch logic.
* Extended the test suite (test-backend-ops) to validate CUDA results against the CPU backend.
* Verified correctness across different tensor shapes and data types (f16, f32).
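
For readers unfamiliar with these operators, here is a minimal host-side sketch of the element-wise semantics, written in plain C++ so it can be run without a GPU. It is not the code from this PR: `RoundingOp`, `apply_rounding`, and `unary_rounding` are hypothetical names, and it assumes ROUND follows C `roundf` behavior (ties away from zero). A GPU kernel would typically apply the same per-element function with one thread per element instead of a loop.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not taken from ggml.
enum class RoundingOp { Floor, Ceil, Round, Trunc };

// Apply one rounding operator to a single value.
inline float apply_rounding(RoundingOp op, float x) {
    switch (op) {
        case RoundingOp::Floor: return std::floor(x);  // toward negative infinity
        case RoundingOp::Ceil:  return std::ceil(x);   // toward positive infinity
        case RoundingOp::Round: return std::round(x);  // nearest, ties away from zero (assumed)
        case RoundingOp::Trunc: return std::trunc(x);  // toward zero
    }
    return x;  // unreachable for valid enum values
}

// Element-wise map over a contiguous buffer. A GPU kernel would usually
// assign one element to each thread rather than iterating.
inline std::vector<float> unary_rounding(RoundingOp op, const std::vector<float> & src) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = apply_rounding(op, src[i]);
    }
    return dst;
}
```

Negative inputs are where the four operators diverge: for -1.5, floor and round give -2 while ceil and trunc give -1, so any backend comparison test should probably include negative and half-integer values.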

Notes:
The implementation has been tested with test-backend-ops -b cuda and passes all relevant tests.

Clean up unrelated changes from previous commit
…l-org#16613)

* SYCL: Add support for FLOOR,CEIL,ROUND and TRUNC unary operators

Clean up unrelated changes from previous commit

* Chore: remove empty lines and fix indentation

* Clean up: remove leftover blank lines and fix spacing

* chore: fix trailing whitespace and ensure final newline

* Cleanup: remove redundant declarations already defined in header

* Sync docs/ops.md with updated backend operation support

* docs: update ops.md after rebase

* docs: update ops.md - Vulkan supports SSM_CONV and SSM_SCAN
@safranowith safranowith marked this pull request as draft October 20, 2025 14:23
@github-actions github-actions bot added the documentation, Nvidia GPU, ggml, and SYCL labels Oct 20, 2025