@tamarPal
Contributor

@tamarPal tamarPal commented Oct 19, 2025

Summary

Implements the ROLL operator for the SYCL backend, enabling circular shifts along any tensor axis on SYCL devices (Intel GPUs). The changes are focused and follow existing SYCL backend patterns.

Changes

  • Added ROLL kernel in roll.cpp with multi-axis support
  • Added ROLL dispatch in ggml-sycl.cpp
  • Integrated ROLL into the SYCL compute pipeline

Implementation

  • 4D tensor support: per-axis shifts with normalization
  • Parallel processing using range<3> following existing SYCL patterns
  • Direct GPU memory access for optimal performance
  • Zero-shift optimization: fast memcpy path
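
The points above can be illustrated with a small host-side reference. This is only a sketch in plain C++, not the SYCL kernel from roll.cpp: the names normalize_shift and roll_4d_ref are hypothetical, the tensor is assumed to be contiguous F32 with ne[0] as the fastest-varying dimension, and the shift direction is assumed to follow the numpy.roll convention (dst[i] = src[(i - shift) mod n]).

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <vector>

// Map any shift (negative, or larger than the axis length) into [0, n).
static int64_t normalize_shift(int64_t shift, int64_t n) {
    if (n <= 0) return 0;
    return ((shift % n) + n) % n;
}

// Hypothetical CPU reference for a 4D roll on a contiguous F32 tensor.
// Assumption: dst[i] = src[(i - shift) mod ne] on every axis, and
// src and dst do not alias.
static void roll_4d_ref(const float * src, float * dst,
                        const std::array<int64_t, 4> & ne,
                        const std::array<int64_t, 4> & shift) {
    std::array<int64_t, 4> s{};
    bool identity = true;
    for (int d = 0; d < 4; ++d) {
        s[d] = normalize_shift(shift[d], ne[d]);
        identity = identity && (s[d] == 0);
    }

    const int64_t total = ne[0] * ne[1] * ne[2] * ne[3];

    // Zero-shift fast path: every normalized shift is 0, so a plain copy suffices.
    if (identity) {
        std::memcpy(dst, src, static_cast<size_t>(total) * sizeof(float));
        return;
    }

    for (int64_t i3 = 0; i3 < ne[3]; ++i3) {
        const int64_t j3 = (i3 + ne[3] - s[3]) % ne[3];
        for (int64_t i2 = 0; i2 < ne[2]; ++i2) {
            const int64_t j2 = (i2 + ne[2] - s[2]) % ne[2];
            for (int64_t i1 = 0; i1 < ne[1]; ++i1) {
                const int64_t j1 = (i1 + ne[1] - s[1]) % ne[1];
                for (int64_t i0 = 0; i0 < ne[0]; ++i0) {
                    const int64_t j0 = (i0 + ne[0] - s[0]) % ne[0];
                    const int64_t di = ((i3 * ne[2] + i2) * ne[1] + i1) * ne[0] + i0;
                    const int64_t si = ((j3 * ne[2] + j2) * ne[1] + j1) * ne[0] + j0;
                    dst[di] = src[si];
                }
            }
        }
    }
}
```

A GPU version would typically compute the same source index per work-item instead of looping. Because each destination element reads from a different source location, this sketch requires separate source and destination buffers.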

Testing

  • ROLL operations verified against CPU reference
  • Multi-axis scenarios tested with various shift combinations
  • In-place safety validated

Performance

  • GPU memory-optimized kernel design
  • Efficient modular arithmetic for shift normalization
  • Supports tensors up to 4D with configurable axis shifts
  • Identity operations (all shifts = 0) take the memcpy fast path instead of launching the roll kernel

Compatibility

  • Supports F32 tensors on the OpenCL and Level Zero backends
  • Follows existing SYCL backend conventions

tamarPal added 2 commits October 19, 2025 15:13
- Implement ggml_sycl_roll function for F32 tensors
- Add multi-axis roll operation with SYCL kernel
- Support all 4 tensor dimensions with proper shift normalization
- Add roll.cpp and roll.hpp to SYCL backend
- Update backend dispatch and supports_op for GGML_OP_ROLL
- Tests: 17662/17662 pass with identical CPU reference results
@github-actions github-actions bot added the ggml and SYCL labels Oct 19, 2025
@tamarPal
Contributor Author

@ggerganov @NeoZhangJianyu,
this PR adds SYCL backend support for GGML_OP_ROLL.
All tests pass locally and the implementation currently supports F32 tensors.

tamarPal added 2 commits October 21, 2025 14:49
- Fix EditorConfig violations in ggml/src/ggml-sycl/roll.cpp
- Remove trailing spaces from lines 6, 11, 28, 47, 58, 60
@tamarPal
Contributor Author

Hi @CISC @NeoZhangJianyu
I’ve addressed the requested changes and fixed the formatting issues.
Previously, two tests had failed, but they were unrelated to my code.
The PR is now ready, and I’d really appreciate it if you could approve the workflow run and review it for merge.
Thanks a lot!

Collaborator

@NeoZhangJianyu NeoZhangJianyu left a comment

Good job!

Thank you!

@tamarPal
Contributor Author

tamarPal commented Oct 23, 2025

Hi @NeoZhangJianyu @CISC!
I’ve fixed the .editorconfig issue in roll.hpp.
All checks should pass now — please let me know if anything else is needed.
Thanks again for the review and approval!

@CISC
Collaborator

CISC commented Oct 25, 2025

The last commit changed the whole implementation and warrants a re-review.

@NeoZhangJianyu NeoZhangJianyu merged commit 2b9bd9b into ggml-org:master Oct 27, 2025
72 checks passed
pwilkin pushed a commit to pwilkin/llama.cpp that referenced this pull request Oct 27, 2025
* sycl: add ROLL operation support

- Implement ggml_sycl_roll function for F32 tensors
- Add multi-axis roll operation with SYCL kernel
- Support all 4 tensor dimensions with proper shift normalization
- Add roll.cpp and roll.hpp to SYCL backend
- Update backend dispatch and supports_op for GGML_OP_ROLL
- Tests: 17662/17662 pass with identical CPU reference results

* fix: remove trailing whitespace from roll.cpp

- Fix EditorConfig violations in ggml/src/ggml-sycl/roll.cpp
- Remove trailing spaces from lines 6, 11, 28, 47, 58, 60

* ci: retrigger

* sycl: remove wait() calls from ROLL operation

* fix: editorconfig — LF endings + final newline for roll.hpp

---------

Co-authored-by: tamarPal <[email protected]>
theo77186 pushed a commit to theo77186/llama.cpp that referenced this pull request Oct 28, 2025