Line support for saturating sub #385

Open
torsteingrindvik opened this issue Dec 19, 2024 · 4 comments

Comments

@torsteingrindvik
Contributor

I have a kernel operating on lines where I'd like to be able to do a saturating sub.

Support for this would be great.

In my particular case I would:

  • Do a saturating sub on u16 operands
  • Cast line to f16

Without saturating sub support, I think I will need to:

  • Cast line to f16
  • Sub
  • Clamp

Which I think is likely to be a bit slower.
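For reference, the two orderings described above can be sketched in plain Rust (not CubeCL kernel code). This is a minimal sketch under assumptions: the function names are hypothetical, scalar values stand in for lines, and `f32` stands in for `f16` because `f16` is not available in stable Rust's standard library. With a real `f16`, the two orderings may give different results for large inputs, since `f16` cannot represent every `u16` value exactly.

```rust
/// Ordering A: saturating subtraction in the integer domain, then convert.
/// (Hypothetical helper; `f32` is a stand-in for `f16`.)
fn sat_sub_then_cast(a: u16, b: u16) -> f32 {
    a.saturating_sub(b) as f32
}

/// Ordering B: convert first, subtract in floating point, then clamp at zero.
/// (Hypothetical helper; `f32` is a stand-in for `f16`.)
fn cast_sub_clamp(a: u16, b: u16) -> f32 {
    (a as f32 - b as f32).max(0.0)
}
```

With `f32` both helpers agree on every `u16` pair, because `f32` represents all `u16` values exactly. Ordering B needs one extra operation per element (the clamp).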

@torsteingrindvik
Contributor Author

I realize this isn't supported for non-line primitives either at the moment.

@wingertge
Contributor

There's unfortunately no real support in the backends either, at least for SPIR-V and WGSL. SPIR-V only supports saturating operations for CMMA, not normal arithmetic.

@torsteingrindvik
Contributor Author

> There's unfortunately no real support in the backends either, at least for SPIR-V and WGSL. SPIR-V only supports saturating operations for CMMA, not normal arithmetic.

Any thoughts around having frontend APIs for things such as this anyway?
Since Rust's std has saturating sub, it would be nice to be able to write it in kernels, even though it might map to more than a single line of, e.g., CUDA source code.

@wingertge
Contributor

> Any thoughts around having frontend APIs for things such as this anyway? Since Rust std has saturating sub it's nice to be able to write it in kernels even though it might map to more than a single line of e.g. CUDA source code.

I don't know if that's the right approach, because there's more than one way to get saturating arithmetic. You can cast to float. Depending on the type, it might be better to do a wide multiplication and then clamp with integer math (but this wouldn't be possible on WebGPU, because it only supports 32-bit integers). For unsigned types you can save yourself the lower-bound clamp, and so on. So I don't know if the frontend should decide for you what the optimal way to handle it is.
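A few of these variants can be sketched in plain Rust to show how they differ. This is an illustrative sketch, not CubeCL code and not what any backend emits. All function names are hypothetical, and the last variant (subtracting the minimum) is an extra option added here for the case where no wider integer type exists.

```rust
/// Signed case: widen to i32, compute, then clamp both bounds.
/// (Hypothetical helper.)
fn sat_sub_i16_widen(a: i16, b: i16) -> i16 {
    let wide = a as i32 - b as i32;
    wide.clamp(i16::MIN as i32, i16::MAX as i32) as i16
}

/// Unsigned addition: widen to u32, then clamp only the upper bound.
/// The sum can never drop below zero, so no lower-bound clamp is needed.
/// (Hypothetical helper.)
fn sat_add_u16_widen(a: u16, b: u16) -> u16 {
    (a as u32 + b as u32).min(u16::MAX as u32) as u16
}

/// Unsigned subtraction without widening: subtract at most `a`.
/// Usable when 32 bits is the widest integer available.
/// (Hypothetical helper.)
fn sat_sub_u32_no_widen(a: u32, b: u32) -> u32 {
    a - a.min(b)
}
```

Each variant uses a different mix of widening, comparisons, and clamps, so which one is cheapest likely depends on the operand type and the target.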
