-This update adds an Apple Metal backend to VkFFT (VKFFT_BACKEND 5)
-Metal backend performance is similar to that of the other backends (tested on M1 Pro 8c SoC)
-Metal backend passes all VkFFT tests that the OpenCL backend passes (tested on M1 Pro 8c SoC)
-Current limitations of the Metal backend: no double precision, no saving/loading of binaries, forced maximum of 256 threads, C++ bindings only, incomplete error handling
-Bugfixes: Rader uint LUT offset not working in some cases, Mult Rader coalescing with <1024 threads, DCT-III reordering index issues with OpenCL on Intel/Apple GPUs
-Slightly improved coalescing logic for Nvidia GPUs
-Added precision plots