Hi Martin,

I was looking at the FFT-based implementation of RL-deconvolution in `deconv_rl.py` and noticed a few things:

1. The FFT plan is pre-calculated but not actually passed to the FFT functions, resulting in some overhead.
2. For deconvolutions that are performed repeatedly with the same PSF on data of the same size, a lot of code is run again on every call.
3. There is a lot of code duplication between the two functions that take NumPy arrays and the ones that take OpenCL arrays.
4. I do not understand why `hflip = h[::-1, ::-1]` is needed. I'm also not sure whether it is correct: I assume that for the 3D case this would have to be `hflip = h[::-1, ::-1, ::-1]`. Maybe you can explain.
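To make the last point concrete, here is a minimal NumPy-only sketch of what I would expect; it is not gputools code, and `flip_all_axes` is a hypothetical helper name of mine. As far as I understand, the RL update needs the adjoint of the blur, i.e. a correlation with the PSF. For circular FFT convolution with a real PSF, that corresponds to the complex conjugate of the OTF, which in real space is the PSF reversed along every axis (plus a one-sample roll so that index 0 stays at index 0). On a 3D array, `h[::-1, ::-1]` reverses only the first two axes.

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.random((4, 5, 6))  # deliberately asymmetric 3D stand-in for a PSF


def flip_all_axes(a):
    """Reverse every axis, then roll by one so that index 0 stays at index 0."""
    flipped = a[(slice(None, None, -1),) * a.ndim]
    return np.roll(flipped, shift=(1,) * a.ndim, axis=tuple(range(a.ndim)))


otf = np.fft.fftn(h)

# Reversing all three axes reproduces the conjugate OTF (the adjoint).
otf_flipped_all = np.fft.fftn(flip_all_axes(h))

# h[::-1, ::-1] on a 3D array leaves the last axis untouched.
partial = np.roll(h[::-1, ::-1], shift=(1, 1), axis=(0, 1))
otf_flipped_two = np.fft.fftn(partial)

matches_all = np.allclose(otf_flipped_all, np.conj(otf))
matches_two = np.allclose(otf_flipped_two, np.conj(otf))
print(matches_all, matches_two)
```

If this reasoning holds, the flip could also be skipped entirely by conjugating the already computed OTF, but I may be missing a reason why the real-space flip is used.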
To address the first two points I have rewritten your code to test this. The rewritten code is here: https://github.com/VolkerH/Lattice_Lightsheet_Deskew_Deconv/blob/benchmarking/lls_dd/deconv_gputools_rewrite.py
I wasn't sure whether, and if so how, you would like to integrate this approach of setting up the deconvolution first into gputools; otherwise I would have edited it there and created a pull request.
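To illustrate what I mean by setting up the deconvolution first, here is a minimal CPU-only sketch in plain NumPy. It is neither the linked rewrite nor gputools code: the class `RLDeconvolver`, its methods, and the synthetic test volume are all hypothetical, and it assumes circular boundary conditions and a PSF no larger than the data. The point is only that everything depending on the PSF and the data shape is computed once and then reused for every volume.

```python
import numpy as np


class RLDeconvolver:
    """Hypothetical helper: do the PSF-dependent setup once, reuse it per volume."""

    def __init__(self, psf, shape):
        self.shape = tuple(shape)
        self.axes = tuple(range(len(self.shape)))
        # Zero-pad the normalised PSF to the data shape ...
        padded = np.zeros(self.shape)
        padded[tuple(slice(0, s) for s in psf.shape)] = psf / psf.sum()
        # ... and move its centre to the origin for circular convolution.
        shift = tuple(-(s // 2) for s in psf.shape)
        padded = np.roll(padded, shift, axis=self.axes)
        self.otf = np.fft.rfftn(padded, axes=self.axes)
        self.otf_conj = np.conj(self.otf)  # adjoint of the blur

    def convolve(self, x, otf):
        spectrum = np.fft.rfftn(x, axes=self.axes) * otf
        return np.fft.irfftn(spectrum, s=self.shape, axes=self.axes)

    def __call__(self, data, n_iter=10, eps=1e-6):
        estimate = np.full(self.shape, data.mean())
        for _ in range(n_iter):
            blurred = self.convolve(estimate, self.otf)
            ratio = data / np.maximum(blurred, eps)
            estimate = estimate * self.convolve(ratio, self.otf_conj)
        return estimate


# Synthetic example: a blurred point source on a constant background.
ax = np.arange(7) - 3
zz, yy, xx = np.meshgrid(ax, ax, ax, indexing="ij")
psf = np.exp(-(zz**2 + yy**2 + xx**2) / 2.0)

deconvolve = RLDeconvolver(psf, shape=(16, 32, 32))  # setup happens once
point = np.zeros((16, 32, 32))
point[8, 16, 16] = 100.0
data = deconvolve.convolve(point, deconvolve.otf) + 1.0

restored = deconvolve(data, n_iter=10)  # repeated calls reuse the OTFs
```

In a GPU version the same object would presumably also hold the FFT plan and the preallocated device buffers, which is where I would expect most of the savings to come from.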
I have done some benchmarks comparing the rewritten code to the current implementation in gputools and to flowdec: VolkerH/Lattice_Lightsheet_Deskew_Deconv#21. Note that the iteration times are not purely deconvolution but also include IO and affine transforms. This adds plenty of overhead. Without this overhead the speed improvements are even more significant.