AVX ? #24

Open
gjaegy opened this issue May 13, 2019 · 9 comments

@gjaegy

gjaegy commented May 13, 2019

Hi,

Considering the great improvement from the SIMD variant, what about an AVX version that would process the lines 8 at a time? Do you see that as doable?

Actually, I've just compared KissFFT/SIMD against the AVX version of MuFFT on a 1024x1024 2D grid; both take more or less the same amount of time at the moment. Hence my feeling that an AVX version of KissFFT would outperform the MuFFT AVX implementation (or even AVX-512)...
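
For readers following along, here is a minimal sketch of what "processing the lines 8 at a time" could look like. It is not code from KissFFT or MuFFT: the names `LANES`, `wide_cpx`, `pack_rows`, and `wide_butterfly` are hypothetical, and it uses plain C arrays instead of AVX intrinsics so that it compiles anywhere. The idea, as I understand the existing SIMD build, is that each "sample" holds one value from several independent rows, so a single butterfly advances all of those transforms in lockstep. With single-precision floats, 8 lanes would match one 256-bit AVX register.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8  /* assumption: 8 floats, the width of one 256-bit register */

/* One "wide" complex sample: element k belongs to row k of the batch. */
typedef struct {
    float r[LANES];
    float i[LANES];
} wide_cpx;

/* Gather LANES consecutive rows of length n (stored row-major in re/im)
 * into n wide samples, so that out[j] holds column j of every row. */
void pack_rows(const float *re, const float *im, size_t n, wide_cpx *out)
{
    assert(re != NULL && im != NULL && out != NULL);
    for (size_t j = 0; j < n; ++j) {
        for (int k = 0; k < LANES; ++k) {
            out[j].r[k] = re[(size_t)k * n + j];
            out[j].i[k] = im[(size_t)k * n + j];
        }
    }
}

/* Radix-2 butterfly applied to every lane at once:
 *   a' = a + w*b,   b' = a - w*b
 * The twiddle factor w = (wr, wi) is shared by all lanes, because all
 * rows have the same length and are at the same stage of the transform. */
void wide_butterfly(wide_cpx *a, wide_cpx *b, float wr, float wi)
{
    for (int k = 0; k < LANES; ++k) {
        float tr = b->r[k] * wr - b->i[k] * wi;  /* real part of w*b */
        float ti = b->r[k] * wi + b->i[k] * wr;  /* imag part of w*b */
        b->r[k] = a->r[k] - tr;
        b->i[k] = a->i[k] - ti;
        a->r[k] += tr;
        a->i[k] += ti;
    }
}
```

The inner loops over `k` have no cross-lane dependencies, which is what makes the layout SIMD-friendly: a hand-written AVX version would presumably replace each loop body with a few 256-bit multiply/add instructions, and a compiler may auto-vectorize the loops as written. Whether that beats the 4-wide SSE path in practice would need measuring.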

@mborgerding
Owner

Good idea!

@sunzhuoshi

@mborgerding How about supporting SIMD via ISPC (https://github.com/ispc/ispc)? ISPC is an open-source compiler supported by Intel that can generate SSE, AVX, or NEON code from a single copy of the source.

@gjaegy
Author

gjaegy commented Jul 19, 2019

I would second that option. ISPC is great, and we use it internally for some SIMD-friendly parts of our code. It allows you to write the code once, and it will scale to the host CPU automatically (no need to rewrite the code for each new instruction set to be supported).
Ideally you would keep both the existing SSE variant and an ISPC variant (to give users the option to ignore ISPC completely).

@mborgerding
Owner

That's the first I've heard of ISPC. At first glance it looks promising!

@JulienMaille
Contributor

I've been looking at the existing forks of kissfft on GitHub and found this commit implementing AVX:
dornerworks@63cc168

@sunzhuoshi

> That's the first I've heard of ISPC. The first glance looks promising!

If you'd like to support ISPC, I can help. I'm doing some experiments with kissfft right now.
By the way, I'm an engineer at Intel, though not on the ISPC team.

@sunzhuoshi

@mborgerding I've added initial ISPC support (not optimized, I just ported some key functions to ISPC) and got some interesting benchmark data on my 2017 MacBook Pro:

======timing test (type=double)
KISS nfft=1800, numffts=10000
COMMAND MAJFLT MINFLT RSS PAGEIN VSZ
./bm_kiss_double - - 900 0 4268964
cputime=0.210
fftw nfft=1800 numffts=10000
COMMAND MAJFLT MINFLT RSS PAGEIN VSZ
./bm_fftw_double - - 1544 0 4268924
cputime=0.220

@mborgerding
Owner

Interesting indeed! Can you push your fork?

@sunzhuoshi

> Interesting indeed! Can you push your fork?

Sure, let me clean up my code first.

Sorry about the data above. I double-checked the benchmark for the non-ISPC version, and the result should not be attributed to ISPC. Let me see whether any ISPC optimization actually helps.
