Right now, due to my inexperience with templates, I have implemented the numpy-like interface functions using a number of (inline) functions that call a template instance, e.g.:
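A minimal sketch of that pattern, with hypothetical signatures (the wrapper overloads, the `_fft_` template parameters and the container types here are illustrative, not the library's actual code):

```cpp
#include <complex>
#include <xtensor/xarray.hpp>

// Stub for illustration; the real _fft_ template lives in the library.
template <typename in_t, typename out_t>
xt::xarray<out_t> _fft_(const xt::xarray<in_t>& input);

// One inline wrapper per precision, each hard-coding the output type and
// forwarding to a specific instance of the _fft_ template.
inline xt::xarray<std::complex<float>> fft(const xt::xarray<float>& input) {
  return _fft_<float, std::complex<float>>(input);
}

inline xt::xarray<std::complex<double>> fft(const xt::xarray<double>& input) {
  return _fft_<double, std::complex<double>>(input);
}

// ...and similarly for ifft, rfft, irfft, hfft, ihfft and their
// multidimensional variants.
```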
I'm not sure, but it seems likely that the compiler will simply compile all these functions when a user includes this header. This makes the resulting binary a lot larger than it needs to be and increases compile time.
I think I now know how to rewrite the `_fft_` and `_ifft_` templates in such a way that I can write `using` template aliases for the three families (`fft`, `rfft` and `hfft`) that depend only on input precision. This will involve the following modifications to `_fft_` and `_ifft_`:
- Putting the FFTW plan creation functions (`constexpr static`) in templates with parameters for precision, dimensionality, "c2c vs r2c" and direction.
- Using the FFTW `execute` and `destroy_plan` functions that I put in the precision-dependent `fftw_t` template as well (these are also `constexpr static`).
- Putting the output type in trailing `->` notation, using `decltype` or `declval` to determine the type based on template parameters for input precision, "c2c vs r2c" and direction.
That last point was the crucial knowledge gap that led me to write the functions the way I did: I couldn't get automatic type deduction working for the output type.
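A sketch of how the deduced return type could look, assuming an illustrative `fft_output` trait and `dir` enum (neither is the library's actual API):

```cpp
#include <complex>
#include <xtensor/xarray.hpp>

// Illustrative names (dir, fft_output); not the library's actual API.
enum class dir { forward, backward };

// Map precision, "c2c vs r2c" and direction to an output value type:
// c2c transforms stay complex, the backward r2c transform yields real.
template <typename prec_t, bool r2c, dir d>
struct fft_output { using type = std::complex<prec_t>; };

template <typename prec_t>
struct fft_output<prec_t, true, dir::backward> { using type = prec_t; };

// A single generic transform; the trailing return type is computed from
// the template parameters instead of being spelled out per overload.
// (The input type would vary in the same way; kept generic here.)
template <typename prec_t, bool r2c, dir d, typename input_t>
auto _fft_(const input_t& input)
    -> xt::xarray<typename fft_output<prec_t, r2c, d>::type>;

// The numpy-like entry points then become thin forwards that depend only
// on input precision:
template <typename prec_t>
auto fft(const xt::xarray<prec_t>& input) {
  return _fft_<prec_t, false, dir::forward>(input);
}
```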
Suggestion by @LourensVeen: also rewrite them so that they can take in xexpressions or any other xtensor container that supports the right operators. Look at examples in the other xtensor libs.
This would be preferable, as the library currently requires `xarray` etc. I believe the philosophy of xtensor is that the algorithms should be agnostic to the storage type used. In my case I unfortunately need to expose these to Python, so there the `pytensor` class is best, as it avoids the need to copy.
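A rough sketch of the `xexpression`-based signature used elsewhere in the xtensor ecosystem (the body and the evaluation step are illustrative, not a definitive implementation):

```cpp
#include <xtensor/xarray.hpp>
#include <xtensor/xeval.hpp>
#include <xtensor/xexpression.hpp>

// Sketch only: accept any xtensor expression (xarray, xtensor, pytensor,
// views, lazy expressions, ...) rather than one concrete container type.
template <class E>
auto fft(const xt::xexpression<E>& e) {
  // Evaluate the (possibly lazy) expression into a concrete container
  // before handing its buffer to FFTW, which needs contiguous memory.
  auto input = xt::eval(e.derived_cast());
  // ... create the plan, execute it, fill and return the output ...
  return input;  // placeholder; the real function returns the transform
}
```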
> I'm not sure, but it seems likely that the compiler will simply compile all these functions when a user includes this header.
FYI, a template itself is only compiled when it is instantiated. The inline functions will be parsed and checked, but no code should be emitted for them unless they are actually used; the template is not fully processed by the compiler until it is used with concrete arguments. If the binary is very large, it is possible that the implementation is also getting inlined. I haven't looked into the details yet.
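A self-contained illustration of that compilation model (names are hypothetical):

```cpp
// A function template produces no object code by itself; twice<double>
// only comes into existence once something uses it.
template <typename T>
T twice(T x) { return x + x; }

// An inline function is parsed and checked in every translation unit that
// includes it, but the compiler need not emit code for it unless it is
// actually called (and calls are often inlined away entirely).
inline double twice_d(double x) { return twice(x); }

int main() {
  return static_cast<int>(twice_d(2.0));  // the one use that triggers emission
}
```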