Worse performance than pyerfa on Windows #57

Open
dcherrie-dstl opened this issue Jun 3, 2021 · 8 comments

Comments

@dcherrie-dstl

On Windows, I have found ~2-3x worse performance compared to pyerfa. Using the CMakeLists from liberfa/erfa#75 and a Clang compiler, I was able to build a replacement DLL and substitute it, resulting in comparable (slightly better) performance.
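For anyone wanting to reproduce a comparison like this, the pyerfa side could be timed with a small harness along these lines. This is only a sketch using the standard-library `timeit` module; `best_time`, `slowdown`, and `stand_in` are hypothetical names, and `stand_in` is a placeholder kernel rather than a real ERFA routine.

```python
import math
import timeit


def best_time(func, *args, repeat=5, number=1000):
    """Return the best per-call time in seconds over `repeat` timing runs."""
    timer = timeit.Timer(lambda: func(*args))
    # Taking the minimum reduces the influence of background noise.
    return min(timer.repeat(repeat=repeat, number=number)) / number


def slowdown(candidate, baseline, *args, **kwargs):
    """Ratio of candidate time to baseline time (> 1 means candidate is slower)."""
    return best_time(candidate, *args, **kwargs) / best_time(baseline, *args, **kwargs)


def stand_in(x):
    """Placeholder kernel (assumption); swap in the library call under test."""
    return math.sin(x) * math.cos(x)


if __name__ == "__main__":
    per_call = best_time(stand_in, 0.5)
    print(f"{per_call * 1e9:.0f} ns per call")
```

The same per-call figure measured against each build of the library gives the slowdown factor directly.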

@helgee
Member

helgee commented Jun 3, 2021

Which compiler flags did you use?

@giordano
Member

giordano commented Jun 3, 2021

Looking at the logs of ERFA_jll (they are in all tarballs), the library was compiled with -g -O2. CMake configured with -DCMAKE_BUILD_TYPE=Release would instead typically use -O3, which can make quite a difference.

@dcherrie-dstl
Author

I used Visual Studio 2019 and set the target to 'x64-Clang-Release'. Please see below a snippet of the Ninja build output (build.ninja). Not sure if it's helpful; I don't have much experience with build systems and compilation.

build src\CMakeFiles\erfa.dir\apcg13.c.obj: C_COMPILER__erfa_RelWithDebInfo ......\src\apcg13.c || cmake_object_order_depends_target_erfa
DEFINES = -DHAVE_CONFIG_H -Derfa_EXPORTS
FLAGS = -m64 -fdiagnostics-absolute-paths /DWIN32 /D_WINDOWS /W3 /MD /Zi /O2 /Ob1 /DNDEBUG
INCLUDES = -I. -I......\src
OBJECT_DIR = src\CMakeFiles\erfa.dir
OBJECT_FILE_DIR = src\CMakeFiles\erfa.dir
TARGET_COMPILE_PDB = src\CMakeFiles\erfa.dir
TARGET_PDB = src\erfa.pdb
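To check which optimisation level a build actually used, the `FLAGS` line can be scanned for optimisation switches. The following is a rough sketch, assuming GCC/Clang-style `-O*` and MSVC-style `/O*` spellings; `optimisation_flags` is a hypothetical helper, not part of any build tool.

```python
import re


def optimisation_flags(flags_line):
    """Return optimisation switches ('-O*' or '/O*') found in a flags line."""
    # Drop an optional "FLAGS =" prefix, then split into individual tokens.
    tokens = flags_line.split("=", 1)[-1].split()
    return [t for t in tokens if re.fullmatch(r"[-/]O\w+", t)]


flags = ("FLAGS = -m64 -fdiagnostics-absolute-paths /DWIN32 /D_WINDOWS "
         "/W3 /MD /Zi /O2 /Ob1 /DNDEBUG")
print(optimisation_flags(flags))  # ['/O2', '/Ob1']
```

Applied to the snippet above, this picks out `/O2` and `/Ob1`, alongside the `RelWithDebInfo` rule name on the `build` line.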

@helgee
Member

helgee commented Aug 10, 2021

Thanks for posting the build log @dcherrie-dstl. I can see that your binaries were compiled with -O2 as well. It seems that Clang produces more performant binaries than our GCC cross-compiler. Could this be the case @giordano? I guess there is not much we can do here?

Out of curiosity, if you use the DLL from Clang and run the tests (import Pkg; Pkg.test("ERFA")), do they pass?

@giordano
Member

Yes, it's possible that Clang is doing a better job at building more performant binaries, but a factor of 2-3x looks a bit surprising. Maybe GCC is being very conservative about what to optimise? I think that trying to build with -O3, for example by switching to the CMake build system, is worth a try.

@helgee
Member

helgee commented Aug 10, 2021

Unfortunately, the CMake build system has not been merged yet. I will give it a try locally on macOS.

@giordano
Member

Ugh, right, I missed that 😅

@helgee
Member

helgee commented Aug 10, 2021

On macOS, the time taken to run the tests is virtually identical with -O2 and -O3.
