
Fix missing MPI in custom-matrix-format example #1370

Closed
wants to merge 4 commits into from

Conversation

MarcelKoch
Member

This PR fixes the missing MPI dependency in the custom-matrix-format example when the MPI compiler wrapper is used as the CXX compiler. Two fixes are implemented here:

  1. Remove the <ginkgo/ginkgo.hpp> include from the CUDA file, so the MPI headers are no longer required when compiling the CUDA code.
  2. Use the CMake option `MPI_ASSUME_NO_BUILTIN_MPI` to disable using the MPI support built into the compiler wrapper.

Both of these fix the issue. I think having both is a good idea, because (1) the include is simply not necessary, and (2) the CMake option would allow users to include <ginkgo/ginkgo.hpp> in device code.

Using MPI from the MPI compiler wrapper directly means that no include paths or libraries are added for MPI by CMake. This causes errors when a file that is not compiled by the wrapper, but includes or links against Ginkgo, cannot find the MPI headers or implementation. This mostly affects device code, since it will most likely not be compiled with the MPI wrapper compilers.
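
As a rough illustration of the second fix, here is a minimal CMake sketch (variable placement, target, and file names are assumptions, not the exact PR diff):

```cmake
# Sketch only: tell FindMPI to ignore any MPI support built into the compiler
# wrapper, so that explicit include directories and libraries are attached to
# the imported MPI::MPI_CXX target.
set(MPI_ASSUME_NO_BUILTIN_MPI ON)
find_package(MPI REQUIRED COMPONENTS CXX)

# A CUDA file compiled by nvcc (rather than mpicxx) now still finds <mpi.h>
# when it links against the imported target. The target and file names below
# are hypothetical.
add_executable(custom-matrix-format custom-matrix-format.cpp kernels.cu)
target_link_libraries(custom-matrix-format PRIVATE Ginkgo::ginkgo MPI::MPI_CXX)
```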
@MarcelKoch MarcelKoch self-assigned this Jul 12, 2023
@ginkgo-bot ginkgo-bot added the reg:build (related to the build system) and reg:example (related to the examples) labels Jul 12, 2023
@MarcelKoch MarcelKoch requested a review from a team July 12, 2023 08:10
@MarcelKoch
Member Author

Fixes part of #1367

@yhmtsai
Member

yhmtsai commented Jul 12, 2023

Sorry, I do not get the usage of MPI_ASSUME_NO_BUILTIN_MPI.
Could you elaborate on the two cases, gcc vs. mpicc?

upsj
upsj previously approved these changes Jul 12, 2023
Member

@upsj upsj left a comment

LGTM! One minor nit (we already have way too many configuration options)

Comment on lines +18 to +19
set(GINKGO_MPI_ASSUME_NO_BUILTIN_MPI ON CACHE BOOL "Disables using MPI wrapper compiler directly.")
mark_as_advanced(GINKGO_MPI_ASSUME_NO_BUILTIN_MPI)
Member

Do we need to make this controllable at all? It seems to be relevant only in very limited circumstances, where either mpiexec is missing from the environment, or the user has an MPI compiler wrapper loaded that is not the one specified by CXX_COMPILER.

Member Author

TBH, I'm not 100% sure how CMake finds MPI with this option on. Perhaps there are some weird cases where it doesn't use the same MPI as the MPI wrapper would have, so I added the option to turn it off. I also marked it as advanced to indicate that users generally need not worry about it.

@MarcelKoch
Member Author

@yhmtsai If you set CXX=mpicxx, then compiling with it doesn't require adding include paths or libraries for MPI. CMake picks that up when it finds the MPI package, so on the CMake level, linking to MPI doesn't require any additional flags. This leads to errors when something is compiled without mpicxx but still includes or links against MPI. The MPI_ASSUME_NO_BUILTIN_MPI flag tells CMake to ignore the MPI compiler wrapper and add the needed include paths and libraries regardless.
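
To illustrate the two cases from the question above, a hedged sketch (the exact variable contents depend on the CMake version and MPI installation):

```cmake
# Case 1, CXX=g++: FindMPI has to resolve the MPI flags itself, so
# MPI_CXX_INCLUDE_DIRS and MPI_CXX_LIBRARIES are populated and attached to the
# MPI::MPI_CXX target.
# Case 2, CXX=mpicxx: FindMPI may assume the wrapper already handles MPI and
# leave these variables empty, which breaks translation units compiled by a
# different compiler (e.g. nvcc).
find_package(MPI REQUIRED COMPONENTS CXX)
message(STATUS "MPI include dirs: '${MPI_CXX_INCLUDE_DIRS}'")  # may be empty with a wrapper
message(STATUS "MPI libraries:    '${MPI_CXX_LIBRARIES}'")     # may be empty with a wrapper

# Setting the hint before find_package forces explicit flags in both cases:
# set(MPI_ASSUME_NO_BUILTIN_MPI ON)
```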

@upsj
Member

upsj commented Jul 12, 2023

This whole setting has me a bit confused: theoretically, CMake should interrogate the compiler wrapper for the location of the mpi.h file and add the include flag to the MPI target, but that doesn't seem to be happening here? From FindMPI.cmake: "# Given the new design of FindMPI, INCLUDE_DIRS will always be located, even under implicit linking."

@MarcelKoch
Member Author

@upsj The comment from FindMPI.cmake is not in the official documentation, so I don't think we can rely on it. I tried again (without MPI_ASSUME_NO_BUILTIN_MPI) with the latest CMake release (3.26.4), and it still does not set the include directories.
So maybe that is a bug in CMake, or the comment is just not very clear.

@upsj
Member

upsj commented Jul 13, 2023

@MarcelKoch yes, I'd really like to get a full trace of the execution, because this seems like a bug to me. For this to be fixed, we need MPI_CUDA_INCLUDE_DIRS/MPI_CXX_COMPILER_INCLUDE_DIRS to be set, which seems to be missing.

@MarcelKoch
Member Author

@upsj I will open a CMake issue to see what's going on. From a brief look at the source, you're right: it always sets the include directories, but if a compiler wrapper is used, the value of the variable it uses is empty. So the source code comment is moot.

@MarcelKoch
Member Author

Here is the issue, let's see what happens: https://gitlab.kitware.com/cmake/cmake/-/issues/25085

@upsj
Member

upsj commented Jul 17, 2023

I no longer think this fix is necessary; instead, we should be propagating the C++ compiler as the CUDA host compiler correctly.
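
A possible sketch of that more general approach, assuming the host compiler is not already set elsewhere (this is not the actual patch):

```cmake
# Sketch only: make nvcc use the same (possibly MPI-aware) host compiler as the
# rest of the build, so CUDA translation units inherit the wrapper's implicit
# MPI include paths and link flags. This has to happen before the CUDA language
# is enabled.
if(NOT DEFINED CMAKE_CUDA_HOST_COMPILER AND NOT DEFINED ENV{CUDAHOSTCXX})
    set(CMAKE_CUDA_HOST_COMPILER "${CMAKE_CXX_COMPILER}" CACHE FILEPATH
        "Host compiler used by the CUDA compiler")
endif()
```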

@upsj upsj dismissed their stale review July 17, 2023 08:37

there is a more general fix

@MarcelKoch
Member Author

Fixed by #1368 and #1205

@MarcelKoch MarcelKoch closed this Jul 21, 2023
Labels
reg:build (related to the build system), reg:example (related to the examples), type:distributed-functionality