Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workarounds for MPICH #2381

Merged
merged 10 commits into from
Oct 24, 2023
Merged

Workarounds for MPICH #2381

merged 10 commits into from
Oct 24, 2023

Conversation

mathomp4
Copy link
Member

@mathomp4 mathomp4 commented Sep 26, 2023

Description

This PR tracks various workarounds needed for MPICH as it does not support an interface in use mpi (that it should, see pmodels/mpich#6691) and a bug in how it handles ierror (fixed in repo, see pmodels/mpich#6693 and pmodels/mpich#6694). The latter will probably be in MPICH 4.1.3, but that might be a while off.

Note that this also requires updates to pFlogger which has the same non-support (Goddard-Fortran-Ecosystem/pFlogger#100) which is in develop there and will be released soon.

These were all pointed out to me by @climbfuji. I'm efforting to see if I can build a working MPICH stack on discover. It's been...a challenge.

Related Issue

Motivation and Context

Let's MAPL (and hopefully GEOS) build with MPICH. And makes the life of @climbfuji better. (Though we might need to backport this stuff to a previous release cycle if it all works)

How Has This Been Tested?

Ran with Intel Fortran/Intel MPI on discover and zero-diff. 🎉

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Trivial change (affects only documentation or cleanup)

Checklist:

  • I have tested this change with a run of GEOSgcm (if non-trivial)
  • I have added one of the required labels (0 diff, 0 diff trivial, 0 diff structural, non 0-diff)
  • I have updated the CHANGELOG.md accordingly following the style of Keep a Changelog

@mathomp4 mathomp4 added the 0 Diff The changes in this pull request have verified to be zero-diff with the target branch. label Sep 26, 2023
@mathomp4 mathomp4 self-assigned this Sep 26, 2023
@mathomp4
Copy link
Member Author

Well, this seems to be safe. Zero-diff in testing.

@mathomp4 mathomp4 marked this pull request as ready for review September 26, 2023 17:32
@mathomp4 mathomp4 requested review from a team as code owners September 26, 2023 17:32
@mathomp4
Copy link
Member Author

@climbfuji When you all use MAPL + MPICH, are you using our Oserver? If not, this might be enough for you (assuming you don't use pflogger as well).

@mathomp4
Copy link
Member Author

mathomp4 commented Oct 4, 2023

Note to @GEOS-ESM/mapl-team: I tested this on discover and zero-diff.

@climbfuji
Copy link

@mathomp4 I was successful with backporting the workaround to 2.35.2 and 2.40.3. There are other issues with mpich4 with the ufs-weather-model that prevent us from testing if those backports are good or not. But I can share them with you if you wish.

@mathomp4
Copy link
Member Author

@mathomp4 I was successful with backporting the workaround to 2.35.2 and 2.40.3. There are other issues with mpich4 with the ufs-weather-model that prevent us from testing if those backports are good or not. But I can share them with you if you wish.

@climbfuji Are the new updates in MAPL? If so, please do pass them along so I can incorporate them here.

Also, would you like us to make new releases of MAPL 2.35 and 2.40 with the changes?

@climbfuji
Copy link

@mathomp4 I was successful with backporting the workaround to 2.35.2 and 2.40.3. There are other issues with mpich4 with the ufs-weather-model that prevent us from testing if those backports are good or not. But I can share them with you if you wish.

@climbfuji Are the new updates in MAPL? If so, please do pass them along so I can incorporate them here.

Also, would you like us to make new releases of MAPL 2.35 and 2.40 with the changes?

No, they are patches that we run on those two versions in spack (our fork). I can create the PRs for that so that you can look at it).

@mathomp4
Copy link
Member Author

No, they are patches that we run on those two versions in spack (our fork). I can create the PRs for that so that you can look at it).

Oh...okay. Well, it should be a pretty easy release if needed. But if this one is good, then we can work on getting out MAPL 2.42 for future work.

Copy link
Collaborator

@tclune tclune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More checks - makes cmake initial step slower. :-(

@mathomp4 mathomp4 merged commit 7434997 into develop Oct 24, 2023
8 checks passed
@mathomp4 mathomp4 deleted the feature/mathomp4/mpich-workarounds branch October 24, 2023 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 Diff The changes in this pull request have verified to be zero-diff with the target branch.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants