Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Help with mixed-precision FMS (in GEOS) #1621

Open
mathomp4 opened this issue Dec 5, 2024 · 14 comments · May be fixed by #1625
Open

Help with mixed-precision FMS (in GEOS) #1621

mathomp4 opened this issue Dec 5, 2024 · 14 comments · May be fixed by #1625
Labels
bug Issue/PR that reports or fixes a given discovered bug question

Comments

@mathomp4
Copy link
Contributor

mathomp4 commented Dec 5, 2024

In GEOS, we've always done rather crazy things to support our use of FMS because we build our FV3 core at r4 but MOM6 requires r8. So we've built both fms_r4 and fms_r8 and then done CMake include and link shenanigans to get things working.

Since FMS has been working on mixed-precision support, I decided to try it out with GEOS. And I hit a build issue but I'm not sure if it's "You built FMS wrong" or "An interface is missing". I'm suspecting the latter, but it can easily be the former.1

First, I built FMS using (less the netcdf bits):

-D32BIT=ON -DFPIC=ON -DCONSTANTS=GEOS -DUSE_DEPRECATED_IO=ON

and that gave me a libfms_r4 library. This seemed to be what I want to do, right?

If so, when I go to build GEOS I get:

[  0%] Building Fortran object src/Components/@GEOSgcm_GridComp/GEOSogcm_GridComp/@GEOS_OceanGridComp/MOM6_GEOSPlug/mom6_cmake/CMakeFiles/mom6.dir/__/@mom6/config_src/infra/FMS2/MOM_diag_manager_infra.F90.o
/Users/mathomp4/Models/GEOSgcm-v12-FMSMixedPrec/src/Components/@GEOSgcm_GridComp/GEOSogcm_GridComp/@GEOS_OceanGridComp/MOM6_GEOSPlug/@mom6/config_src/infra/FMS2/MOM_diag_manager_infra.F90:444:71:

  444 |   call FMS_diag_field_add_attribute(diag_field_id, att_name, att_value)
      |                                                                       1
Error: There is no specific subroutine for the generic 'diag_field_add_attribute' at (1)
/Users/mathomp4/Models/GEOSgcm-v12-FMSMixedPrec/src/Components/@GEOSgcm_GridComp/GEOSogcm_GridComp/@GEOS_OceanGridComp/MOM6_GEOSPlug/@mom6/config_src/infra/FMS2/MOM_diag_manager_infra.F90:414:71:

  414 |   call FMS_diag_field_add_attribute(diag_field_id, att_name, att_value)
      |                                                                       1
Error: There is no specific subroutine for the generic 'diag_field_add_attribute' at (1)

Now, since I compile MOM6 as r8, then everything going into that routine would be r8 as well and from my reading of that routine in FMS:

!> @brief Add a attribute to the output field
!> @ingroup diag_manager_mod
INTERFACE diag_field_add_attribute
MODULE PROCEDURE diag_field_add_attribute_scalar_r
MODULE PROCEDURE diag_field_add_attribute_scalar_i
MODULE PROCEDURE diag_field_add_attribute_scalar_c
MODULE PROCEDURE diag_field_add_attribute_r1d
MODULE PROCEDURE diag_field_add_attribute_i1d
END INTERFACE diag_field_add_attribute

is that it doesn't have r4 and r8 variants of the real routines. But again, I might be building wrong.

Any advice/criticism is greatly appreciated.

Footnotes

  1. It could also be "You are building MOM6 wrong" as well. I see they have an OVERLOAD_R8 in MOM_diag_manager_infra.F90, but since we are explicitly compiling at r8, that wouldn't be necessary. (I did try it and got an ambiguous interface as expected.)

@bensonr bensonr added the bug Issue/PR that reports or fixes a given discovered bug label Dec 5, 2024
@bensonr
Copy link
Contributor

bensonr commented Dec 5, 2024

@mathomp4 - It sure looks like a bug sinceatt_value is real-typed at compile time. @uramirez8707 - your thoughts?

@mathomp4
Copy link
Contributor Author

mathomp4 commented Dec 5, 2024

@bensonr I did some Fortran overloading (lots of things r became r4 and r8) and I got past the above. (I can share my changes if you like.)

I'm now at a new one:

[  0%] Building Fortran object src/Components/@GEOSgcm_GridComp/GEOSogcm_GridComp/@GEOS_OceanGridComp/MOM6_GEOSPlug/mom6_cmake/CMakeFiles/mom6.dir/__/@mom6/config_src/drivers/FMS_cap/MOM_surface_forcing_gfdl.F90.o
/Users/mathomp4/Models/GEOSgcm-v12-FMSMixedPrec/src/Components/@GEOSgcm_GridComp/GEOSogcm_GridComp/@GEOS_OceanGridComp/MOM6_GEOSPlug/@mom6/config_src/drivers/FMS_cap/MOM_surface_forcing_gfdl.F90:1356:93:

 1356 |                  "The latent heat of fusion.", units="J/kg", default=hlf, scale=US%J_kg_to_Q)
      |                                                                                             1
Error: There is no specific subroutine for the generic 'get_param' at (1)
/Users/mathomp4/Models/GEOSgcm-v12-FMSMixedPrec/src/Components/@GEOSgcm_GridComp/GEOSogcm_GridComp/@GEOS_OceanGridComp/MOM6_GEOSPlug/@mom6/config_src/drivers/FMS_cap/MOM_surface_forcing_gfdl.F90:1358:93:

 1358 |                  "The latent heat of fusion.", units="J/kg", default=hlv, scale=US%J_kg_to_Q)
      |                                                                                             1
Error: There is no specific subroutine for the generic 'get_param' at (1)

which is interesting because get_param seems to be pure MOM6 at first glance. I'm still compiling MOM6 as r8, so that hasn't changed. So because FMS was built as "only" r4 that's making an interface in MOM6 fail that as far as I can see...doesn't care about FMS!

I might mention @marshallward (as he's who I think of when I think MOM6) as maybe he might know why it's reacting like this...

@uramirez8707
Copy link
Contributor

@mathomp4 - It sure looks like a bug sinceatt_value is real-typed at compile time. @uramirez8707 - your thoughts?

yep this is a bug, I would work on getting a fix in. Thanks for letting us know!

@marshallward
Copy link
Member

hlf and hlv are defined in FMS constants_mod, so you're probably looking at another r4 being pushed into an r8 function.

@bensonr
Copy link
Contributor

bensonr commented Dec 5, 2024

I'd recommend compiling FMS with default reals defined to be 64-bit (kind=8) quantities. In that way, constants_mod will default to (kind=8) versions and constantsR4_mod will have the (kind=4) complements. You will need to update your dynamical core to choose the correct constants module for proper precision at compile time. The current top of the trunk contains ifdefs for the constants use statements.

@mathomp4
Copy link
Contributor Author

mathomp4 commented Dec 6, 2024

@bensonr Do you mean this PR: NOAA-GFDL/GFDL_atmos_cubed_sphere#360

As you know our FV3 has sort of diverged from the mothership as @wmputman makes changes. But if that PR is all that's needed, shouldn't be too hard to port over and test.

Though, of course, if you've made other changes to FV3 to support mixed precision we don't have, then it could be just as bad in that direction.

Time to experiment!

@bensonr
Copy link
Contributor

bensonr commented Dec 6, 2024

@mathomp4 - that will get you a good chunk of the way towards your goal, though there may be a few constants declarations already in place prior to the FMS mixed-precision work.

@marshallward
Copy link
Member

marshallward commented Dec 6, 2024

I reckon MOM6 should also be converting hlf and hlv (and any other FMS numbers from constants_mod) to its configured real kind. We shouldn't be making any assumptions about FMS constants_mod kinds.

It might help things to go more smoothly. It might also expose more serious r4/r8 issues. 🤷

@mathomp4
Copy link
Contributor Author

mathomp4 commented Dec 6, 2024

@bensonr Hooboy. I might have to try and go the other direction (using fms_r4).

Using fms_r8 I've now hit the changes due to fms2_io_mod (I'm guessing I need something a la NOAA-GFDL/GFDL_atmos_cubed_sphere#74).

That said, eventually we'll need to move, so I'll try and see if I can get those in. I'm guessing they are nicely separate from any changes @wmputman made (since they don't seem to affect physics much)

@bensonr
Copy link
Contributor

bensonr commented Dec 6, 2024

@mathomp4 - there might have also been some fixes beyond the initial fms2-io -> FV3 PR applied to address inconsistencies found as time went on

@mathomp4
Copy link
Contributor Author

mathomp4 commented Dec 9, 2024

I have an update. Choosing the path of "least resistance", I decided to do this:

  1. Create a constants8 a la constants4 where the difference is the latter has #define RKIND r8_kind instead (plus all the 4s become 8s, etc.)
  2. Edit MOM_constants.F90 to have:
#ifdef CONSTANTSR8
use constantsr8_mod, only : HLV, HLF
#else
use constants_mod,   only : HLV, HLF
#endif

where I now use my new constantsr8_mod. So it builds as MOM6 now sees those as r8.

Good news! Our non-coupled run is zero-diff → Huzzah! Bad news! Our coupled run with MOM6 is non-zero-diff → Confusing!

As far as I can tell, all MOM6 cares about (from FMS constants) are hlv and hlf as @marshallward said. And in my stock run (MOM6 linked to FMS::fms_r8) and my mixed run (MOM6 linked to FMS::fms_r4), they are both r8. My guess is maybe the math that is in the FMS constants file is now at a slightly different precision and the ends of numbers are different. If I look at the first MOM Date print, Salt and Temp are slightly different:

Stock:

MOM Date   2000/09/03 00:00:00      0: En 1.176715E-05, 
MaxCFL  0.00000, Mass 1.369314297917E+21, 
Salt  34.72025632093, Temp   3.68928543243
...

Mixed:

MOM Date   2000/09/03 00:00:00      0: En 1.176715E-05, 
MaxCFL  0.00000, Mass 1.369314297917E+21, 
Salt  34.72025632115, Temp   3.68928543134
...

I suppose the questions now are:

  1. How amenable @bensonr, et al, would be to making a constants8 a la constants4
  2. How amenable @marshallward, et al, would be to having an...odd ifdef in MOM_constants.F90

I mean, I suppose for the latter to be really complete you could do:

#if defined(CONSTANTSR8)
use constantsr8_mod, only : HLV, HLF
#elif defined(CONSTANTSR4)
use constantsr4_mod, only : HLV, HLF
#else
use constants_mod,   only : HLV, HLF
#endif

and then every weird case could be covered. (Though I think I remember Santha or someone telling me once the ocean almost needs r8 to work, so maybe no one runs MOM6 at r4?)

@uramirez8707 uramirez8707 linked a pull request Dec 9, 2024 that will close this issue
8 tasks
@marshallward
Copy link
Member

We have merged a PR which converts hlf and hlv from constants_mod to r8 precision (NOAA-GFDL/MOM6#768). If this doesn't work for you, then you could resolve this issue within your driver. hlf and hlv only appear in driver code.

If that's not an option, then you could explicitly set them in MOM_input with LATENT_HEAT_FUSION and LATENT_HEAT_VAPORIZATION parameters. They are not strictly MOM6 parameters, but nearly every driver includes them. If you're careful about your bits, then you should be able to preserve whatever r4 value that you need. (This is probably the easiest solution.)

If neither of those work for you, then you should attend next week's MOM6 meeting so that we can work out a solution.

I think I remember Santha or someone telling me once the ocean almost needs r8 to work, so maybe no one runs MOM6 at r4?

Yes, MOM6 only works with r8. There are too many residual calculations that require more than 7 digits of accuracy. There may be some way to do mixed precision, but no has as yet determined exactly which parts requires r8.

@mathomp4
Copy link
Contributor Author

mathomp4 commented Dec 9, 2024

@marshallward Oohh. Even better if I don't need to touch MOM6 code. I should not go near it save "Hey, you misspelled this word" 😄 .

So then, I will back off my MOM6 change and then try your new one. I also see that @uramirez8707 has a change as well for FMS in #1625 .

Maybe with your powers combined, I'm Captain Plan...I mean, I will need to change nothing. :)

@mathomp4
Copy link
Contributor Author

Update: If I use the changes from @marshallward in NOAA-GFDL/MOM6#768 with my changes to diag_manager.F90 it builds and runs. So that's good news. I'm waiting until #1625 is finalized before I test that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue/PR that reports or fixes a given discovered bug question
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants