Dimensional scaling tests are producing chksum differences #275

Open
mnlevy1981 opened this issue Apr 12, 2024 · 3 comments

Comments

@mnlevy1981
Collaborator

I got the MARBL branch to pass dimensional scaling tests, but in doing so I noticed that some of the scaling tests produce chksum() differences for non-MARBL fields. My testing strategy was to run a baseline with DEBUG=True, and then run individual tests with DEBUG=True and one of the *_RESCALE_POWER parameters set to 10. The ocean.stats files matched for all these runs, but some of the cesm.log files reported differences in the log output:

  • T_RESCALE_POWER = 10:
@@ -606,8 +606,8 @@
 h-point: c=    194999 after KPP tv%frazil
 h-point: mean=   0.0000000000000000E+00 min=   0.0000000000000000E+00 max=   0.0000000000000000E+00 after KPP tv%salt_deficit
 h-point: c=         0 after KPP tv%salt_deficit
-h-point: mean=   4.4376542679371360E-05 min=  -6.4022985345054831E-03 max=   3.9322714284468409E-02 after KPP tv%TempxPmE
-h-point: c=   5041265 after KPP tv%TempxPmE
+h-point: mean=   4.5441579703676273E-02 min=  -6.5559536993336147E+00 max=   4.0266459427295651E+01 after KPP tv%TempxPmE
+h-point: c=   5071421 after KPP tv%TempxPmE
 h-point: mean=   6.6078014529134853E-04 min=   0.0000000000000000E+00 max=   3.0173242267400169E-01 after KPP Kd_heat
 h-point: c= 409916705 after KPP Kd_heat
 h-point: mean=   6.6058804378353185E-04 min=   0.0000000000000000E+00 max=   3.0173242267400169E-01 after KPP Kd_salt
  • L_RESCALE_POWER = 10
@@ -42499,9 +42499,9 @@
 h-point: mean=   2.4352442695468348E+04 min=   1.7138516846630746-143 max=   6.9861061588048920E+04 MEKE LmixScale
 h-point: c=   5490514 MEKE LmixScale
 h-point: mean=   3.4859618873247925E-12 min=  -2.8924527398703980E-07 max=   2.6980998412243935E-07 MEKE src
-h-point: c=   5031269 MEKE src
+h-point: c=   5031240 MEKE src
 h-point: mean=   1.1141024840786339E-02 min=   0.0000000000000000E+00 max=   4.5810243321258808E+00 MEKE post-update MEKE
-h-point: c=   5132447 MEKE post-update MEKE
+h-point: c=   5132449 MEKE post-update MEKE
 h-point: mean=   4.3433588469983327E+01 min=   2.3507772963993505E-04 max=   2.5543838125970015E+02 Pre-advection h
 h-point: c= 359916601 sw= 360000603 se= 360000603 nw= 359832599 ne= 359832599 Pre-advection h
 u-point: mean=  -6.2482021116265647E+07 min=  -1.7043868928096725E+10 max=   1.0955338455112007E+10 u Pre-advection uhtr
  • C_RESCALE_POWER = 10
@@ -8406,10 +8406,10 @@
 h-point: c= 393247078 Before tracer diffusion coccoFe
 h-point: mean=   6.3839077495849300E-03 min=   9.9958796553452880-101 max=   2.7936481413543470E+00 Before tracer diffusion coccoCaCO3
 h-point: c= 396183690 Before tracer diffusion coccoCaCO3
-h-point: mean=   5.9419608913172093E+00 min=  -2.1183601956327172E+00 max=   3.2145597947819027E+01 before HBD temp
-h-point: c= 306692919 before HBD temp
-h-point: mean=   5.9419608946613094E+00 min=  -2.1183601956327172E+00 max=   3.2145507047488380E+01 after HBD temp
-h-point: c= 306689751 after HBD temp
+h-point: mean=   5.8026961829269622E-03 min=  -2.0687111285475753E-03 max=   3.1392185495917019E-02 before HBD temp
+h-point: c= 333613467 before HBD temp
+h-point: mean=   5.8026961861926850E-03 min=  -2.0687111285475753E-03 max=   3.1392096726062871E-02 after HBD temp
+h-point: c= 333610299 after HBD temp
 h-point: mean=   2.8279382711228873E+01 min=   0.0000000000000000E+00 max=   4.0748130640758944E+01 before HBD salt
 h-point: c= 266237499 before HBD salt
 h-point: mean=   2.8279382712458503E+01 min=   0.0000000000000000E+00 max=   4.0748130640758944E+01 after HBD salt
  • S_RESCALE_POWER = 10
@@ -8410,10 +8410,10 @@
 h-point: c= 306692919 before HBD temp
 h-point: mean=   5.9419608946613094E+00 min=  -2.1183601956327172E+00 max=   3.2145507047488380E+01 after HBD temp
 h-point: c= 306689751 after HBD temp
-h-point: mean=   2.8279382711228873E+01 min=   0.0000000000000000E+00 max=   4.0748130640758944E+01 before HBD salt
-h-point: c= 266237499 before HBD salt
-h-point: mean=   2.8279382712458503E+01 min=   0.0000000000000000E+00 max=   4.0748130640758944E+01 after HBD salt
-h-point: c= 266239376 after HBD salt
+h-point: mean=   2.7616584678934446E-02 min=   0.0000000000000000E+00 max=   3.9793096328866157E-02 before HBD salt
+h-point: c= 325452834 before HBD salt
+h-point: mean=   2.7616584680135257E-02 min=   0.0000000000000000E+00 max=   3.9793096328866157E-02 after HBD salt
+h-point: c= 325454711 after HBD salt
 h-point: mean=  -1.8060261735762620E-01 min=  -1.0000000000000000E+00 max=   1.1415525114155259E-04 before HBD age
 h-point: c= 343965535 before HBD age
 h-point: mean=  -1.8060261735762584E-01 min=  -1.0000000000000000E+00 max=   1.1415581285044829E-04 after HBD age
  • Z_RESCALE_POWER = 10: no diffs in log output
  • H_RESCALE_POWER = 10: no diffs in log output
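The log comparison described above can be sketched in a few lines of Python. This is only an illustration, not the script used for these tests: the helper names (`chksum_lines`, `differing_lines`) are hypothetical, and it assumes both logs contain the same sequence of `*-point:` debug lines so that they can be compared position by position. The sample text is copied from the L_RESCALE_POWER diff.

```python
import re

# Debug lines of interest start with e.g. "h-point:" or "u-point:".
POINT_LINE = re.compile(r"^\s*[A-Za-z]+-point:")

def chksum_lines(log_text):
    """Keep only the *-point debug lines, with runs of whitespace collapsed."""
    return [" ".join(line.split())
            for line in log_text.splitlines()
            if POINT_LINE.match(line)]

def differing_lines(baseline_text, test_text):
    """Pair up debug lines by position and return the pairs that differ.

    Assumes both logs emit the same debug lines in the same order.
    """
    base = chksum_lines(baseline_text)
    test = chksum_lines(test_text)
    return [(b, t) for b, t in zip(base, test) if b != t]

# Two short excerpts: baseline run vs. L_RESCALE_POWER = 10 run.
baseline = """\
 h-point: mean=   3.4859618873247925E-12 min=  -2.8924527398703980E-07 max=   2.6980998412243935E-07 MEKE src
 h-point: c=   5031269 MEKE src
"""
rescaled = """\
 h-point: mean=   3.4859618873247925E-12 min=  -2.8924527398703980E-07 max=   2.6980998412243935E-07 MEKE src
 h-point: c=   5031240 MEKE src
"""

diffs = differing_lines(baseline, rescaled)
for b, t in diffs:
    print("-", b)
    print("+", t)
```

For real runs, the two strings would be replaced by the contents of the baseline and rescaled cesm.log files.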

I did not see these differences when running single-column MARBL tests using solo_driver, but it's not clear whether that means the missing scaling is in the NUOPC cap or whether it has to do with the different parameterizations we are using. Given the bit-for-bit history file output, though, it does seem likely that the problem is a missing scale argument in the hchksum() call rather than an actual scaling issue (although it is confusing that the MEKE diff shows up only in the checksum value, not in the mean, min, or max).
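One quick way to test the missing-scale-argument hypothesis is to check whether the mean/min/max values in the T, C, and S diffs differ by an exact power of two. The sketch below does that with values copied from the diffs above; the only assumption is that a *_RESCALE_POWER of 10 corresponds to a factor of 2**10, which is what these numbers suggest rather than something stated in this thread.

```python
import math

# (label, baseline value, value from the rescaled run), copied from the diffs.
pairs = [
    ("T: TempxPmE mean", 4.4376542679371360E-05, 4.5441579703676273E-02),
    ("T: TempxPmE min", -6.4022985345054831E-03, -6.5559536993336147E+00),
    ("T: TempxPmE max", 3.9322714284468409E-02, 4.0266459427295651E+01),
    ("C: before HBD temp mean", 5.9419608913172093E+00, 5.8026961829269622E-03),
    ("S: before HBD salt mean", 2.8279382711228873E+01, 2.7616584678934446E-02),
]

def power_of_two(baseline, rescaled):
    """Return log2 of the ratio between the rescaled and baseline values."""
    return math.log2(rescaled / baseline)

powers = {label: power_of_two(b, r) for label, b, r in pairs}
for label, p in powers.items():
    print(f"{label}: ratio = 2**{p:.6f}")
```

The T values come out larger by a factor of 2**10 and the C and S values smaller by the same factor, which is what unscaled debugging output of otherwise identical fields would look like.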

@mnlevy1981
Collaborator Author

Neither Q_RESCALE_POWER = 10 nor R_RESCALE_POWER = 10 shows log differences, so it's just T, L, C, and S.

@Hallberg-NOAA
Copy link

There is a new pull request (NOAA-GFDL#620, which is headed to main via dev/gfdl) that might partially address this issue. It corrects some bugs (including dimensional rescaling factors involving Z and L) in the calculation of the MEKE source terms, in a form that I believe is being used at NCAR.

There is a second recent pull request (NOAA-GFDL#601, also headed to main via dev/gfdl) that adds comments noting several dimensional inconsistencies within vertFPmix() and also some checksums that are missing the appropriate scale arguments. This PR notes the problems but does not correct them because we are not using this subroutine at GFDL, and we wanted to leave it to NCAR to decide how best to correct these issues (e.g., with a ..._BUG flag or just fix them) with minimal disruptions to your ongoing MOM6 simulations.

I suspect that these two pull requests might go a long way toward addressing this issue.

@mnlevy1981
Collaborator Author

Thanks @Hallberg-NOAA, this is very helpful! Since these issues are only in the chksum output and don't affect ocean.stats or what is written to history files, the CESM-based dimensional scaling tests all pass. I think this means (1) we're happy to fix things inline without a _BUG flag, and (2) this is a fairly low priority, so I'm happy to wait for a few updates to trickle onto dev/ncar from dev/gfdl before (eventually) fixing the issues noted in NOAA-GFDL#601.

Hallberg-NOAA added a commit to Hallberg-NOAA/MOM6 that referenced this issue Jul 14, 2024
  Added missing scale arguments to the hchksum and global_mass_integral calls
for debugging in hor_bnd_diffusion, so that they now give messages to stdout
that do not change when tracers (including temperature and salinity) are
rescaled.  Also added a missing debuggingParam argument to the get_Param call
for HBD_DEBUG so that it will be logged in MOM_parameter_doc.debugging rather
than MOM_parameter_doc.all.  This commit partially addresses the scaling
problems that were noted in github.com/NCAR/issues/275.  All solutions are
bitwise identical, but some debugging output can change to become more robust.
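To illustrate why a missing scale argument changes only the debugging messages, here is a toy sketch. `field_stats` is a hypothetical stand-in, not the MOM6 hchksum routine, and the rescaling direction (internal values are the physical values times 2**-10) is an assumption chosen to match the C_RESCALE_POWER diff earlier in this issue. The field values are the mean, min, and max reported for temperature before HBD.

```python
def field_stats(field, scale=1.0):
    """Hypothetical debugging report: (mean, min, max) of field * scale."""
    values = [v * scale for v in field]
    return (sum(values) / len(values), min(values), max(values))

RESCALE_POWER = 10

# Toy "temperature" field in physical units.
physical = [5.9419608913172093, -2.1183601956327172, 32.145597947819027]

# Assumed internal representation in the rescaled run.
internal = [v * 2.0 ** -RESCALE_POWER for v in physical]

baseline_report = field_stats(physical)                          # unrescaled run
unscaled_report = field_stats(internal)                          # scale argument missing
scaled_report = field_stats(internal, scale=2.0 ** RESCALE_POWER)  # scale argument supplied

print("baseline:     ", baseline_report)
print("missing scale:", unscaled_report)
print("with scale:   ", scaled_report)
```

Because multiplying by a power of two is exact in floating point (barring overflow or underflow), the report with the scale argument matches the baseline exactly, while the report without it is off by the rescaling factor even though the underlying solution is unchanged.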
alperaltuntas pushed a commit that referenced this issue Jul 16, 2024