Remove Noah-WRFv4, bug fix in Thompson MP inner loop, update HWRF regression tests, set ECF_TRIES to 2, update Cheyenne job submission scripts; contains "Reduce memory required by MERRA2 option; fix diag_tables for P7 tests; adds C384 P7 tests to Cheyenne.intel" (#866), contains "Update CMEPS" (#775) #831

Merged

Collaborator

@climbfuji climbfuji commented Sep 24, 2021

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • If new or updated input data is required by this PR, it is clearly stated in the text of the PR.

Description

This PR removes unused/broken HAFS-HWRF regression tests that used Ferrier-Aligo MP, and updates the old HAFS-HWRF regression tests to use Noah MP instead of Noah-WRFv4 as instructed by @yangfanglin. It is understood that these old HAFS-HWRF regression tests will eventually be replaced with new HAFS regression tests.

The submodule pointers for fv3atm and ccpp-physics are updated to remove Noah-WRFv4 from fv3atm/ccpp-physics and to fix a bug in Thompson MP when the inner loop logic is used.

The changes in this PR and associated PRs require new baselines for the following four tests:

fv3_HAFS_v0_hwrf_thompson_debug
fv3_HAFS_v0_hwrf_thompson
fv3_esg_HAFS_v0_hwrf_thompson_debug
fv3_esg_HAFS_v0_hwrf_thompson

All other regression tests are b4b reproducible against the current baselines (tested on Hera with Intel and GNU). Note in particular that the Thompson MP results do not change, because the inner loop logic is not used in the regression tests.
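
As an illustration of what "b4b reproducible" means here, the following is a minimal sketch of a byte-for-byte comparison between a baseline directory and a run directory. The function name `compare_b4b` and the flat directory layout are hypothetical; this is not the comparison code used by the regression test scripts, only a sketch of the idea that every output file must be identical to its baseline counterpart.

```shell
#!/usr/bin/env bash
# Sketch only: byte-for-byte (b4b) comparison of run output against a baseline.
# compare_b4b and the directory layout are hypothetical, not taken from the
# actual regression test framework.

# compare_b4b BASELINE_DIR RUN_DIR
# Prints one status line per baseline file and returns non-zero if any file
# differs or is missing from the run directory.
compare_b4b() {
  local baseline_dir=$1 run_dir=$2 status=0 f name
  for f in "$baseline_dir"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    if [ ! -f "$run_dir/$name" ]; then
      echo "MISSING $name"
      status=1
    elif cmp -s "$f" "$run_dir/$name"; then
      # cmp -s exits 0 only when the two files are identical byte for byte.
      echo "OK $name"
    else
      echo "DIFF $name"
      status=1
    fi
  done
  return "$status"
}
```

Under this sketch, a test whose baseline must be regenerated would show at least one `DIFF` line, while a b4b reproducible test would show only `OK` lines.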

The PR also contains an update from @DusanJovic-NOAA that uses the automatic rerun capability for ecflow: ECF_TRIES is set to 2 in this PR, which will attempt to rerun a failed test once before reporting a failure. This is to avoid unnecessary reruns of the entire regression test suite if single tests timed out.
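
The retry semantics described above (two tries in total, meaning one rerun after a first failure) can be sketched in plain shell. This is only an illustration of the behavior, not how ecflow implements `ECF_TRIES`; the helper name `run_with_tries` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: mimics "try up to N times" semantics in plain shell.
# run_with_tries is a hypothetical helper, not part of ecflow or rt.sh.

# run_with_tries MAX_TRIES COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have been made.
run_with_tries() {
  local max_tries=$1 attempt=1
  shift
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_tries" ]; then
      echo "failed after $attempt attempt(s)" >&2
      return 1
    fi
    attempt=$((attempt + 1))
  done
}
```

With a maximum of 2, a command that fails once (for example because of a timeout) and then succeeds is reported as a success, while a command that fails twice is reported as a failure.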

Further, this PR as of 2021/10/28 contains the following changes:

No new input data required.

Issue(s) addressed

Testing

Tested on Hera/Intel on 2021/10/25 against the official baseline; only the four tests mentioned above failed with b4b differences.

How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss_cray
  • wcoss_dell_p3
  • CI - f55e550

Dependencies

DeniseWorthen and others added 30 commits March 27, 2021 12:30
This reverts commit 7b826d4.
@DeniseWorthen
Collaborator

DeniseWorthen commented Oct 29, 2021

I had a compile failure for the hafs_regional_atm and hafs_regional_atm_ocn tests on cray because this PR requires esmf8.2bs14. The update to 8.2bs14 was done in PR #776, but it looks like the module list was not updated on cray. This PR adds a new NUOPC call in CMEPS, which now fails with the older 811 esmf.

I added the library in the ufs_wcoss_cray modulefile:

diff --git a/modulefiles/ufs_wcoss_cray b/modulefiles/ufs_wcoss_cray
index 01ca6e76..9586ca24 100644
--- a/modulefiles/ufs_wcoss_cray
+++ b/modulefiles/ufs_wcoss_cray
@@ -63,7 +63,7 @@ module load gni-headers
 module load udreg
 module load ugni

-module load esmf/811
+module load esmf/820bs14
 module load fms/2021.03

 module swap pmi pmi/5.0.11

In compile_001/err, the first occurrence of ESMF is:

+ export ESMFMKFILE=/gpfs/hps/usrx/local/nceplibs/NCEPLIBS/cmake/install/NCEPLIBS-v1.3.0/esmf/esmf-820bs14/lib/esmf.mk

But then:

CMake Error at /gpfs/hps3/emc/nems/noscrub/emc.nemspara/soft/cmake/cmake-3.20.1-linux-x86_64/share/cmake-3.20/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find ESMF (missing: ESMF_LIBRARY_LOCATION) (found version
  "8.2.0")
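
One way to narrow down an error like this is to inspect the file that `ESMFMKFILE` points to. The sketch below assumes that `esmf.mk` defines `ESMF_VERSION_STRING` and `ESMF_LIBSDIR`, and that the missing `ESMF_LIBRARY_LOCATION` corresponds to no `libesmf.*` file being found in that directory; the second point is an assumption about the CMake find module, not something confirmed in this thread. The helper names `esmf_mk_var` and `check_esmf_mk` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: inspect an esmf.mk file and check for the ESMF library.
# esmf_mk_var and check_esmf_mk are hypothetical helpers; the link between
# ESMF_LIBSDIR and the CMake variable ESMF_LIBRARY_LOCATION is an assumption.

# esmf_mk_var FILE NAME
# Prints the value assigned to NAME in a "NAME=value" style makefile fragment.
esmf_mk_var() {
  sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# check_esmf_mk FILE
# Reports the version and library directory, and whether a libesmf.* file exists.
check_esmf_mk() {
  local mk=$1 version libsdir
  [ -f "$mk" ] || { echo "esmf.mk not found: $mk"; return 1; }
  version=$(esmf_mk_var "$mk" ESMF_VERSION_STRING)
  libsdir=$(esmf_mk_var "$mk" ESMF_LIBSDIR)
  echo "version=$version"
  echo "libsdir=$libsdir"
  if ls "$libsdir"/libesmf.* >/dev/null 2>&1; then
    echo "library=found"
  else
    echo "library=missing"
    return 1
  fi
}
```

It could be called as `check_esmf_mk "$ESMFMKFILE"` in the same environment as the failing compile job. A reported version together with `library=missing` would be consistent with the CMake message above, where the version is found but the library location is not.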

@junwang-noaa
Collaborator

junwang-noaa commented Oct 29, 2021 via email

@climbfuji
Collaborator Author

Denise, thanks for catching the wrong esmf lib on cray. Let's check with nceplibs team if esmf820bs14 is installed on cray

Yes, thanks Denise, also for running the wcoss tests in general. After cray is done, we are ready to merge. That was a quick one then, despite creating new baselines.

@DeniseWorthen
Collaborator

Just started the cray verify step.

@DeniseWorthen
Collaborator

CMEPS has been merged; hash 5beead0

@climbfuji
Collaborator Author

CMEPS has been merged; hash 5beead0

CMEPS submodule pointer updated, waiting for fv3atm to be merged

@climbfuji
Collaborator Author

fv3atm hash is correct (4214e6d); waiting to resolve a .gitmodules change question. Please all review carefully once more, since there are many unrelated changes.

@DeniseWorthen
Collaborator

Double checked again. All my changes are present as expected. Thanks for pulling all these in together.

Collaborator

@junwang-noaa junwang-noaa left a comment

Thanks for including the ECF_TRIES update; I hope it will resolve the RT problems caused by the hera slurm issue.

@climbfuji
Collaborator Author

Should be good now!

Contributor

@BinLiu-NOAA BinLiu-NOAA left a comment

Looks good to me.

@junwang-noaa junwang-noaa merged commit e1cfb05 into ufs-community:develop Oct 29, 2021
Labels
Baseline Updates Current baselines will be updated. Waiting for Reviews The PR is waiting for reviews from associated component PR's.
6 participants