Wrapper PR for: Thompson inner loop, Thompson subcycling bugfix, remove snet from noah lsm, fix time dimension in restart files, rt.sh bugfix for PBS, and more! #702

climbfuji
Collaborator

@climbfuji climbfuji commented Jul 22, 2021

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • [n/a] If new or updated input data is required by this PR, it is clearly stated in the text of the PR.

Description

This PR combines many changes:

  • update of submodule pointers for fv3atm and ccpp-physics:
  • Bug fix in rt_utils.sh to correctly detect job failures on Cheyenne with PBS; turn off coupled debug compilation/tests for GNU, which now fail, and likewise test control_csawmg; from @DeniseWorthen and @DusanJovic-NOAA
  • Turn on several debug tests in rt.conf and rt_gnu.conf from @junwang-noaa
    • Note: control_stochy_debug and control_ca_debug are still not working with GNU
  • Don't create baseline for control_stochy_restart run (in rt.conf) from @junwang-noaa

The changes in fv3atm and ccpp-physics change the results of all regression tests; the new baseline date is 20210721.
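
To illustrate the kind of check the rt_utils.sh fix is about, here is a minimal sketch of deciding whether a finished PBS job failed. It is not the code from this PR: the function names `pbs_exit_status` and `pbs_job_succeeded` are hypothetical, and it assumes the scheduler keeps job history so that `qstat -x -f <jobid>` still prints an `Exit_status` attribute after the job has ended.

```shell
# Hypothetical helpers, not the actual rt_utils.sh implementation.

# Reads "qstat -x -f <jobid>"-style text on stdin and prints the value of
# the Exit_status attribute. Prints nothing if the attribute is absent.
pbs_exit_status() {
  sed -n 's/^[[:space:]]*Exit_status[[:space:]]*=[[:space:]]*\(-\{0,1\}[0-9][0-9]*\).*$/\1/p' | head -n 1
}

# Succeeds (returns 0) only if the job record on stdin shows Exit_status = 0.
# A missing Exit_status counts as a failure, so an unreadable record is
# never silently reported as a pass.
pbs_job_succeeded() {
  pbs_status=$(pbs_exit_status)
  [ -n "${pbs_status}" ] && [ "${pbs_status}" -eq 0 ]
}
```

A caller could use it as `qstat -x -f "${jobid}" | pbs_job_succeeded || echo "job ${jobid} failed"`, where `jobid` is whatever identifier `qsub` returned. Treating a missing status as a failure is a deliberate choice here: the problem described in #697 was a failure that went unreported.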

Issue(s) addressed

Fixes #697

Testing

How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss_cray
  • wcoss_dell_p3
  • CI

Dependencies

NCAR/ccpp-physics#702
NOAA-EMC/fv3atm#350
#702

…_utils.sh to correctly detect failed jobs on Cheyenne, comment out coupled debug tests in rt_gnu.conf, update BL_DATE in rt.sh
@climbfuji climbfuji changed the title Wrapper PR for: Thompson inner loop, Thompson subcycling bugfix, remove snet from noah lsm, fix time dimension in restart files, rt.sh bugfix for PBS Wrapper PR for: Thompson inner loop, Thompson subcycling bugfix, remove snet from noah lsm, fix time dimension in restart files, rt.sh bugfix for PBS, and more! Jul 22, 2021
@climbfuji climbfuji marked this pull request as ready for review July 22, 2021 16:19
@climbfuji climbfuji added Baseline Updates Current baselines will be updated. Waiting for Reviews The PR is waiting for reviews from associated component PR's. run-ci labels Jul 22, 2021
@github-actions github-actions bot removed the run-ci label Jul 22, 2021
@climbfuji climbfuji self-assigned this Jul 22, 2021
@climbfuji
Collaborator Author

CI tests reported failure of:

FAILED TESTS: 
Test rst cpld_control failed 
UNIT TEST FAILED

@@ -64,8 +66,8 @@ RUN | fv3_esg_HAFS_v0_hwrf_thompson_debug
COMPILE | -DAPP=S2S -DCCPP_SUITES=FV3_GFS_2017_coupled | | fv3 |
Collaborator

This compile will also fail on cheyenne.gnu. Should we just add -cheyenne.gnu to cpld_control, cpld_debug and ng-gdas-nemsdatm, since they compile and run on hera.gnu?

Collaborator Author

Yes, this is what I did in my first round of bug fixes - thanks. Let's see if the bug fixes in FV3GFS_io.F90 resolve the restart issue for the coupled tests.
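
As a rough illustration of what a `-cheyenne.gnu` entry in the machine field of an rt.conf line means, the sketch below decides whether an entry applies to the current machine. It is an assumption-laden sketch, not the logic in rt.sh: the function name `runs_on_machine` is hypothetical, and the `+` prefix for an include list is assumed to be the counterpart of the `-` exclude prefix discussed here.

```shell
# Hypothetical sketch of the machine filter; not the actual rt.sh logic.
# $1: machine field of an rt.conf line, e.g. "- cheyenne.gnu", "+ hera.intel", or ""
# $2: current machine id, e.g. "cheyenne.gnu"
# Succeeds (returns 0) if the entry should run on this machine.
runs_on_machine() {
  rom_spec=$1
  rom_machine=$2
  rom_listed=1
  # Strip the leading +/- and do a whole-word match on the machine names.
  case " ${rom_spec#[-+]} " in
    *" ${rom_machine} "*) rom_listed=0 ;;
  esac
  case "${rom_spec}" in
    -*) [ "${rom_listed}" -ne 0 ] ;;  # exclude list: run unless listed
    +*) [ "${rom_listed}" -eq 0 ] ;;  # include list (assumed): run only if listed
    *)  : ;;                          # empty field: no restriction
  esac
}
```

With this sketch, `runs_on_machine "- cheyenne.gnu" "hera.gnu"` succeeds while `runs_on_machine "- cheyenne.gnu" "cheyenne.gnu"` fails, which is the behavior being asked for: the coupled entries stay active on hera.gnu and are skipped on cheyenne.gnu.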

@climbfuji
Collaborator Author

CI tests passed for 5816815, which means the coupled control restart run is fixed.

@junwang-noaa @pjpegion @lisa-bengtsson I tried to add the stochy and ca debug tests for the GNU compiler. Both tests crashed on Cheyenne with GNU 10.

control_stochy_debug:

0: in fcst,init total time:    29.863434842787683
90:At line 614 of file /glade/work/heinzell/fv3/ufs-weather-model/ufs-weather-model-emc-develop-20210721-thompson-noah-lsm-updates/gnu/stochastic_physics/get_stochy_pattern.F90
90:Fortran runtime error: Index '1' of dimension 1 of array 'trie_di' above upper bound of 0
90:
90:Error termination. Backtrace:

control_ca_debug:

75:
75:Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.
75:
75:Backtrace for this error:
75:#0  0x2b8423e74aff in ???
75:#1  0x21dee73 in __update_ca_MOD_update_cells_sgs
75:	at /glade/work/heinzell/fv3/ufs-weather-model/ufs-weather-model-emc-develop-20210721-thompson-noah-lsm-updates/gnu/stochastic_physics/update_ca.F90:249
75:#2  0x21d3ef7 in __cellular_automata_sgs_mod_MOD_cellular_automata_sgs
75:	at /glade/work/heinzell/fv3/ufs-weather-model/ufs-weather-model-emc-develop-20210721-thompson-noah-lsm-updates/gnu/stochastic_physics/cellular_automata_sgs.F90:294
75:#3  0x212e274 in __stochastic_physics_wrapper_mod_MOD_stochastic_physics_wrapper
75:	at /glade/work/heinzell/fv3/ufs-weather-model/ufs-weather-model-emc-develop-20210721-thompson-noah-lsm-updates/gnu/FV3/stochastic_physics/stochastic_physics_wrapper.F90:347
75:#4  0x20d190e in __atmos_model_mod_MOD_atmos_model_init
75:	at /glade/work/heinzell/fv3/ufs-weather-model/ufs-weather-model-emc-develop-20210721-thompson-noah-lsm-updates/gnu/FV3/atmos_model.F90:727
75:#5  0x1ee9ee8 in fcst_initialize
75:	at /glade/work/heinzell/fv3/ufs-weather-model/ufs-weather-model-emc-develop-20210721-thompson-noah-lsm-updates/gnu/FV3/module_fcst_grid_comp.F90:397
75:#6  0xd55de2 in _ZN5ESMCI6FTable12callVFuncPtrEPKcPNS_2VMEPi
75:	at /glade/p/ral/jntp/GMTB/tools/hpc-stack-v1.1.0/src/hpc-stack-gnu-10.1.0/pkg/ESMF_8_1_1/src/Superstructure/Component/src/ESMCI_FTable.C:2036

The tests didn't fail with Intel. I am going to comment them out again in rt_gnu.conf; hopefully you will be able to fix those in one of your next commits.
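
For anyone reproducing these two crashes, a small helper along the following lines can pull the relevant lines out of a run log. It is only a sketch: the function name `find_runtime_errors` is hypothetical, and the two patterns are simply the messages quoted in this comment, not an exhaustive list of failure signatures.

```shell
# Hypothetical helper: print (with line numbers) the lines of a run log that
# match the two failure signatures quoted in this comment.
# $1: path to the log file
# Succeeds (returns 0) if at least one matching line was found.
find_runtime_errors() {
  grep -n -E 'Fortran runtime error|Program received signal SIG[A-Z]+' "$1"
}
```

Because it passes through the exit status of `grep`, it can be used directly in a condition, for example `find_runtime_errors "${logfile}" && echo "runtime error found"`, where `logfile` is the path to the log being checked.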

@climbfuji
Collaborator Author

Regression testing complete.

@climbfuji climbfuji added the Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. label Jul 23, 2021
@climbfuji
Collaborator Author

Ready for commit whenever you are. If you want to run CI tests again, please let me know. Thanks!

@MinsukJi-NOAA
Contributor

> Ready for commit whenever you are. If you want to run CI tests again, please let me know. Thanks!

I don't think we need to run CI again.

@junwang-noaa junwang-noaa merged commit c413ccf into ufs-community:develop Jul 23, 2021
BinLiu-NOAA added a commit to hafs-community/HAFS that referenced this pull request Aug 13, 2021
Sync HAFS submodules with their corresponding authoritative branches:
- hafs_forecast.fd as of 08/05/2021
- hafs_gsi.fd as of 08/06/2021 plus the dual-resolution 3DEnVar bug fix
- hafs_post.fd as of 08/02/2021
- hafs_utils.fd as of 07/23/2021
- hafs_graphics.fd/hrd_gplot as of 08/10/2021
- hafs_graphics.fd/emc_graphics as of 08/10/2021
In addition, application-level changes were made to match the updated submodules.

This PR addresses issue #80.

Notes:
- The [bug of wrong Time dimension in FV3 restart sfc_data files](NOAA-EMC/fv3atm#344) has been fixed in ufs-weather-model through this ufs-weather-model [PR](ufs-community/ufs-weather-model#702). 
- The bug fix for the dual-resolution EnVar analysis in GSI (hafs_gsi.fd), contributed by OU collaborators, has also been included in this PR. With that, the HAFS ENSDA configurations can now work properly.
- As for the hafs_forecast.fd submodule, support/HAFS branch is identical to the ufs-weather-model develop branch as of 08/05/2021. More information can be found through this PR (ufs-community/ufs-weather-model#715)
epic-cicd-jenkins pushed a commit that referenced this pull request Apr 17, 2023
…hysics seed generation script (#704)

## DESCRIPTION OF CHANGES: 
Add missing user-defined options for tendency-based stochastic physics and fix the ensemble-based seed generation script to work regardless of whether stochastic physics is turned on or not.

## TESTS CONDUCTED: 
Tested on Hera using the following WE2E configurations with and without stochastic physics:

config.grid_RRFS_CONUS_3km_ics_HRRR_lbcs_RAP_suite_HRRR.sh
config.community_ensemble_2mems.sh

## ISSUE (optional): 
[Issue #702](ufs-community/regional_workflow#702)

## CONTRIBUTORS (optional): 
Thanks to @mkavulich and @chan-hoo for finding this problem.
epic-cicd-jenkins pushed a commit that referenced this pull request Apr 17, 2023
)

Updating S3 artifact bucket to point to new OAR account

Co-authored-by: ClimaBot <[email protected]>
Development

Successfully merging this pull request may close these issues.

App=S2S does not compile with Cheyenne.gnu; failure is not reported in regression test
7 participants