Support global-workflow using Rocky 8 on CSPs #2998

Open
wants to merge 25 commits into
base: develop

@weihuang-jedi weihuang-jedi commented Oct 10, 2024

Description

With ParallelWorks now defaulting to Rocky 8 on CSPs, and moving to Rocky 8 only after 1/1/2025,
we need to modify the global-workflow module files to use a spack-stack that supports Rocky 8,
and test compiling and running to make sure everything works under Rocky 8.

i) New features in the Rocky 8 update:

a. Waves worked in the C48_S2SWA_gefs case, so SUPPORT_WAVES is set to "YES" in awspw.yaml.
In fact, if SUPPORT_WAVES is not set to "YES", setup_expt.py raises an exception.

b. Two types of nodes (chips/queues) are used on AWS, compute and process: forecasts run in the "compute" queue,
which has big nodes (more cores); all other tasks run in the "process" queue, which has small nodes (fewer cores).
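
As a rough illustration of item b, the split between the two queues can be thought of as a simple lookup from task name to partition. This is only a sketch: the function name `pick_partition` and the set of forecast task names are assumptions made for the example, not taken from this PR.

```python
# Hypothetical sketch of the compute/process split described in item b.
# The task names below are assumed for illustration only.
FORECAST_TASKS = {"fcst"}  # assumed name(s) of forecast-type jobs


def pick_partition(task_name: str) -> str:
    """Return 'compute' for forecast tasks and 'process' for everything else."""
    return "compute" if task_name in FORECAST_TASKS else "process"


for task in ("fcst", "stage_ic", "arch"):
    print(f"{task:10s} -> {pick_partition(task)}")
```

Keeping the forecast on the larger nodes and everything else on the smaller ones is the design choice described above; the actual mapping lives in the workflow configuration, not in a helper like this.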

ii) The Rocky 8 update needs the following submodules at, or newer than, the commits below.

  1. gfs_utils:

commit 4848ecbb5e713b16127433e11f7d3edc6ac784c4 (HEAD, origin/develop, origin/HEAD, develop)
Author: Wei Huang [email protected]
Date: Fri Oct 18 10:41:25 2024 -0600

Make gfs-utils compile on CSPs with Rocky 8 (#81)

Support Rocky 8 on CSPs.
  2. ufs_utils:

commit 23237610845c3a4438b21b25e9b3dc25c4c15b73 (HEAD)
Author: Wei Huang [email protected]
Date: Wed Oct 9 11:55:13 2024 -0600

Support UFS_UTILS on CSPs under Rocky 8 (#989)

Fixes #982.
  3. upp:

commit 66a422db80ea129dd87285fe6e811d4b6e1fe29b (HEAD)
Author: Wei Huang [email protected]
Date: Wed Oct 2 14:38:22 2024 -0600

Make UPP works with Rocky 8 on CSPs (#1034)

* Make UPP works with Rocky 8 on CSPs

* Remove unneeded path

* simplify modulefile
  4. ufs_model:

commit 29c2703c715ebdb47bbd4bcc811db340eae530e5 (HEAD)
Author: Cameron Book [email protected]
Date: Tue Nov 12 13:08:12 2024 -0800

Add developmental test cases: idealized baroclinic wave and 2020 July CAPE cases + https://github.com/ufs-community/ufs-weather-model/pull/2459 (#2461)

* UFSWM - Add tests-dev ATM-only idealized dry baroclinic wave test and a 2020 July CAPE case
* UFSWM - Update modulefile to support Rocky 8 on CSPs, with ParallelWorks

---------

Co-authored-by: Wei Huang <[email protected]>
Co-authored-by: Jong Kim <[email protected]>
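
One way to check the "at or newer than" requirement for the submodules listed above is `git merge-base --is-ancestor`, which exits with status 0 when the first commit is reachable from the second. The sketch below wraps that call; the helper names and the idea of passing in a directory per submodule are assumptions for illustration, and the hashes are the ones quoted in this description.

```python
import subprocess
from typing import Callable, Dict

# Minimum commits quoted in the PR description.
REQUIRED: Dict[str, str] = {
    "gfs_utils": "4848ecbb5e713b16127433e11f7d3edc6ac784c4",
    "ufs_utils": "23237610845c3a4438b21b25e9b3dc25c4c15b73",
    "upp": "66a422db80ea129dd87285fe6e811d4b6e1fe29b",
    "ufs_model": "29c2703c715ebdb47bbd4bcc811db340eae530e5",
}


def is_at_or_newer(repo_dir: str, required_commit: str,
                   run: Callable = subprocess.run) -> bool:
    """True if required_commit equals HEAD or is an ancestor of HEAD in repo_dir."""
    result = run(
        ["git", "-C", repo_dir, "merge-base", "--is-ancestor", required_commit, "HEAD"],
        check=False,
    )
    return result.returncode == 0


def check_all(submodule_dirs: Dict[str, str],
              run: Callable = subprocess.run) -> Dict[str, bool]:
    """Map each submodule name to whether its checkout satisfies REQUIRED."""
    return {name: is_at_or_newer(path, REQUIRED[name], run)
            for name, path in submodule_dirs.items()}
```

Calling `check_all` with a dictionary from submodule name to its checkout directory (the paths depend on the local clone) returns a per-submodule pass/fail map. A non-zero exit status is treated as "not satisfied", which also covers the case where the commit is unknown to the local clone.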

Resolves #2997

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

How has this been tested?

  • Clone and build on CSPs
  • Forecast-only on AWS
  • GEFS test on AWS

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • I have made corresponding changes to the system documentation if necessary

env/AZUREPW.env Fixed
env/GOOGLEPW.env Fixed
@weihuang-jedi

The Slurm queue listing below shows tasks running on the two different partitions, compute and process:

         JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            92   compute c48gefs_ Wei.Huan  R       9:04      1 weihuang-weisfsemcaws-00056-1-0002
            84   compute c48gefs_ Wei.Huan  R      10:03      1 weihuang-weisfsemcaws-00056-1-0003
            85   compute c48gefs_ Wei.Huan  R      10:03      1 weihuang-weisfsemcaws-00056-1-0004
            12   compute c48atm_g Wei.Huan  R      23:45      1 weihuang-weisfsemcaws-00056-1-0001
           164   process c48gefs_ Wei.Huan CF       0:03      1 weihuang-weisfsemcaws-00056-2-0019
           165   process c48gefs_ Wei.Huan CF       0:03      1 weihuang-weisfsemcaws-00056-2-0020
           148   process c48gefs_ Wei.Huan CF       2:04      1 weihuang-weisfsemcaws-00056-2-0018
           158   process c48gefs_ Wei.Huan  R       0:03      1 weihuang-weisfsemcaws-00056-2-0030
           159   process c48gefs_ Wei.Huan  R       0:03      1 weihuang-weisfsemcaws-00056-2-0001
           160   process c48gefs_ Wei.Huan  R       0:03      1 weihuang-weisfsemcaws-00056-2-0002
           161   process c48gefs_ Wei.Huan  R       0:03      1 weihuang-weisfsemcaws-00056-2-0003
           162   process c48gefs_ Wei.Huan  R       0:03      1 weihuang-weisfsemcaws-00056-2-0004
           163   process c48gefs_ Wei.Huan  R       0:03      1 weihuang-weisfsemcaws-00056-2-0005
           155   process c48atm_g Wei.Huan  R       0:05      1 weihuang-weisfsemcaws-00056-2-0017
           156   process c48atm_g Wei.Huan  R       0:05      1 weihuang-weisfsemcaws-00056-2-0022
           157   process c48gefs_ Wei.Huan  R       0:05      1 weihuang-weisfsemcaws-00056-2-0029
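
To summarize a listing like the one above, the rows can be tallied by partition and job state. This is a minimal sketch that assumes the default column order shown in the header (JOBID, PARTITION, NAME, USER, ST, ...); the function name and the three-row sample are illustrative only.

```python
from collections import Counter

# A few rows copied from the listing above, used as sample input.
SAMPLE = """\
         JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
            92   compute c48gefs_ Wei.Huan  R       9:04      1 weihuang-weisfsemcaws-00056-1-0002
           164   process c48gefs_ Wei.Huan CF       0:03      1 weihuang-weisfsemcaws-00056-2-0019
           155   process c48atm_g Wei.Huan  R       0:05      1 weihuang-weisfsemcaws-00056-2-0017
"""


def count_by_partition(squeue_text: str) -> Counter:
    """Count jobs keyed by (partition, state), skipping the header row."""
    counts: Counter = Counter()
    for line in squeue_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 5:  # ignore blank or truncated lines
            counts[(fields[1], fields[4])] += 1
    return counts


for (partition, state), n in sorted(count_by_partition(SAMPLE).items()):
    print(f"{partition:8s} {state:2s} {n}")
```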


WalterKolczynski-NOAA commented Nov 25, 2024

This UFS model hash is before the one that introduced the issue with building wave pre/post executables (see #3110), so it should be safe to update.

@@ -27,5 +27,5 @@ MAKE_ACFTBUFR: 'NO'
 DO_TRACKER: 'NO'
 DO_GENESIS: 'NO'
 DO_METP: 'NO'
-SUPPORT_WAVES: 'NO'
 SUPPORTED_RESOLUTIONS: ['C48', 'C96'] # TODO: Test and support all cubed-sphere resolutions.
+SUPPORT_WAVES: 'YES'

Since WW3 will now be supported on all platforms, I think we can remove SUPPORT_WAVES from the host files and setup_expt.py.

Suggested change (remove this line):
SUPPORT_WAVES: 'YES'
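
If the key is dropped from every host file, the edit is mechanical enough to script. The sketch below is a line-based filter over YAML text; the function name is hypothetical, and it assumes the key sits on a single line of its own, as in the hunks shown in this review.

```python
# Sample host-file fragment, copied from the diff context in this review.
HOST_YAML = """\
DO_TRACKER: 'NO'
DO_GENESIS: 'NO'
DO_METP: 'NO'
SUPPORT_WAVES: 'YES'
"""


def drop_key(yaml_text: str, key: str = "SUPPORT_WAVES") -> str:
    """Return yaml_text without any line that sets the given top-level key."""
    kept = [line for line in yaml_text.splitlines()
            if not line.lstrip().startswith(f"{key}:")]
    return "\n".join(kept) + "\n"


print(drop_key(HOST_YAML), end="")
```

A plain text filter is used here rather than a YAML parser so that comments and ordering in the host files are left untouched.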

DO_TRACKER: 'NO'
DO_GENESIS: 'NO'
DO_METP: 'NO'
SUPPORT_WAVES: 'YES'

Suggested change (remove this line):
SUPPORT_WAVES: 'YES'

DO_TRACKER: 'NO'
DO_GENESIS: 'NO'
DO_METP: 'NO'
SUPPORT_WAVES: 'YES'

Suggested change (remove this line):
SUPPORT_WAVES: 'YES'

@DavidHuber-NOAA

This block

supp_waves = host.info.get('SUPPORT_WAVES', 'YES')
machine = host.machine
for attr in ['resdetatmos', 'resensatmos']:
    try:
        expt_res = f'C{getattr(inputs, attr)}'
    except AttributeError:
        continue
    if expt_res not in supp_res:
        raise NotImplementedError(f"Supported resolutions on {machine} are:\n{', '.join(supp_res)}")
    if "W" in inputs.app and supp_waves == "NO":
        raise NotImplementedError(f"Waves are not supported on {machine}")

Can be simplified to

 machine = host.machine 
 for attr in ['resdetatmos', 'resensatmos']: 
     try: 
         expt_res = f'C{getattr(inputs, attr)}' 
     except AttributeError: 
         continue 
     if expt_res not in supp_res: 
         raise NotImplementedError(f"Supported resolutions on {machine} are:\n{', '.join(supp_res)}") 
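
For anyone who wants to try the simplified check in isolation, here is a self-contained sketch. The wrapper function `check_resolutions`, the `SimpleNamespace` stand-ins for `inputs`, and the machine name `"AWSPW"` are assumptions for the example; the loop body is the simplified block quoted above.

```python
from types import SimpleNamespace
from typing import List


def check_resolutions(inputs, machine: str, supp_res: List[str]) -> None:
    """Raise NotImplementedError if any requested resolution is unsupported."""
    for attr in ['resdetatmos', 'resensatmos']:
        try:
            expt_res = f'C{getattr(inputs, attr)}'
        except AttributeError:
            continue  # this experiment type does not define the attribute
        if expt_res not in supp_res:
            raise NotImplementedError(
                f"Supported resolutions on {machine} are:\n{', '.join(supp_res)}")


supported = ['C48', 'C96']

# A wave-enabled app at a supported resolution passes: waves no longer gate setup.
check_resolutions(SimpleNamespace(resdetatmos=48, app="S2SWA"), "AWSPW", supported)

# An unsupported resolution is still rejected.
try:
    check_resolutions(SimpleNamespace(resdetatmos=384), "AWSPW", supported)
except NotImplementedError as err:
    print(f"rejected: {err}")
```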

Successfully merging this pull request may close these issues.

support global-workflow on CSPs with Rocky 8
3 participants