Bump Python to 3.11.7 #1217

Merged: climbfuji merged 18 commits into JCSDA:develop from feature/python_3p11p7 on Aug 24, 2024

Conversation

Collaborator

@climbfuji climbfuji commented Aug 5, 2024

Summary

Bump Python to 3.11.7.

Testing

Applications affected

All

Systems affected

All

Dependencies

Issue(s) addressed

Resolves #1038

Checklist

  • This PR addresses one issue/problem/enhancement, or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.

@climbfuji climbfuji self-assigned this Aug 5, 2024
@ashley314
Collaborator

When testing on S4, I noticed skylab referencing ecflow's python3.9: /data/prod/jedi/spack-stack/ecflow-5.8.4/lib/python3.9/site-packages/ecflow/__init__.py. Does that get upgraded as well? That said, it did not seem to make a difference when we used Python 3.10.

@climbfuji
Collaborator Author

Thanks for testing this @ashley314 . The ecflow server was built as an external package (because of Intel) with an external Python. This is not going to change unless it breaks with the update to Python 3.11.7 (I sincerely hope not).

This year, we'll all need to move away from the classic Intel compilers (at least the C/C++ compilers) and use the new LLVM-based compilers (icx, icpx, ...). For those, we don't need to build ecflow as an external package anymore - yay!
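
As a small illustration of the ecflow question above, the Python version that an externally built package was installed for can be read from its site-packages path and compared with the interpreter in use. This is only a sketch: the helper names `python_version_from_path` and `matches_interpreter` are hypothetical, and the path is the one quoted from S4.

```python
import re
import sys


def python_version_from_path(path):
    """Return the (major, minor) found in a '.../lib/pythonX.Y/...' path, or None."""
    match = re.search(r"/lib(?:64)?/python(\d+)\.(\d+)/", path)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matches_interpreter(path, running=None):
    """True if the path's Python version equals the running (or given) version."""
    if running is None:
        running = sys.version_info[:2]
    found = python_version_from_path(path)
    return found is not None and found == tuple(running)


# Path reported in the comment above (external ecflow build on S4).
ecflow_init = (
    "/data/prod/jedi/spack-stack/ecflow-5.8.4/"
    "lib/python3.9/site-packages/ecflow/__init__.py"
)

print(python_version_from_path(ecflow_init))               # (3, 9)
print(matches_interpreter(ecflow_init, running=(3, 11)))   # False
```

A mismatch like this does not by itself mean anything is broken; as noted above, the external ecflow build stays as it is unless it fails with the updated Python.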

@DavidHuber-NOAA
Collaborator

@climbfuji One issue I see for EMC_verif-global is that MET and METplus are at versions 11.1.1 and 5.1.0, respectively, while we need versions 9.1.3 and 3.1.1. I will try out a test installation with both pairs of MET and METplus. If successful, should I open a PR to your branch or open a new PR?

@climbfuji
Collaborator Author

A new PR please. It would be really nice if we could use one somewhat recent set of MET and METplus, though!

@srherbener
Collaborator

@ashley314 The ctest failures are very likely unrelated to this PR, since Python is used only for a tiny subset of all ctests in jedi-bundle. I am wondering if you are seeing the nlohmann-json problems that @srherbener fixed at the end of last week in oops.

I was running into the nlohmann-json problems earlier, but after the oops PR was merged it hasn't been an issue. At least on S4.

I've seen the same on an AWS instance that I'm testing on. After the oops PR got merged, no more issues with nlohmann-json packages.

@DavidHuber-NOAA
Collaborator

@climbfuji @ulmononian I receive the following errors when attempting to build the UFS from within the global-workflow (from /scratch1/NCEPDEV/global/David.Huber/GW/gw_python_3117/sorc/logs/build_ufs.log). It looks like MAPL is not syncing correctly with ESMF:

> ./tests/compile.sh hera '-DAPP=S2SWA -D32BIT=ON -DCCPP_SUITES=FV3_GFS_v17_p8_ugwpv1,FV3_GFS_v17_coupled_p8_ugwpv1,FV3_global_nest_v1 -DPDLIB=ON' 0 intel YES NO
...
-- Found PTSCOTCHparmetis: /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/scotch-7.0.4-c2fm4mk/lib/libptscotchparmetisv3.a
Building MOM6 standalone executable
-- Configuring done (52.8s)
CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:70 (set_target_properties):
  The link interface of target "MAPL_cfio_r4" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)

CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:122 (set_target_properties):
  The link interface of target "MAPL.field_utils" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:153 (set_target_properties):
  The link interface of target "MAPL.base" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:164 (set_target_properties):
  The link interface of target "MAPL" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:180 (set_target_properties):
  The link interface of target "MAPL.cap" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:188 (set_target_properties):
  The link interface of target "MAPL.history" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:196 (set_target_properties):
  The link interface of target "MAPL.orbit" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:204 (set_target_properties):
  The link interface of target "MAPL.ExtData" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:212 (set_target_properties):
  The link interface of target "MAPL.ExtData2G" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


CMake Error at /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/MAPL-targets.cmake:220 (set_target_properties):
  The link interface of target "MAPL.griddedio" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /scratch1/NCEPDEV/nems/Alexander.Richert/spack-stack-py311-aug24/envs/test/install/intel/2021.5.0/mapl-2.46.2-anh2zzb/lib64/cmake/MAPL/mapl-config.cmake:74 (include)
  GOCART/CMakeLists.txt:70 (find_package)


-- Generating done (1.3s)
CMake Generate step failed.  Build files cannot be regenerated correctly.
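
The log above repeats one error for every MAPL target. A short script can condense it into a map from each missing dependency to the targets that link against it. The following is a sketch only: the function name `missing_link_targets` is hypothetical, and the sample text is an abbreviated excerpt of the log (two of the errors, with the path lines omitted). It summarizes the log and does not address the cause.

```python
import re

# Header line that CMake prints for each affected target.
TARGET_RE = re.compile(r'The link interface of target "([^"]+)" contains:')


def missing_link_targets(log_text):
    """Map each missing dependency to the targets whose link interface names it."""
    result = {}
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        match = TARGET_RE.search(line)
        if match is None:
            continue
        # The dependency is the first non-blank line after the header.
        for follow in lines[index + 1:]:
            dependency = follow.strip()
            if dependency:
                result.setdefault(dependency, []).append(match.group(1))
                break
    return result


# Abbreviated excerpt of the build log above.
sample_log = """
  The link interface of target "MAPL_cfio_r4" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:
  The link interface of target "MAPL.base" contains:

    ESMF::ESMF

  but the target was not found.  Possible reasons include:
"""

summary = missing_link_targets(sample_log)
print(summary)  # {'ESMF::ESMF': ['MAPL_cfio_r4', 'MAPL.base']}
```

Run against the full build_ufs.log, this would show at a glance that every failure points at the same missing imported target, ESMF::ESMF.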

Here are the changes I made to the ufs_common.lua module file:

 local ufs_modules = {
   {["jasper"]          = "2.0.32"},
-  {["zlib"]            = "1.2.13"},
+  {["zlib-ng"]         = "2.1.6"},
   {["libpng"]          = "1.6.37"},
-  {["hdf5"]            = "1.14.0"},
+  {["hdf5"]            = "1.14.3"},
   {["netcdf-c"]        = "4.9.2"},
   {["netcdf-fortran"]  = "4.6.1"},
-  {["parallelio"]      = "2.5.10"},
-  {["esmf"]            = "8.6.0"},
+  {["parallelio"]      = "2.6.2"},
+  {["esmf"]            = "8.6.1"},
   {["fms"]             = "2023.04"},
   {["bacio"]           = "2.4.1"},
-  {["crtm"]            = "2.4.0"},
-  {["g2"]              = "3.4.5"},
+  {["crtm"]            = "2.4.0.1"},
+  {["g2"]              = "3.4.9"},
   {["g2tmpl"]          = "1.10.2"},
-  {["ip"]              = "4.3.0"},
+  {["ip"]              = "5.0.0"},
   {["sp"]              = "2.5.0"},
   {["w3emc"]           = "2.10.0"},
-  {["gftl-shared"]     = "1.6.1"},
-  {["mapl"]            = "2.40.3-esmf-8.6.0"},
+  {["gftl-shared"]     = "1.9.0"},
+  {["mapl"]            = "2.46.2-esmf-8.6.1"},
   {["scotch"]          = "7.0.4"},
 }
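
For reviewing a modulefile diff like the one above, the version changes can be extracted programmatically. This is a minimal sketch, assuming the entries keep the `{["name"] = "version"},` shape shown in ufs_common.lua. The function name `version_changes` is hypothetical, and the sample contains only a few lines from the diff.

```python
import re

# Matches added/removed entries such as:  -  {["hdf5"]  = "1.14.0"},
ENTRY_RE = re.compile(r'^([+-])\s*\{\["([^"]+)"\]\s*=\s*"([^"]+)"\},?$')


def version_changes(diff_text):
    """Return {module: (old_version, new_version)}; None marks a missing side."""
    removed, added = {}, {}
    for line in diff_text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if match is None:
            continue  # context line or unrelated text
        sign, name, version = match.groups()
        (removed if sign == "-" else added)[name] = version
    names = sorted(set(removed) | set(added))
    return {name: (removed.get(name), added.get(name)) for name in names}


# A few lines taken from the diff above.
sample_diff = """
   {["jasper"]          = "2.0.32"},
-  {["zlib"]            = "1.2.13"},
+  {["zlib-ng"]         = "2.1.6"},
-  {["hdf5"]            = "1.14.0"},
+  {["hdf5"]            = "1.14.3"},
-  {["mapl"]            = "2.40.3-esmf-8.6.0"},
+  {["mapl"]            = "2.46.2-esmf-8.6.1"},
"""

changes = version_changes(sample_diff)
for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}")
```

A module that is replaced rather than bumped, such as zlib giving way to zlib-ng, shows up as two entries with one side set to `None`.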

@RatkoVasic-NOAA
Collaborator

@DavidHuber-NOAA
I managed to compile ufs-wm with your ufs_hera.intel.lua and ufs_common.lua
(/scratch2/NCEPDEV/fv3-cam/Ratko.Vasic/brisi/xxx-ufs-weather-model/modulefiles)
Maybe your WM version is old (Fri Jul 19 19:27:11); I used the current development branch (Wed Aug 21 08:23:16).
Here is the compile directory:
/scratch1/NCEPDEV/stmp2/Ratko.Vasic/FV3_RT/rt_1531463/compile_atm_dyn32_intel

@climbfuji
Collaborator Author

There is a veeeeery long thread in the ufs-weather-model repo regarding this problem. Fixing this requires mapl 2.46.3 instead of 2.46.2 (we already made this update in spack-stack develop), and also an update to GOCART (outside of spack-stack). Maybe some more code changes in the ufs-weather-model itself. See ufs-community/ufs-weather-model#2399 as an entry point.

@DavidHuber-NOAA
Collaborator

Great, thanks for the heads up @RatkoVasic-NOAA @climbfuji.

@RatkoVasic-NOAA
Collaborator

@DavidHuber-NOAA @climbfuji @ulmononian
I just ran the WM with the new Python, using the current development branch of UFS_WM.
This is the conf file:

COMPILE | atm_dyn32 | intel | -DAPP=ATM -DCCPP_SUITES=FV3_GFS_v16,FV3_GFS_v16_flake,FV3_GFS_v17_p8,FV3_GFS_v17_p8_rrtmgp,FV3_GFS_v15_thompson_mynn_lam3km,FV3_WoFS_v0,FV3_GFS_v17_p8_mynn,FV3_GFS_v17_p8_ugwpv1 -D32BIT=ON | | fv3 |
RUN | control_c48                                       |                                      | baseline |
RUN | control_c192                                      | - noaacloud                          | baseline |
RUN | control_p8                                        | - noaacloud                          | baseline |
RUN | regional_control                                  |                                      | baseline |

It ran and passed the ATM tests.
One file in one test is not bit-identical, but it is the 00-hour file and the 06-hour file is OK, so it can be replaced:

 Comparing PRSLEV.GrbF00 .....USING CMP......OK
 Comparing PRSLEV.GrbF06 .....USING CMP......OK
 Comparing NATLEV.GrbF00 .....USING CMP......NOT IDENTICAL
 Comparing NATLEV.GrbF06 .....USING CMP......OK

I used Dave's ufs_hera.intel.lua and ufs_common.lua from Hera: /scratch1/NCEPDEV/global/David.Huber/GW/gw_python_3117/sorc/ufs_model.fd/modulefiles
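
The comparison lines quoted above can be scanned automatically to pick out the files that are not bit-identical. This is a sketch, assuming the `Comparing <file> .....USING CMP......<status>` layout shown in this comment. The function name `comparison_status` is hypothetical.

```python
import re

# One line per compared file, e.g.
#   Comparing NATLEV.GrbF00 .....USING CMP......NOT IDENTICAL
LINE_RE = re.compile(r"Comparing\s+(\S+)\s+\.+USING CMP\.+(.*)$")


def comparison_status(log_text):
    """Return {file_name: status} for every 'Comparing ...' line in the text."""
    status = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line.strip())
        if match is not None:
            status[match.group(1)] = match.group(2).strip()
    return status


# Lines copied from the regression-test output above.
sample = """
 Comparing PRSLEV.GrbF00 .....USING CMP......OK
 Comparing PRSLEV.GrbF06 .....USING CMP......OK
 Comparing NATLEV.GrbF00 .....USING CMP......NOT IDENTICAL
 Comparing NATLEV.GrbF06 .....USING CMP......OK
"""

status = comparison_status(sample)
differing = [name for name, result in status.items() if result != "OK"]
print(differing)  # ['NATLEV.GrbF00']
```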

@DavidHuber-NOAA
Collaborator

All global-workflow submodules build with the exception of gfs-utils. There appears to be a duplicate skgb.f module in g2 v3.4.9 and w3emc v2.10.0. I'm guessing the CMakeLists.txt file will need to be updated for gfs-utils. Opened issue NOAA-EMC/gfs-utils#76 to track.

@DavidHuber-NOAA
Collaborator

Easy fix. Just had to update the CMakeLists files for two executables.

@DavidHuber-NOAA
Collaborator

All submodules of the global-workflow built successfully 🎉

@ulmononian
Collaborator

this is great news. thanks @RatkoVasic-NOAA. just curious, where are your test dirs so i can take a look?

@DavidHuber-NOAA
Collaborator

All GSI regression tests passed except hafs_4denvar_glbens, which failed due to runtime (runtime of 332s vs 276s expected). This is a common and non-fatal failure for this test. All tests were reproducible vs spack-stack 1.6.0.
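
To put the runtime failure in perspective, the overrun relative to the expected time can be computed directly from the two numbers quoted above. This is only an arithmetic sketch: how the GSI regression suite itself applies its runtime threshold is not shown in this thread, and the function name `relative_overrun` is hypothetical.

```python
def relative_overrun(actual_seconds, expected_seconds):
    """Fraction by which the actual runtime exceeds the expected runtime."""
    return (actual_seconds - expected_seconds) / expected_seconds


# Runtimes reported for hafs_4denvar_glbens above.
overrun = relative_overrun(332.0, 276.0)
print(f"{overrun:.1%}")  # 20.3%
```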

@RatkoVasic-NOAA
Collaborator

@ulmononian :
script directory:
/scratch2/NCEPDEV/fv3-cam/Ratko.Vasic/brisi/xxx-ufs-weather-model/tests
run directory:
/scratch1/NCEPDEV/stmp2/Ratko.Vasic/FV3_RT/rt_3148604
log directory:
/scratch2/NCEPDEV/fv3-cam/Ratko.Vasic/brisi/xxx-ufs-weather-model/tests/logs/log_hera

Collaborator

@RatkoVasic-NOAA RatkoVasic-NOAA left a comment

Tested:
#1217 (comment)
Approved

@rickgrubin-noaa
Collaborator

Not a reviewer, so cannot approve the PR; able to duplicate @RatkoVasic-NOAA's results.

@climbfuji
Collaborator Author

@rickgrubin-noaa Invited you as collaborator for both spack-stack and spack.

@DavidHuber-NOAA
Collaborator

@malloryprow indicated that the version of matplotlib is not compatible with verif-global and could not generate a plot with it. Has this version of matplotlib been tested by others?

@climbfuji
Collaborator Author

Not by us (NRL). I suppose it's the same situation as for JCSDA (and NRL's downstream plotting tools). Once the update is in, we need to go and fix our old plotting scripts?

Collaborator

@eap eap left a comment

Thanks for this, I'll patch this into the docker PR and rebuild.

@climbfuji climbfuji enabled auto-merge (squash) August 23, 2024 21:15
@climbfuji climbfuji merged commit ac43a4c into JCSDA:develop Aug 24, 2024
8 checks passed
@climbfuji climbfuji deleted the feature/python_3p11p7 branch August 26, 2024 02:57
@DavidHuber-NOAA
Collaborator

@climbfuji I'm not sure. According to her post (NOAA-EMC/EMC_verif-global#118 (comment)), she could not create a simple example plot (https://matplotlib.org/stable/gallery/lines_bars_and_markers/simple_plot.html). Hopefully we can work through the issue during the release candidate period.

DavidHuber-NOAA added a commit to DavidHuber-NOAA/spack-stack that referenced this pull request Aug 26, 2024
* jcsda/develop:
  Bump Python to 3.11.7 (JCSDA#1217)
  Add a clause in the cleanup to fix directory permissions (JCSDA#1273)
  Bug fix: configure neptune-env variants in three templates: neptune-dev, skylab-dev, unified-dev (JCSDA#1268)
  Update site configs for Atlantis, Narwhal, Nautilus (JCSDA#1266)
  Configuration for [email protected] and [email protected] (JCSDA#1240)
@climbfuji
Collaborator Author

I am pretty sure that the matplotlib developers tested their release before they put it out there. Either it's a problem with the version combinations of the different packages we are using, or with the developer's code. Either way we'll need to figure this out during the release candidate period, as you said.
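
When narrowing down a problem with version combinations like the one described above, a first step is to record which versions are actually installed in the environment. This is a minimal sketch using only the standard library. The function name `installed_versions` is hypothetical, and the package list is an assumption chosen for illustration (numpy is a matplotlib dependency).

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Illustrative package list; adjust to the stack under investigation.
report = installed_versions(["matplotlib", "numpy"])
for name, version in report.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

Attaching such a report to an issue makes it easier to tell a package-combination problem from a problem in the plotting code itself.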

Development

Successfully merging this pull request may close these issues.

[INSTALL]: Use Python 3.11 for unified-environment
8 participants