Failing WW3 Restart File Read in WW3 #97

Closed
ezhilsabareesh8 opened this issue Feb 9, 2024 · 4 comments · Fixed by #160
Labels: priority:med, ww3 (Related to WW3)

Comments

@ezhilsabareesh8
Contributor

ezhilsabareesh8 commented Feb 9, 2024

The MOM6-CICE6-WW3 IAF and RYF configurations run well for the initial run with a cold start. However, on attempting to restart the simulation, an MPI error occurs here due to a failing MPI_waitall call.

ezhilsabareesh8 self-assigned this on Feb 9, 2024
ezhilsabareesh8 added the ww3 (Related to WW3) label on Feb 9, 2024
ezhilsabareesh8 changed the title from "Failing WW3 Restart File Read in MOM6-CICE6-WW3 IAF Configuration" to "Failing WW3 Restart File Read in MOM6-CICE6-WW3" on Feb 9, 2024
ezhilsabareesh8 changed the title from "Failing WW3 Restart File Read in MOM6-CICE6-WW3" to "Failing WW3 Restart File Read in WW3" on Feb 9, 2024
@anton-seaice
Contributor

I noticed some of the processes were failing here and exiting with exit code 5. I wonder whether changing the switches means it is now looking for ice thickness data in the restart, which it wasn't looking for previously?

    ! 1.e Ice thickness interval
    !
    IF ( FLIC1 ) THEN
      IF ( TIC1(1) .GE. 0 ) THEN
        DTI10   = DSEC21 ( TIC1 , TI1 )
      ELSE
        DTI10   = 1.
      END IF
#ifdef W3_T
      WRITE (NDST,9015) DTI10
#endif
      IF ( DTI10 .LT. 0. ) THEN
        IF ( IAPROC .EQ. NAPERR ) WRITE (NDSE,1005)
        CALL EXTCDE ( 5 )
      END IF
    ELSE
      DTI10   = 0.
    END IF
    !

I guess there is something weird going on with TIC1 or TI1
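
To make the failure condition concrete, here is a minimal Python sketch of the check in the snippet above. It assumes that DSEC21(TIC1, TI1) returns the number of seconds from TIC1 to TI1 and that both are (YYYYMMDD, HHMMSS) integer pairs. The names dsec21 and ice_thickness_interval are hypothetical stand-ins rather than WW3 code, and a ValueError stands in for CALL EXTCDE ( 5 ).

```python
from datetime import datetime


def dsec21(time1, time2):
    """Seconds from time1 to time2; each is a (YYYYMMDD, HHMMSS) integer pair.

    Assumed to mirror what DSEC21 computes in WW3.
    """
    def to_datetime(t):
        return datetime.strptime(f"{t[0]:08d}{t[1]:06d}", "%Y%m%d%H%M%S")

    return (to_datetime(time2) - to_datetime(time1)).total_seconds()


def ice_thickness_interval(flic1, tic1, ti1):
    """Python rendering of the '1.e Ice thickness interval' block."""
    if not flic1:
        # Ice thickness input is switched off: no interval to track.
        return 0.0
    # A negative date in TIC1 marks "no field read yet", so use a dummy interval.
    dti10 = dsec21(tic1, ti1) if tic1[0] >= 0 else 1.0
    if dti10 < 0.0:
        # Stand-in for WRITE (NDSE,1005) followed by CALL EXTCDE ( 5 ).
        raise ValueError("NEW IC1 FIELD BEFORE OLD IC1 FIELD")
    return dti10
```

With TIC1 = (19000101, 0) and TI1 = (19000101, 10000), the interval is 3600.0 seconds. Swapping the two gives a negative interval and raises the error, which is the case that exit code 5 reports.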

Image              PC                Routine            Line        Source             
access-om3-MOM6-C  00000000049615EB  Unknown               Unknown  Unknown
libpthread-2.28.s  00001479C5AA7CF0  Unknown               Unknown  Unknown
libpthread-2.28.s  00001479C5AA345A  pthread_cond_wait     Unknown  Unknown
libopen-pal.so.40  00001479BFBDC40D  PMIx_Abort            Unknown  Unknown
libopen-pal.so.40  00001479BFC49250  pmix3x_abort          Unknown  Unknown
libopen-rte.so.40  00001479C099EE27  Unknown               Unknown  Unknown
libopen-rte.so.40  00001479C09B6BB6  orte_errmgr_base_     Unknown  Unknown
libopen-rte.so.40  00001479C09A695B  Unknown               Unknown  Unknown
libmpi.so.40.30.4  00001479C655181A  ompi_mpi_abort        Unknown  Unknown
libmpi_mpifh.so    00001479C68842DE  Unknown               Unknown  Unknown
access-om3-MOM6-C  0000000004529637  w3servmd_mp_extcd         865  w3servmd.F90
access-om3-MOM6-C  0000000004548F9C  w3wavemd_mp_w3wav         874  w3wavemd.F90
access-om3-MOM6-C  00000000043AECE1  wav_comp_nuopc_mp        1140  wav_comp_nuopc.F90
access-om3-MOM6-C  000000000200211F  _ZNK5ESMCI13Metho         377  ESMCI_MethodTable.C
access-om3-MOM6-C  0000000002002098  _ZN5ESMCI11Method         563  ESMCI_MethodTable.C
access-om3-MOM6-C  0000000002000B1B  c_esmc_methodtabl         317  ESMCI_MethodTable.C

@ezhilsabareesh8
Contributor Author

Thanks for pointing out the error, @anton-seaice. It seems there is a discrepancy between the time stamps TIC1 and TI1, which triggers the statement IF ( IAPROC .EQ. NAPERR ) WRITE (NDSE,1005) and produces the message "*** WAVEWATCH III ERROR IN W3WAVE : NEW IC1 FIELD BEFORE OLD IC1 FIELD".

On further investigation, it appears that the restart files lack IC1 (ice thickness) and IC5 (floe diameter), which may be causing this issue. I tried to fix this by configuring the extra fields to be written to the restart files using type%restart%extra = 'IC1 IC5' in the ww3_shel.nml file, but the error persists.

Additionally, diagnostics revealed that the ice thickness interval is invalid, as shown by the output:

TEST W3WAVE : DT IC1  =************
TEST W3WAVE : DT IC1  =************

In contrast, the ice concentration interval, which is read correctly from the restart file, is TEST W3WAVE : DT ICE = 3600.0.
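
A row of asterisks is what Fortran prints when a number does not fit the width of its edit descriptor, so DT IC1 is probably very large in magnitude rather than missing. Below is a minimal Python sketch of that behaviour. It assumes the test output format is something like F12.1, which is a guess from the 12 asterisks and has not been checked against the WW3 source. fortran_f is a hypothetical helper.

```python
def fortran_f(value, width=12, decimals=1):
    """Mimic Fortran Fw.d output: right-justify, or fill with '*' on overflow."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        # The value needs more characters than the field allows.
        return "*" * width
    return text.rjust(width)


print(repr(fortran_f(3600.0)))  # fits within the 12-character field
print(repr(fortran_f(1.0e20)))  # overflows, so the field is filled with asterisks
```

A sensible interval such as 3600.0 fits comfortably in the field, while an uninitialised or wildly wrong time difference overflows it.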

I'm currently investigating this further.

@aekiss
Contributor

aekiss commented May 1, 2024

@ezhilsabareesh8
Contributor Author

Thanks @aekiss and @mvertens. This fix has resolved the failing restart read in WW3 when wave/ice coupling is enabled. I have created a patch for w3iorsmd.f90 and will create a PR in access-om3.

4 participants