Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add restart option when the SLM is coupled to MALI #99

Merged

Conversation

hollyhan
Copy link

@hollyhan hollyhan commented Jan 15, 2024

Previously, when the regional sea-level prediction capability was added to MALI (#21), the restart config option for the sea-level model was not added. This led the sea-level model to get initialized to Timestep zero when coupled MALI-SLM simulations are being restarted, forgetting about the ice loading changes and associated viscoelastic solid earth deformation that happened in the timesteps prior to current model time. This PR fixes the problem by allowing the sea-level model to resume where it was left off. Note in parallel to this PR, the version of the SLM needs to incorporate the changes made in the following accompanying PR (MALI-Dev/1DSeaLevelModel_FWTW#9)

Testing results (on Chicoma)
Test directory: /lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512

This test utilizes the compass/landice/slm/circicesheet test group (hollyhan:hollyhan/compass/landice/slm_circ_icesheet) in setting a coupled MALI-SLM simulation on MALI 20km rectangular grid with cylindrical ice sheet retreating 2km radius per year and SLM SH512 resolution.

The following config options can be used to setup the circular ice sheet test case on chicoma:

# Config options for slm circ_icesheet test case
[circ_icesheet]

# The size of the domain in meters in the x and y directions for the planar mesh
lx = 6000000.0
ly = 6000000.0

# list of MALI mesh resolutions in kilometers delimited by ',' without space
# i.e. The distance between adjacent cell centers.
mali_res = 20

# Ice shape type ('cylinder', and for dome, 'dome-halfar' or 'dome-cism')
# r0 and h0: initial radius and height of the cylinder ice in meters
ice_type = cylinder
r0 = 2000000.0
h0 = 3000.0

# 'True' if manually want to set bedTopography elevation
# bed topography elevation in meters (provide a positive/negative float value
# for grounded above land/marine-based ice)
set_topo_elev = True
topo_elev = -1500.0

# Whether to center the circular ice in the center of the cell that is closest to the
# center of the domain
put_origin_on_a_cell = False

# Config options for surface mass balance forcing
[smb_forcing]

# Start, end and interval years for SMB forcing
start_year = 2015
end_year = 2116
dt_year = 1

# Direction in which SMB is applied ('horizontal' or 'vertical')
# and amount of ice to melt
direction = horizontal
# Change in radius in meters; used when 'direction == horizontal'
drdt = -2000.0
# Change in height in meters;sed when 'direction == vertical'
dhdt = -20.0

# config options for the sea-level model
[slm]
# True if MALI-SLM are coupled
coupling = True

# List of the number of Gauss-Legendre points in latitude
# delimited by ',' without space
slm_nglv = 512

# Max spherical harmonics degree and order
# i.e. SLM resolution of the SLM
slm_res = 512

# mapping method between the MALI and SLM grids
mapping_method_mali_to_slm = conserve
mapping_method_slm_to_mali = bilinear

# ratio of MALI-SLM coupling interval and MALI output interval in integer years
time_stride = 1

# config options related to visualization
[circ_icesheet_viz]

# Area (m^2) of the global ocean for calculating ice sheet contribution to
# sea level. Only used when MALI-SLM coupling is False
Aocn_const = 4.5007E+14

# Area (m^2) of the global ocean excluding the marine-based ice region
# only used when MALI-SLM coupling is False
AocnBeta_const = 4.50007E+14

# whether to save image files
save_images = True

# whether to hide figures (typically when save_images = True)
hide_figs = True

Version of SLM used: [SHA 286336ddd1ea884f9c397729511853e9f6fd276a added on top of the head of origin/main (SHA 42478dd3c3eefaeeb71d45dc7c8db7f42795519b)]
Temporal resolution of the SMB forcing applied to MALI: 1 yr
Coupling interval between MALI and the SLM: 5 yrs
Simulation start/end year: 2015/2045
Coupled simulation length: 40 yrs
Total number of SLM step in full simulation: 8 (40 yrs divided by 5 yrs)

Test simulations setup
**Simulation 1: Cold-start simulations
Simulation 1-1:Benchmark run where MALI runs from a cold start
Version of MALI used: [SHA 859158a]
(/lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512/run_model_restartPR_coldStart/folder_maliHead_3079c717b3/OUTPUT_SLM)
Plot showing the changes in gravitational potential over 40 simulation years:
image

Simulation 1-2: Cold start with the head of this PR [SHA 35bed8a]
(/lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512/run_model_restartPR_coldStart/folder_maliHead_35bed8a9e6/OUTPUT_SLM)
Difference in deltaG field (change in gravity potential over 40 years): Calculated by subtracting the result from the benchmark simulation (Simulation 1-1) from the result from this simulation (Sim. 1-2).
image

Simulation 2: Coupled simulation restarts from simulation year 30 (/lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512/run_model_restartPR_reStart)

    Simulation 2-1. The namelist option `config_slm_timestep_restart` and the MALI restart year are CORRECTLY set such that the SLM timestep at the restart year match the MALI restart year (note that this creates a constraint on MALI restart year to the years that SLM is called):
                    -	MALI restart year = 2045
                    -	Config_slm_timestep_restart = 6 (SLM restart year = 2015 (start year) +  5 yr(couping interval) * 6 (timestep) = 2045)
       Changes in the gravitational equipotential field between 2015 and 2045 (file `G8.nc` in `/lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512/run_model_restartPR_reStart/OUTPUT_SLM_slmRestartYear2045_maliRestartYear2045/G8.nc`):

image
Difference in the fields above to the same field calculated by the benchmark simulation (Simulation #1-1) shows zero difference:
image
(The difference file is located here:/lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512/run_model_restartPR_reStart/OUTPUT_SLM_slmRestartYear2045_maliRestartYear2045/diff_G8_to_originDevelop.nc

    Simulation 2-2. The namelist option `config_slm_timestep_restart` and the MALI restart year are INCORRECTLY set to reflect the same year. 
                    -	MALI restart year = 2045
                    -	config_slm_timestep_restart = 5 (SLM restart year = 2015 (start year) +  5 yr(couping interval) * 5 (timestep) = 2040)
                   
      In this case, an error is raised for the inconsistency in restart year in MALI and the SLM. 
      `log.landice.0000.err` file shows, 
----------------------------------------------------------------------
Beginning MPAS-landice Error Log File for task       0 of     128
    Opened at 2024/01/13 17:18:19
----------------------------------------------------------------------

ERROR: Restart year from SLM restart timestep and MALI restart time are inconsistent
CRITICAL ERROR: An error has occurred in li_core_init. Aborting...
Logging complete.  Closing file at 2024/01/13 17:18:19

and log.landice.0000.out file dies with the following message:

 This is a restart: read stream 'restart'.
 Looking for recurring input streams (forcing) that should be forced to be read at the initial time.
   * Forced an initial read of input stream 'smb' from time: 2045-01-01_00:00:00
 Finished processing recurring input streams at initial time.
 Using none velocity solver.
ERROR: Restart year from SLM restart timestep and MALI restart time are inconsistent
 Problem: Restart year based on `config_slm_timestep_restart` is 2040, but MALI restart year is set to 2045.
CRITICAL ERROR: An error has occurred in li_core_init. Aborting...

 -----------------------------------------
 Total log messages printed:
    Output messages =                  324
    Warning messages =                   0
    Error messages =                     1
    Critical error messages =            1
 -----------------------------------------
 Logging complete.  Closing file at 2024/01/13 17:18:19

The log files can be found here: /lustre/scratch4/turquoise/hollyhan/testPR_restart_coupled_simulations/landice/slm/circular_icesheet_test/mali20km_slm512/run_model_restartPR_reStart/output_logfiles_slmRestartYear2040_maliRestartYear2045

@hollyhan hollyhan mentioned this pull request Jan 15, 2024
@hollyhan hollyhan added bug Something isn't working enhancement New feature or request labels Jan 15, 2024
Copy link

@matthewhoffman matthewhoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hollyhan , thanks for adding the restart capability for MALI when running with the SLM and for documenting your testing so thoroughly! I haven't done my own testing yet, but I have a couple questions before I do, so I'm submitting an initial review for now.

! find the right slmTimeStep
call mpas_pool_get_config(liConfigs,'config_slm_coupling_interval', config_slm_coupling_interval)
read(config_slm_coupling_interval(1:4),*) slm_coupling_interval
call mpas_pool_get_config(liConfigs,'config_slm_timestep_restart', config_slm_timestep_restart)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is the user required to manually set config_slm_timestep_restart before doing a restart? What purpose does this serve if the code is able to calculate slmTimeStep_restart on its own? I would prefer to not require an additional namelist change to do a restart with the SLM if it's not necessary. I think it's fair to assume that a user hasn't has not messed around with any output files before a restart.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matthewhoffman
The purpose is to make sure the restart time of MALI (taken either from config_start_time or restart_timestamp is setup correctly) is set correctly such that the restart year is a correct multiple of the year from config_slm_coupling_interval. Note that slmTimeStep_restart is calculated using currYear, which comes from MALI. This wouldn't be needed if I'd used the mpas alarm functionality and make it automatic in figuring out what the proper restart year should be for the SLM, which could be done in the next PR at some point.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still confused. If you are checking to make sure these are identical, and then using the value if they are identical, can't we just use the internal calculation and not require the user to set this?

We could instead add a check that the slm coupling interval matches the restart interval. That is the requirement you are worried about, if I understand this correctly.

@hollyhan
Copy link
Author

hollyhan commented Feb 1, 2024

Thanks for the initial review, @matthewhoffman!
As you'll see, I've replied to your comments and also added a new test result showing that this PR works on a cold start case as well. Hope those help!

@matthewhoffman matthewhoffman self-assigned this Feb 20, 2024
This PR updates handling of timesteps between MALI and the SLM in a few ways:
* switch config_slm_coupling_interval to be an integer in years because we only allow
  integer year values
* On init, check that config_adaptive_timestep_force_interval divides evenly
  into config_slm_coupling_interval
* On init, check that restart interval is an even multiple of config_slm_coupling_interval
* On a restart, calculate which SLM time level to use based on the elapsed time from
  the start of the original simulation and config_slm_coupling_interval and make sure
  these divide cleanly
Also add missing =>next pointer assignment to keep code from hanging
Trying to cast intervals into dateTimeStrings did not work.
@matthewhoffman
Copy link

@hollyhan , I've pushed a handful of commits that revamp the logic for checking if the coupling interval is valid and for calculating the SLM time level on a restart without needing the user to do anything. I've tested that when I make things inconsistent it throws the appropriate errors, but it would be great if you could do your more thorough cold start and restart tests. I did confirm that I could get a restart running without error, and I think it is using the correct time level, but I did not compare the restart run results against a full cold start.

@hollyhan
Copy link
Author

@matthewhoffman, thanks for making this PR complete! It makes it a lot easier for the users this way to not have to assign a restart timestep number for the SLM. I've tested the new changes on both cold start and a restart against the full restart using the origin/develop head.

Zero difference in the gravitational field change (G) after 8 SLM timesteps (so over 40 simulation years, with 5 year coupling interval), between the cold start runs from this PR and origin/develop.
image

Restart test: restart year 2040 (simulation start year 2015)
Again, zero difference in the gravitational field change by the end of the simulation.
image

Note that I also checked other fields such as predicted topography, also yield no difference compared to the cold start run.

@@ -121,7 +121,7 @@ subroutine li_bedtopo_init(domain, err)

call mpas_pool_get_config(liConfigs, 'config_uplift_method', config_uplift_method)
if (trim(config_uplift_method)=='sealevelmodel') then
! initialize the 1D sea-level model
! initialize the 1D sea-level model if fresh start

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this change makes sense, because this call is happening whether this is a restart or cold start.

Suggested change
! initialize the 1D sea-level model if fresh start
! initialize the 1D sea-level model

@@ -1197,7 +1198,7 @@ subroutine li_simulation_clock_init(core_clock, configs, ierr)

! Set up the coupling time interval if MALI is coupled to sea-level model
if (trim(config_uplift_method) == "sealevelmodel") then
call mpas_set_timeInterval(slm_coupling_interval, timeString=config_slm_coupling_interval, ierr=err_tmp)
call mpas_set_timeInterval(slm_coupling_interval, YY=config_slm_coupling_interval, ierr=err_tmp)
ierr = ior(ierr,err_tmp)
call mpas_add_clock_alarm(core_clock, 'slmCouplingInterval', alarmTime=startTime, &

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we replace startTime here with the mpas_time_type variable corresponding to the variable simulationStartTime, I think we would be able to do restarts in the middle of coupling intervals. Note startTime here is not simulationStartTime, despite the similar name. simulationStartTime is the start of the original simulation (i.e. the original cold start on a restart), whereas startTime here is the start of this model execution (i.e. the restart time on a restart).

This commit changes how restarts are handled when the SLM is active to
allow MALI to be restarted at any arbitrary restart interval and have
the SLM restart correctly.

This is done by changing the SLM coupling alarm to be based off of the
original simulationStartTime (instead of the start time of the current
execution).  This required moving the creation of the coupling alarm to
later in initialization so that the variable simulationStartTime is
available.  With this change, it was also necessary to change the way
the SLM time level is calculated on a restart to take the floor of the
elapsed time divided by the coupling interval, rather than requiring
that there be no remainder.  This adds a little fragility because there
is no way to double check that is the correct SLM time level, but if
this is set up correctly, it should be handled properly.  Finally, as
part of these changes, I also removed the check on init that the
coupling interval divides evenly into the restart interval, because
that's no longer a requirement.  That's sort of too bad, because it was
a lot of work to figure out how to make that check!  But it's nicer to
not have that restriction.
This doesn't serve any internal purpose, but it could help a user detect an error
in their configuration.
err = ior(err, err_tmp)
if (err == 0) then
call mpas_log_write("Calling the SLM. SLM timestep $i", intArgs=(/slmTimeStep/))
call slmodel_solve(slmTimeStep, domain)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hollyhan , do you know if this call to slmodel_solve on init on a restart is necessary? The time level on init is a time level that was already fully run on the previous run, so I don't think we need to run the SLM again. It wouldn't hurt, but I think it can be skipped. Unless you are aware of a reason it is necessary.

To get restarts to work when using restart intervals shorter than the coupling interval, this call would need to be deleted or only made if the restart time happened to coincide with the coupling interval.

It's not needed, and if the restart time is not a coupling interval, it
will make the SLM get out of sync.
@matthewhoffman
Copy link

matthewhoffman commented Feb 28, 2024

Testing

I ran a circular ice-sheet test that @hollyhan set up with and without restarts. It has restart output at 1 yr intervals and a SLM coupling interval of 5 yr. Simulation starts in 2015 and ends in 2055. Three experiments were done:

  1. full simulation
  2. restart in 2030 (even coupling interval)
  3. restart in 2023 (mid-coupling interval)

This configuration has a prescribed ice thickness time series, but bedTopography is updated from the SLM, so the VAF globalStat will be affected by the SLM solutions. This plot show the final few annual time levels of all three runs. Marker styles were chosen to ensure the values for all three runs could be seen. All three runs have the same results, confirming that restarts are working correctly. I also confirmed with ncdump that the values all match to machine precision.
image

Model runs are on Chicoma at /lustre/scratch4/turquoise/mhoffman/pr99-slm-restart-testing.

@hollyhan
Copy link
Author

@matthewhoffman , thanks for the new test result! I've also confirmed the changes work for me as well.

@matthewhoffman
Copy link

compass full_integration suite passes

@matthewhoffman matthewhoffman merged commit f79661c into MALI-Dev:develop Mar 1, 2024
hollyhan added a commit to hollyhan/compass that referenced this pull request Apr 15, 2024
Update the namelist file to reflect the changes made in the recent PR (MALI-Dev/E3SM#99)
Also change the simulation end year to 2031 as required by ISMIP6-2300
hollyhan added a commit to hollyhan/compass that referenced this pull request Apr 15, 2024
Update the namelist file to reflect the changes made in the recent PR (MALI-Dev/E3SM#99)
Also change the simulation end year to 2031 as required by ISMIP6-2300
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants