[develop] Make get_obs tasks day-dependent in workflow; other improvements and bug fixes #1137
…the tar file where the prepbufr files live changed"
…y Michelle Harrold, solution by Michael Kavulich.
…ntStat tasks' METplus log files.
…ing cycles for CCPA and MRMS but not yet for NDAS or NOHRSC.
…thout performing unnecessary repeated pulls.
… they're per-cycle or per-day.
…nup and comments.
…files from HPSS (and works with multiple cycles).
…e cleanup is happening.
…les, that are expected to be created once the task is finished actually get created. This is needed because it is possible that for some forecast hours for which there is overlap between cycles, the files are being retrieved and processed by the get_obs_... task for another cycle.
…nd EnsembleStat tasks such that GenEnsProd does not depend on the completion of get_obs_... tasks (because it doesn't need observations) but only forecast output while EnsembleStat does.
…d due to changes to dependencies of GenEnsProd tasks in previous commit(s).
…tending to time out for 48-hour forecasts.
…sure PcpCombine operates only on those hours unique to the cycle, i.e. for those times starting from the initial time of the cycle to just before the initial time of the next cycle. For the PcpCombine_obs task for the last cycle, allow it to operate on all hours of that cycle's forecast. This ensures that the PcpCombine tasks for the various cycles do not clobber each other's output. Accordingly, change the dependencies of downstream tasks that depend on PcpCombine obs output to make sure they include all PcpCombine_obs tasks that cover the forecast period of that downstream task's cycle.
…ossibly also get_obs_ndas by putting in sleep commands.
Co-authored-by: Gillian Petro <[email protected]>
Co-authored-by: Gillian Petro <[email protected]>
…asks' into feature/daily_obs_tasks
@gsketefian I noticed that the tech doc test is failing. You can follow the instructions in the slides here or in the documentation to update. I'm hoping it will be pretty straightforward now that I've put together the instructions, but let me know if you need help.
The verification WE2E tests have successfully passed on Hera Intel:
as well as the verification tests on Hera GNU:
…for each new module was added in a previous commit).
… of new python functions; add type for each argument and return value; use latex-style math formatting for equations; other minor formatting adjustments.
@gspetro-NOAA I fixed up as much of the docs as I could, but I ran into one issue that I don't have a good fix for. The problem is that the scripts
Then I thought to define
I decided to stop there and see if you have any ideas. I'll tag @mkavulich too since he's the original author of the use of
@MichaelLueken FYI that I made that change of the wallclock time in
@gsketefian and @gspetro-NOAA - Unfortunately, it looks like the modification to
Additionally, the technical documentation for the above three scripts in The modification made to In the new python scripts, I see:
Since there is now a dependency on It looks like
@MichaelLueken @gsketefian Is there a way to put the try/except statement under
get executed. This causes an error precisely because the
Your linter might complain about an Alternatively, with
Thanks, @gspetro-NOAA and @maddenp-noaa, for options to try to get the technical documentation to work! I think the main issue here is that it is impossible to set As @gspetro-NOAA noted, adding In
This is how I was able to get around the issue with
…ile for sphinx (conf.py) (this will give sphinx access to METplus); remove definition of METPLUS_ROOT as an environment variable from the Makefile for the docs and instead define it in conf.py (as just a null string since it isn't actually used to load METplus).
@MichaelLueken @gspetro I followed your suggestion of adding
Thank you for addressing the documentation issues! The Test Docs GHA test is passing now, and the technical documentation for the new ush/ python scripts is being generated in RtD!
Since the new verification WE2E tests are successfully running on Hera GNU and Hera Intel, I will approve these changes and start the automated testing.
Co-authored-by: Gillian Petro <[email protected]>
The rerun of the Jenkins tests on Derecho has completed successfully:
The rest of the tests successfully passed on Friday evening. Since the last conversations have been resolved, I will now move forward with merging this PR.
@MichaelLueken Thanks for shepherding this through!
DESCRIPTION OF CHANGES:
This PR fixes multiple bugs in the verification (vx) and other parts of the SRW App, the main one being that the get_obs tasks as well as some of the vx pre-processing tasks currently do not work for an experiment with multiple cycles if those cycles overlap in time (bug discovered by @michelleharrold and @willmayfield). Fixes and changes made by this PR are described in more detail below.
Changes related to get_obs tasks:
Make get_obs tasks in the ROCOTO workflow obs-day-based as opposed to cycle-based. Thus, for each day for which obs are needed for vx (and for each obs type that is needed for vx), there is now a get_obs workflow task.
Move the functionality in the ex-script exregional_get_verif_obs.sh to the new python script get_obs.py. The new exregional_get_verif_obs.sh is now a very short script that just calls get_obs.py.
The new get_obs.py script, along with changes to setup.py to calculate the times at which various types of obs need to be retrieved, ensures that no clobbering of retrieved obs files occurs (this currently does occur if cycles overlap).
In config_defaults.yaml, introduce new variables specifying the obs availability interval for each of the four obs types (CCPA, NOHRSC, MRMS, NDAS) that might be retrieved. These variables are [CCPA|NOHRSC|MRMS|NDAS]_OBS_AVAIL_INTVL_HRS.
setup.py now checks multiple consistency constraints and requirements on the temporal vx parameters in the SRW configuration file (e.g. the accumulation periods, the obs availability intervals, the forecast output interval) whose violation would otherwise cause errors in the workflow. (setup.py calls functions in the (renamed) script set_cycle_and_obs_timeinfo.py to run these checks.) If such inconsistencies exist, the parameters are either adjusted to fix them or, if that is not possible, the experiment generation process is stopped.
In config_defaults.yaml, introduce flags that determine whether or not to delete the raw obs directories and files that the get_obs tasks create after the raw obs have been copied/moved/renamed to their final/processed locations. These new flags are REMOVE_RAW_OBS_[CCPA|NOHRSC|MRMS|NDAS].
In config_defaults.yaml, move the base directories for the obs, i.e. [CCPA|NOHRSC|MRMS|NDAS]_OBS_DIR, from the platform section to the verification section so that they are near the METplus obs file name templates (OBS_...FN_TEMPLATE) for which they serve as base directories.
The processed/final files that the get_obs tasks create are now located and named as specified by the combination of the obs base directory (e.g. CCPA_OBS_DIR) and the obs file name template (e.g. CCPA_APCP_FN_TEMPLATE). Currently, the processed/final files that the get_obs tasks first look for are, say for CCPA, {CCPA_OBS_DIR}/{CCPA_APCP_FN_TEMPLATE}, but if these files don't exist and the obs need to be retrieved, the retrieved and processed file names are not necessarily given by this template. With this PR, the raw files are renamed and moved after retrieval to ensure they are located at {..._OBS_DIR}/{..._FN_TEMPLATE}.
Retrieve only 6-hourly NOHRSC snow accumulation obs, not 24-hourly accumulations. Currently, 24-hour accumulated obs are also retrieved (although there doesn't seem to be a WE2E test for it).
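To illustrate why obs-day-based get_obs tasks avoid repeated pulls when cycles overlap, here is a minimal sketch. It is not the code in get_obs.py or setup.py: the function names, the task-name format, and the simplification that obs are needed at every whole hour that is a multiple of the availability interval are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def obs_days_for_cycles(cycle_starts, fcst_len_hrs, obs_avail_intvl_hrs):
    """Return the sorted, de-duplicated days on which obs are needed.

    Hypothetical helper. Assumes cycles start on a whole hour and that an
    obs file exists at every hour-of-day that is a multiple of
    obs_avail_intvl_hrs.
    """
    days = set()
    for start in cycle_starts:
        # Walk every forecast lead hour of this cycle, endpoints included.
        for lead in range(fcst_len_hrs + 1):
            valid = start + timedelta(hours=lead)
            if valid.hour % obs_avail_intvl_hrs == 0:
                days.add(valid.date())
    return sorted(days)


def get_obs_task_names(obs_type, obs_days):
    """Build one illustrative task name per obs day (name format is assumed)."""
    return [f"get_obs_{obs_type.lower()}_{day:%Y%m%d}" for day in obs_days]


if __name__ == "__main__":
    # Two 48-hour forecasts whose cycles overlap in time.
    cycles = [datetime(2024, 5, 22, 0), datetime(2024, 5, 22, 12)]
    days = obs_days_for_cycles(cycles, fcst_len_hrs=48, obs_avail_intvl_hrs=1)
    for name in get_obs_task_names("NDAS", days):
        print(name)
```

Because the days are collected in a set, the two overlapping cycles share one task per day instead of each cycle retrieving (and possibly overwriting) the same files.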
Modify the configuration file parm/data_locations.yml for retrieve_data.py to extract all files in an archive at a time (i.e. per call to retrieve_data.py) instead of extracting only one obs file out of an archive for each call to retrieve_data.py. This speeds up the data retrieval significantly since a large portion of the get_obs tasks' wallclock time is spent establishing a connection to HPSS.
Modify parm/data_locations.yml to account for the change in prepbufr (NDAS) obs file names on May 22, 2024. This is currently causing get_obs_ndas tasks to fail for cycles at or after this date. (Bug found by @michelleharrold.)
Fix vx task dependencies to work with new obs-day-based get_obs tasks. Now, all get_obs tasks (i.e. for all obs days) for a given obs type must be complete before any vx tasks for that obs type can launch. This doesn't cause any significant delay because the get_obs tasks run in parallel and get at most one day's worth of obs.
Changes related to vx pre-processing tasks (PcpCombine_obs and Pb2nc_obs):
Add PcpCombine_obs tasks for both 6-hour and 24-hour accumulations of NOHRSC obs. The one for 6-hour accumulation simply converts the grib2 obs files to NetCDF, while the one for 24-hour accumulation adds the 6-hour grib2 obs to obtain a NetCDF file for 24-hour obs accumulations.
Place all output from PcpCombine_obs tasks (both for CCPA and NOHRSC) under the cycle directories, just as is done for the analogous PcpCombine_fcst tasks for forecasts. This is because accumulations, even for obs, depend on the start time of the cycle, e.g. 6-hour CCPA accumulations needed to verify a set of forecasts that start at 00Z will be different than 6-hour CCPA accumulations needed to verify a set of forecasts that start at 03Z. (Currently, the output files from these tasks are placed in the metprd directory under the main experiment directory without consideration for the start times of the accumulations.)
Make the Pb2nc_obs task for NDAS obs-day-dependent (unlike the PcpCombine_obs tasks, which are cycle-dependent). This can be done because unlike accumulations, the result of the Pb2nc_obs task does not depend on the starting time of the cycles; it only depends on a given valid time. Also, keep the output of the Pb2nc_obs task in the cycle-independent directory metprd directly under the main experiment directory.
Small, self-contained bug fixes and improvements:
Move evaluation of METplus time strings out of what used to be set_vx_fhr_list.sh (now renamed to set_leadhrs.sh) and into a new bash script (bash_utils/eval_METplus_timestr_tmpl.sh) to make it easier to change this functionality to python later on.
Allow WE2E test names to include dots since dots (like underscores) are handy to use as separators in the test name.
Add the two new SRW config parameters VX_CONFIG_[DET|ENS]_FN in config_defaults.yaml that specify the yaml configuration files to use for deterministic and ensemble verification. The default values for these are the files vx_config_[det|ens]_fn.yaml in parm/metplus. These parameters allow a user to specify other user-created yaml files in this directory to use for the vx configuration so that the default files, which are under version control, do not have to be changed.
Change some metatask and task names for clarity and consistency.
Add an option to mrms_pull_topofhour.py to not assume that there is a valid-date subdirectory under the specified source directory and to not add such a subdirectory under the specified output directory when generating output. This is handy when calling this script from the new get_obs.py script.
Fix bug in parm/wflow/verify_det.yaml so that all tasks have a cycledefs statement by default. This bug was causing GridStat workflow tasks for CCPA and NOHRSC obs to be created for cycles not defined for the workflow (these extraneous cycles probably correspond to the default set of cycles that a task gets assigned by ROCOTO when it does not contain an explicit cycledefs statement). (Bug found by @michelleharrold, solution by @mkavulich.)
Fix bug in scripts/exregional_run_met_gridstat_or_pointstat.sh by appending a string for the cycle date ("_YYYYMMDDHH") to the name of the METplus log file for deterministic GridStat and PointStat tasks. Without this string, the METplus log file for a given cycle's GridStat task was overwritten by those for other cycles. (Bug found by @michelleharrold.)
Fix bug in parm/default_workflow.yaml in "cycled_from_second" section in which the starting YYYYMMDDHH value of the cycledef can contain an HH value that is larger than 23. This currently happens because this HH is obtained directly from INCR_CYCL_FREQ without checking whether that value is less than 24.
Fix bug in launch_FV3LAM_wflow.sh by changing double quotes to single quotes to prevent failure in the interpretation of the command by cron.
New WE2E tests added:
The following new WE2E tests were added to the verification subdirectory under test_configs to test various aspects of the new code:
get_obs_hpss.do_vx_det.multicyc.cycintvl_07hr_inits_vary_fcstlen_09hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_11hr_inits_vary_fcstlen_03hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_00z_fcstlen_03hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_12z_fcstlen_03hr.nssl-mpas
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_12z_fcstlen_48hr.nssl-mpas
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_21z_fcstlen_03hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_96hr_inits_12z_fcstlen_48hr.nssl-mpas
get_obs_hpss.do_vx_det.singlecyc.init_00z_fcstlen_36hr.winter_wx.SRW
The purpose of each of these is described in the description section of the corresponding test config file.
Type of change
TESTS CONDUCTED:
Three sets of WE2E tests were conducted on Hera/intel:
The "fundamental" suite consisting of:
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0
The existing verification tests, consisting of:
MET_ensemble_verification_only_vx
MET_ensemble_verification_only_vx_time_lag
MET_ensemble_verification_winter_wx
MET_verification
MET_verification_only_vx
MET_verification_winter_wx
The newly added get_obs/verification tests, consisting of:
get_obs_hpss.do_vx_det.multicyc.cycintvl_07hr_inits_vary_fcstlen_09hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_11hr_inits_vary_fcstlen_03hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_00z_fcstlen_03hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_12z_fcstlen_03hr.nssl-mpas
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_12z_fcstlen_48hr.nssl-mpas
get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_21z_fcstlen_03hr.ncep-hrrr
get_obs_hpss.do_vx_det.multicyc.cycintvl_96hr_inits_12z_fcstlen_48hr.nssl-mpas
get_obs_hpss.do_vx_det.singlecyc.init_00z_fcstlen_36hr.winter_wx.SRW
All tests were successful.
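As a rough guide to which of the new tests exercise overlapping cycles, the sketch below parses the dotted test names. It rests on an assumption that the PR text does not state explicitly: that cycintvl_NNhr is the cycle interval and fcstlen_NNhr is the forecast length, both in hours. The helper names are hypothetical, and "overlap" here means strictly fcstlen > cycintvl.

```python
import re

NEW_TESTS = [
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_07hr_inits_vary_fcstlen_09hr.ncep-hrrr",
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_11hr_inits_vary_fcstlen_03hr.ncep-hrrr",
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_00z_fcstlen_03hr.ncep-hrrr",
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_12z_fcstlen_03hr.nssl-mpas",
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_12z_fcstlen_48hr.nssl-mpas",
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_24hr_inits_21z_fcstlen_03hr.ncep-hrrr",
    "get_obs_hpss.do_vx_det.multicyc.cycintvl_96hr_inits_12z_fcstlen_48hr.nssl-mpas",
    "get_obs_hpss.do_vx_det.singlecyc.init_00z_fcstlen_36hr.winter_wx.SRW",
]


def parse_hours(test_name):
    """Return (cycle_interval_hrs, fcst_len_hrs); a missing token gives None."""
    intvl = re.search(r"cycintvl_(\d+)hr", test_name)
    fcst = re.search(r"fcstlen_(\d+)hr", test_name)
    return (
        int(intvl.group(1)) if intvl else None,
        int(fcst.group(1)) if fcst else None,
    )


def cycles_overlap(test_name):
    """True if a forecast runs past the start of the next cycle (assumed meaning)."""
    intvl, fcst = parse_hours(test_name)
    return intvl is not None and fcst is not None and fcst > intvl


if __name__ == "__main__":
    for name in NEW_TESTS:
        print(f"{cycles_overlap(name)!s:5}  {name}")
```

The test config files themselves remain the authoritative description of what each test covers.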
DOCUMENTATION:
I am not familiar with the new RST file setup and will need help moving my documentation from comments in the code to the RST files.
CHECKLIST
Possibly; I'm not sure what exactly the documentation requirements are currently.
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):
@michelleharrold @mkavulich @JeffBeck-NOAA @willmayfield