[develop] Updates to devclean.sh script and plotting scripts and tasks #1100
I've noted a small grammatical issue in one of the devclean.sh comments.
Please make sure to remove the changes to devclean.sh, parm/wflow/plot.yaml, scripts/exregional_plot_allvars.py, and scripts/exregional_plot_allvars_diff.py from PR #1089.
You can use the following command to back out the changes to these files in the rrfs_ics_lbcs branch:
git checkout 28cbbc8 devclean.sh parm/wflow/plot.yaml scripts/exregional_plot_allvars.py scripts/exregional_plot_allvars_diff.py
This command will check out the version of the four files at the 28cbbc8 hash of your branch, the hash before you began applying changes related to the PR.
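As a rough sketch of the back-out described above, the checkout can be wrapped with a check that the working tree really matches the target commit for those paths. The function name `backout_files` is hypothetical; only standard `git checkout` and `git diff --quiet` are used, and the `--` separator is added to keep paths from being read as revisions.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the SRW App.
# Restore the given files to their state at a commit, then confirm that
# the working tree matches that commit for those paths.
backout_files() {
  local commit="$1"
  shift
  # Overwrite the listed paths (index and working tree) with the commit's version.
  git checkout "$commit" -- "$@" || return 1
  # Exit status 0 means no difference between the commit and the working tree.
  git diff --quiet "$commit" -- "$@"
}
```

For the case in this thread, the call would presumably be `backout_files 28cbbc8 devclean.sh parm/wflow/plot.yaml scripts/exregional_plot_allvars.py scripts/exregional_plot_allvars_diff.py`, followed by a commit of the restored files.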
Co-authored-by: Michael Lueken <[email protected]>
* Adds logic to handle GCP's default conda env, which conflicts with the SRW App's conda env. Fixes a Parallel Works naming convention bug in the script.
* It also addresses a known issue with a Ruby warning on PW instances that prevents run_WE2E_tests.py from exiting gracefully. The solution we use in our bootstrap for /contrib doesn't seem to work for the /lustre directory, which is why the warning is hardcoded into the monitor_jobs.py script.
* The new spack-stack build on Azure is missing a gnu library, so added the path to this missing library to the proper run scripts and cleaned up the wflow noaacloud lua file.
* Removed log and error files from the qsub wrapper script so that qsub can generate these files with the job id in the file name. Also, fixed a typo in the wrapper script.
As part of the data governance initiative, all S3 buckets will need some sort of versioning control. To meet these needs, the AWS S3 bucket was reorganized, with the develop data stored under a 'develop-date' folder and the verification sample case and the document case (current_release_data) moved under a new folder called 'experiment-user-cases'.
---------
Co-authored-by: Michael Lueken <[email protected]>
…s tested in PULL_REQUEST_TEMPLATE (ufs-community#1096)
* Update ufs-weather-model hash to b5a1976 (July 30)
* Add hera.gnu, remove cheyenne.intel, cheyenne.gnu, and gaeac5.intel, and alphabetize the machines in the TESTS CONDUCTED section of the PULL_REQUEST_TEMPLATE
* Correct behavior of Jenkins Functional WorkflowTaskTests. Currently, TASK_DEPTH is set to null, resulting in no tests being run during the Functional WorkflowTaskTests stage. Replaced env with params in Jenkinsfile for setting TASK_DEPTH. Testing shows that this will correctly set TASK_DEPTH to the default value of 9 and allow the tests to run
* Removed extraneous entries from the verification scripts to remove KeyError messages in the associated verification log files
* Reapplied necessary modification to modulefiles/tasks/noaacloud/plot_allvars.local.lua to allow plotting tasks to run on NOAA cloud platforms
- Format, remove old comment
- Only remove conda if the location in conda_loc is the same as the default conda install
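A minimal sketch of the conda check described above, assuming conda_loc is a plain-text file that records the conda install path and that the default install directory is supplied by the caller. The function name and both arguments are hypothetical and are not taken from devclean.sh.

```bash
#!/usr/bin/env bash
# Hypothetical guard, not the actual devclean.sh logic.
# Succeeds only when the path recorded in the conda_loc file equals the
# default conda install directory, so a user-supplied conda is left alone.
should_remove_conda() {
  local conda_loc_file="$1" default_conda_dir="$2" recorded
  # No conda_loc file means there is nothing to compare against: do not remove.
  [ -f "$conda_loc_file" ] || return 1
  recorded="$(cat "$conda_loc_file")"
  [ "$recorded" = "$default_conda_dir" ]
}
```

A caller could then write `should_remove_conda ./conda_loc "$PWD/conda" && rm -rf -- "$PWD/conda"`, where both paths are again only illustrative.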
- Default case (no flags provided) now does nothing except print usage and exit
- Include "--build" flag to remove build artifacts
- Include short aliases for all the removal options
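The flag handling described above could look roughly like the following. This is a sketch under stated assumptions, not the real devclean.sh: the script name `clean_sketch.sh`, the functions `usage` and `parse_args`, the variable `REMOVE_BUILD`, and the short alias `-b` are all hypothetical; only the `--build` flag and the "no flags prints usage" behavior come from the commit message.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the option parsing; not the actual devclean.sh.

usage() {
  cat <<'EOF'
Usage: clean_sketch.sh [-b|--build] [-h|--help]
  -b, --build   remove build artifacts
  -h, --help    show this message
EOF
}

# Sets REMOVE_BUILD=true when a build cleanup is requested.
# Returns non-zero when the caller should stop without removing anything
# (no flags given, help requested, or an unknown flag).
parse_args() {
  REMOVE_BUILD=false
  if [ "$#" -eq 0 ]; then
    usage
    return 1
  fi
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -b|--build) REMOVE_BUILD=true ;;
      -h|--help)  usage; return 1 ;;
      *) echo "Unknown option: $1" >&2; usage >&2; return 1 ;;
    esac
    shift
  done
  return 0
}
```

A script built on this would call `parse_args "$@" || exit 0` before any removal step, so that running it with no arguments can never delete anything.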
Updates and simplifications of the devclean.sh script
Safety and simplification updates for devclean.sh
The pull request by @mkavulich (natalie-perlin#12) has been merged, and all the questions have been resolved.
While running the fundamental WE2E tests on Hera, the grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot test failed. In PR #1098, the default LOAD_MODULES_RUN_TASK_FP was changed to LOAD_MODULES_RUN_TASK. When this change was applied to parm/wflow/plot.yaml in your scripts_plots_updates branch, the test successfully ran. Please make this modification in your parm/wflow/plot.yaml file.
Additionally, please convert one of the currently existing ensemble WE2E tests to exercise the plotting capability that is being added in this PR. Adding a new feature without adding the ability to test it is bad practice.
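One way to locate leftover uses of the old name mentioned above is a plain grep for the longer identifier. Because LOAD_MODULES_RUN_TASK is a prefix of LOAD_MODULES_RUN_TASK_FP, searching for the `_FP` form only matches stale references. The function name `find_stale_refs` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the SRW App.
# Print, with line numbers, each line of a file that still uses the old
# variable name. Exit status follows grep: 0 if stale references were
# found, 1 if none were found, 2 if the file could not be read.
find_stale_refs() {
  local file="$1"
  grep -n 'LOAD_MODULES_RUN_TASK_FP' "$file"
}
```

Running `find_stale_refs parm/wflow/plot.yaml` from the top of the repository would list any lines that still need the rename.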
Co-authored-by: Michael Lueken <[email protected]>
…NAM_lbcs_NAM_suite_GFS_v16 test
@mkavulich @MichaelLueken - It is, however, stuck in the queue and has been shown as "PENDING" for a few hours:
[Natalie.Perlin@Hera:/scratch2/NCEPDEV/stmp1/Natalie.Perlin/SRW/ufs-srweather-app]$ squeue -u Natalie.Perlin
Not sure if there is an issue with running jobs on Hera. I could try using a different platform that may have a lighter compute load and may complete the test faster... let me know! Attached is a snapshot of the built documentation showing the update made for the devclean.sh script.
@MichaelLueken - the test grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16 with plots for each ensemble has completed successfully on Hera, in /scratch2/NCEPDEV/stmp1/Natalie.Perlin/SRW/expt_dirs/grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16
Thanks for making the requested changes! I have successfully run the fundamental WE2E tests as well:
----------------------------------------------------------------------------------------------------
Experiment name | Status | Core hours used
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta_2 COMPLETE 11.70
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2_20240 COMPLETE 8.55
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot COMPLETE 26.05
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR_2024081 COMPLETE 46.78
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0_20240819195 COMPLETE 27.05
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16_2024081919530 COMPLETE 48.16
----------------------------------------------------------------------------------------------------
Total COMPLETE 168.29
and the plots for the ensemble test successfully completed as well:
plot_allvars_mem001_f000_202105121200 SUCCEEDED 205.0 1.37
plot_allvars_mem001_f001_202105121200 SUCCEEDED 204.0 1.36
plot_allvars_mem001_f002_202105121200 SUCCEEDED 204.0 1.36
plot_allvars_mem001_f003_202105121200 SUCCEEDED 205.0 1.37
plot_allvars_mem001_f004_202105121200 SUCCEEDED 204.0 1.36
plot_allvars_mem001_f005_202105121200 SUCCEEDED 205.0 1.37
plot_allvars_mem001_f006_202105121200 SUCCEEDED 205.0 1.37
plot_allvars_mem002_f000_202105121200 SUCCEEDED 205.0 1.37
plot_allvars_mem002_f001_202105121200 SUCCEEDED 204.0 1.36
plot_allvars_mem002_f002_202105121200 SUCCEEDED 204.0 1.36
plot_allvars_mem002_f003_202105121200 SUCCEEDED 204.0 1.36
plot_allvars_mem002_f004_202105121200 SUCCEEDED 206.0 1.37
plot_allvars_mem002_f005_202105121200 SUCCEEDED 205.0 1.37
plot_allvars_mem002_f006_202105121200 SUCCEEDED 205.0 1.37
It would be a good idea to modify the description for the config.grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16.yaml configuration to include that it also tests the ability to produce graphics for each ensemble member.
Since the fundamental WE2E tests are all running successfully (with plotting added to the ensemble test in the fundamental suite), I will re-approve this work now.
@MichaelLueken - where is the description of the test config.grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16.yaml?
@natalie-perlin - If you go directly to
It would be a good idea to include that either plots are produced or graphics are created for each ensemble member in the description section above.
…NAM_suite_GFS_v16.yaml
@MichaelLueken @mkavulich - updated the test description to the following:
@natalie-perlin Thanks for the productive discussion and back-and-forth. I'm glad we were able to reach a compromise.
The Jenkins tests successfully passed on all machines, but the test stage was ultimately terminated for running past the 8-hour limit on Jet. The WE2E coverage tests were manually run on Jet and all tests successfully passed:
Moving forward with merging this PR now.
DESCRIPTION OF CHANGES:
The ./devclean.sh script that cleans SRW builds is updated; all cleaning tasks are done for directories under the main SRW tree
Documentation updated for the devclean.sh script changes
Plotting scripts updated to have geographical data visible over the colored fields
Plotting task updated to allow graphics output for individual ensemble members
Use python3 to check out external sub-modules in the checkout_externals script; python3 is the default for other scripts, and some systems such as macOS no longer come with python2
Type of change
TESTS CONDUCTED:
Tested various combinations of cleanup tasks on macOS
DEPENDENCIES:
natalie-perlin#12 (Merged)
DOCUMENTATION:
ISSUE:
devclean.sh script update addresses the issue in #1073 (devclean.sh demonstrates very unsafe directory removal behavior)
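To illustrate the kind of guard that addresses unsafe directory removal, here is a minimal sketch that refuses to delete anything that does not resolve to a location strictly inside a given tree. The function name `safe_remove_dir` and its two arguments are hypothetical and do not reflect the actual devclean.sh implementation.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the actual devclean.sh code.
# Remove a directory only if its resolved path lies strictly inside the
# resolved tree root; refuse the root itself and anything outside it.
safe_remove_dir() {
  local root target
  # Resolve both paths physically (symlinks followed) so that ".." or
  # symlink tricks cannot point the removal outside the tree.
  root="$(CDPATH= cd -- "$1" 2>/dev/null && pwd -P)" || return 1
  target="$(CDPATH= cd -- "$2" 2>/dev/null && pwd -P)" || return 1
  case "$target" in
    "$root"/*) rm -rf -- "$target" ;;
    *) echo "Refusing to remove '$2': not inside '$root'" >&2; return 1 ;;
  esac
}
```

With the SRW root held in a variable, a call such as `safe_remove_dir "$SRW_DIR" "$SRW_DIR/build"` would succeed, while `safe_remove_dir "$SRW_DIR" "$SRW_DIR/.."` would be refused; the variable name `SRW_DIR` is an assumption here.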
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):
@mkavulich