CADS code seg faults in debug mode #775
Comments
@RussTreadon-NOAA Thanks a lot for all your efforts and help in sorting out and resolving those problems.
The GSI in debug mode would become idle after write-out as
When the job was killed on reaching the time limit, the error message would show:
The following changes made to
Obviously the write statements are for information only. If the above changes are an acceptable way to handle the error condition, the write statements should be removed.
The changes proposed work fine. I am proposing a different solution if no comments are planned to be added. The current if statement:
could be added to the existing if statement a few lines down and the "imager_info(3,1) >= zero" could be removed.
I ran the ctests on hera. I believe I get similar results. In debug mode, it gets past the read statements for both CrIS and IASI. In regular mode, I get one failure.

Test project /scratch1/NCEPDEV/jcsda/Jim.Jung/scrub/ctests_russ/update/build

The hafs_4denvar_glbens ran 0.6 seconds too slow.

This error is concerning. Zero is a legitimate value for a cluster size. In the entries I looked at, the values should be set to the BUFR equivalent of missing. The radiance values should also be missing, not zero. I will be running more tests to determine in which data sets the zeros appear and whether they come from NESDIS operations or a dbnet provider. This error should be fixed upstream.
Thank you @wx20jjung for this change. It is much cleaner than what I proposed. I am testing your change on WCOSS.
WCOSS2 test
Recompile in debug mode. Only
A zero radiance is being passed to The WCOSS2 failure indicates that we still need to check the radiance being passed into |
Add checks to ensure non-zero radiance is passed to |
I've made several changes to both read_cris.f90 and read_iasi.f90 to wrap if (a positive radiance) then ...

@RussTreadon-NOAA would you test these changes on wcoss before I progress further? The two files are in the same place you pulled them before on hera: /scratch1/NCEPDEV/jcsda/Jim.Jung/scrub/ctests_russ/update/src/gsi.

My ctests on hera were successful.

Test project /scratch1/NCEPDEV/jcsda/Jim.Jung/scrub/ctests_russ/update/build
100% tests passed, 0 tests failed out of 6
Total Test time (real) = 2392.47 sec
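For readers following along, the "only convert a positive radiance" idea can be sketched outside of GSI. The Python below is an illustration, not the code in read_cris.f90 or read_iasi.f90 and not the CRTM routine. The function names, the units, and the choice to return None for an unusable radiance are all assumptions made for this example.

```python
import math

# Radiation constants in wavenumber form. Assumed units for this sketch:
# radiance in mW/(m^2 sr cm^-1), wavenumber in cm^-1, temperature in K.
C1 = 1.191042e-5  # mW/(m^2 sr cm^-4)
C2 = 1.4387752    # K cm


def planck_radiance(wavenumber, temperature):
    """Forward Planck function, used here only to build a test value."""
    return C1 * wavenumber**3 / math.expm1(C2 * wavenumber / temperature)


def brightness_temperature(wavenumber, radiance):
    """Inverse Planck function with a guard on the divisor.

    The radiance appears in a denominator, so a zero value would raise
    ZeroDivisionError here (the analogue of the floating divide by zero
    seen in the debug build). Non-positive radiances are treated as
    missing and None is returned instead (hypothetical convention).
    """
    if radiance <= 0.0:
        return None
    return C2 * wavenumber / math.log1p(C1 * wavenumber**3 / radiance)
```

A caller would skip or flag any channel for which None comes back, rather than passing the zero on to the conversion.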
@wx20jjung , your updated
Please merge your modified Once all changes have been committed to |
Tagging @CatherineThomas-NOAA for awareness.
@TingLei-daprediction found that gsi.x aborts in read_iasi.f90 and read_cris.f90 with the error message

forrtl: error (73): floating divide by zero

Addition of prints to the code reveals that the radiance for certain elements of the AVHRR cluster can occasionally be 0.0. CRTM routine crtm_planck_temperature converts radiances to brightness temperatures. The radiance value is a divisor in this calculation. Hence the divide by zero error.

Debug runs of gsi.x also found instances in which the cluster size for each of the 7 elements in the cluster is 0.0. The image size summed across all 7 elements is used as a divisor in both the IASI and CrIS read routines. This, too, triggers a divide by zero error.

Logic has been added to a working copy of GSI develop at a5e2a43 to address these situations. With this logic in place the debug gsi.x runs to completion with CADS active.

This issue is opened to document the problem and its resolution.
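The cluster-size failure follows the same pattern: a sum that can be zero is used as a divisor. A minimal Python sketch of a guard is given below. It is not the logic added to GSI; the function name, the weighted-mean use of the sum, and the None return value are assumptions chosen for illustration.

```python
def cluster_weighted_mean(values, sizes):
    """Size-weighted mean over cluster elements (hypothetical helper).

    The summed cluster size is the divisor. If every element reports a
    size of 0.0 the sum is zero, so the function returns None instead
    of dividing.
    """
    total = sum(sizes)
    if total <= 0.0:
        return None
    return sum(v * s for v, s in zip(values, sizes)) / total


# All 7 cluster sizes are zero: no division is attempted.
print(cluster_weighted_mean([1.0] * 7, [0.0] * 7))
```

Checking the sum once, before any per-element work, keeps the guard in one place and leaves legitimate cases with some zero-sized elements untouched.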