[bug] Sounding observation counts discrepancy between JEDI and GSI #233

Closed
delippi opened this issue Nov 22, 2024 · 12 comments

@delippi
Collaborator

delippi commented Nov 22, 2024

Current behavior (describe the bug)

When processing ADPUPA (120/220) sounding data in JEDI, the observation counts appear significantly lower compared to GSI. Specifically, in the RRFS FV3-JEDI ctest case (2022-05-26T19:00:00Z):

  • GSI observation count: 236
  • JEDI observation count: 96

With all QC filters turned off, the JEDI log has the following information:

 0: QC apdupa_airTemperature_120 airTemperature: 247 missing values.
 0: QC apdupa_airTemperature_120 airTemperature: 96 passed out of 343 observations.
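As a quick sanity check on the two log lines above, the sketch below parses them and confirms that the missing values account for every observation that did not pass. The helper name `summarize_qc` is hypothetical, and the regular expressions assume only the exact wording shown in this log excerpt.

```python
import re

# The two JEDI log lines quoted above, copied verbatim.
LOG = """\
 0: QC apdupa_airTemperature_120 airTemperature: 247 missing values.
 0: QC apdupa_airTemperature_120 airTemperature: 96 passed out of 343 observations.
"""

def summarize_qc(log_text):
    """Hypothetical helper: pull the counts out of the QC summary lines."""
    missing = int(re.search(r"(\d+) missing values", log_text).group(1))
    passed, total = map(
        int, re.search(r"(\d+) passed out of (\d+) observations", log_text).groups()
    )
    # Observations that neither passed nor were flagged as missing.
    unaccounted = total - passed - missing
    return {"missing": missing, "passed": passed, "total": total, "unaccounted": unaccounted}

summary = summarize_qc(LOG)
print(summary)
```

With QC filters off, `unaccounted` being zero means the only reason an observation is not counted as passed is that its value is flagged as missing.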

We are not sure if this is a problem during the bufr2ioda conversion, in the JEDI configuration, or in JEDI itself.

Steps to Reproduce (if applicable)

What computer are you running on?

Hera

Steps to reproduce the behavior

  1. Copy the phase 2 workspace: /scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2/jedi-assim-phase2. Only the following parts are needed:
    • run_all.sh
    • rrfs-data_fv3jedi_2022052619/
    • gsi_2022052619/
  2. Run run_all.sh:
    • RUN_GSI, RUN_JEDI, MAKE_PLOT, and use_offline_domain_check should all be "YES". After running GSI the first time, set RUN_GSI="NO".
    • Change the paths for jedi_dir and gsi_dir to wherever you copied the rrfs and gsi case directories in step 1.
    • obtype_configs="$obtype_configs adpupa_airTemperature_120.yaml" should be the only uncommented obtype_configs line.
    • Other paths should be okay.
    • gsi_2022052619/run_gsi.sh will use my GSI build (unless changed).
    • rrfs-data_fv3jedi_2022052619/run_fv3jedi.sh will use my JEDI build (unless changed).
    • bash ./run_all.sh
    • Check ./rrfs-data_fv3jedi_2022052619/conv.yaml (template yamls are in the valid_yamls path in ./run_all.sh).

Expected behavior

Observation counts between JEDI and GSI should be more similar.

Suggested Solution (if known)

Unknown at this point.

Acceptance Criteria (Definition of Done)

  • Link any relevant pull requests here:
    • PR # (will be added when a solution is found)

Dependencies

RDASApp Issue #232
HDASApp Issue NOAA-EMC/HDASApp#16

Additional information (optional)

IODA re-processing data is done: /scratch2/NCEPDEV/fv3-cam/Donald.E.Lippi/RRFSv2/ioda_processing
Relevant files include:

# The full obs file:
./ioda/ioda_adpupa.nc

# The offline domain check (dc) file:
./ioda/ioda_adpupa_dc.png
./ioda/ioda_adpupa_dc.nc

# yaml used for converter
./yaml/prepbufr_adpupa.yaml
@delippi
Collaborator Author

delippi commented Nov 25, 2024

I think I've found the problem. GSI uses nhr_assimilation=3, nhr_obsbin=3 assuming one observation bin spanning the entire 3-h period. In my JEDI DA yaml I have a shorter time window:

cost function:
  cost type: 3D-Var
  time window:
      begin: 2022-05-26T18:00:00Z
      length: PT2H

I believe that JEDI is tossing any observation with dateTime (not using timeOffset) that is outside the shorter JEDI window. I tested this hypothesis by changing all the dateTime values to be equal to the analysis time (in this case dateTime=1653591600). I now get an observation count of 188 obs. This matches my expected observation counts which I calculated by reading in the IODA observation and filtering out all observations that don't match the following criteria:

  1. ObsType==120
  2. airTemperature values were not invalid or already masked

I also checked this by counting the number of observations with a dateTime corresponding to a timeOffset of -3600 or greater. The ob count was 96, matching the counts I was seeing to start with.

@ShunLiu-NOAA @TingLei-NOAA @SamuelDegelia-NOAA @guoqing-noaa @JingCheng-NOAA @hu5970 We should discuss what the correct time window settings should be for JEDI system.
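The counting experiment described in this comment can be sketched as follows. The records in `OBS` are made up for illustration (they are not taken from ioda_adpupa.nc), the function names are hypothetical, and treating both window bounds as inclusive is an assumption rather than a statement about how JEDI handles the boundaries.

```python
from datetime import datetime, timedelta, timezone

# Analysis time and JEDI time window from the thread.
ANALYSIS = datetime(2022, 5, 26, 19, tzinfo=timezone.utc)
WINDOW_BEGIN = datetime(2022, 5, 26, 18, tzinfo=timezone.utc)
WINDOW_LENGTH = timedelta(hours=2)  # PT2H

ANALYSIS_EPOCH = int(ANALYSIS.timestamp())  # seconds since 1970-01-01T00:00:00Z

# Hypothetical records: (ObsType, airTemperature in K or None if missing, dateTime epoch).
OBS = [
    (120, 285.3, ANALYSIS_EPOCH - 3900),  # 17:55Z, before the window opens
    (120, 270.1, ANALYSIS_EPOCH - 3000),  # 18:10Z
    (120, None, ANALYSIS_EPOCH - 1200),   # missing value
    (220, 280.0, ANALYSIS_EPOCH),         # different ObsType
    (120, 290.0, ANALYSIS_EPOCH + 600),   # 19:10Z
]

def in_window(epoch):
    """Assumed rule: keep obs with begin <= dateTime <= begin + length."""
    t = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return WINDOW_BEGIN <= t <= WINDOW_BEGIN + WINDOW_LENGTH

def count_obs(obs, apply_window):
    """Count ObsType 120 records with a valid temperature, optionally windowed."""
    return sum(
        1
        for obs_type, temp, epoch in obs
        if obs_type == 120 and temp is not None and (not apply_window or in_window(epoch))
    )

valid_120 = count_obs(OBS, apply_window=False)
in_jedi_window = count_obs(OBS, apply_window=True)
# Repeat the test from the thread: overwrite every dateTime with the analysis time.
moved_to_analysis = count_obs([(t, v, ANALYSIS_EPOCH) for t, v, _ in OBS], apply_window=True)
print(ANALYSIS_EPOCH, valid_120, in_jedi_window, moved_to_analysis)
```

The computed `ANALYSIS_EPOCH` equals the dateTime value 1653591600 quoted above. In this toy data set the windowed count is smaller than the count of valid ObsType 120 records, and moving every dateTime to the analysis time recovers the full valid count, which is the same pattern as the 96 versus 188 result.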

@ShunLiu-NOAA

@delippi Thanks for this finding. With a proper dateTime or other YAML configuration, is it possible for JEDI to ingest the same number of observations as GSI?

@delippi
Collaborator Author

delippi commented Nov 25, 2024

@ShunLiu-NOAA, I'm looking into this. I think there is still something else that I'm missing. I still expect them to be able to get the exact ob counts... at least I don't see why they shouldn't!

@delippi
Collaborator Author

delippi commented Nov 25, 2024

@ShunLiu-NOAA, I think we can get the same ob counts. I've just done a test where I change the convinfo time window value to 0.5 (instead of 1.5) and adjust the YAML time window filter to do:

         # Time window filter
         - filter: Domain Check
           apply at iterations: 0,1
           where:
             - variable:
                 name: MetaData/timeOffset # units: s
               minvalue: -1800
               maxvalue:  1800

I was able to get an exact match (36 obs). I also got an exact match when using convinfo time window = 0.9 and +/-3240 in JEDI (82 obs).

I'm not sure why I get mismatching results again when I change to using convinfo time window = 1.0 and +/-3600 in JEDI... (96 vs 236 obs). Based on the previous results, 96 seems like the correct obs when using a time window of 1 hour.
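The pairing of convinfo time windows (in hours) with timeOffset bounds (in seconds) used in these tests is a units conversion. A minimal sketch follows; the function names are hypothetical, and the inclusive comparison at both bounds is an assumption about the minvalue/maxvalue check, not something confirmed in this thread.

```python
def half_width_seconds(twindow_hours):
    """Convert a convinfo time window in hours to a timeOffset bound in seconds."""
    # round() avoids floating-point residue such as 0.9 * 3600 != 3240 exactly.
    return round(twindow_hours * 3600)

def passes_time_filter(time_offset_s, twindow_hours):
    """Assumed inclusive check: -limit <= timeOffset <= +limit."""
    limit = half_width_seconds(twindow_hours)
    return -limit <= time_offset_s <= limit

# The three settings tried in the thread.
for hours in (0.5, 0.9, 1.0):
    print(hours, half_width_seconds(hours))
```

This gives the 1800, 3240, and 3600 second bounds used for the convinfo settings 0.5, 0.9, and 1.0.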

@JingCheng-NOAA

@delippi Have you checked the distribution of the obs? I found that some of my observations are near the domain boundary, and since there are slight differences between the GSI and JEDI analysis grids, it might be OK to have a slight difference in obs number.

@SamuelDegelia-NOAA
Contributor

Is the timeOffset filter performed after the initial thresholding set under the time window header? If so, does that mean we always need to make sure that the latter is larger than or equal to the timeOffset checks? If so, this could be sort of a parallel to setting time_window_max in gsiparm.anl.

@delippi
Collaborator Author

delippi commented Nov 26, 2024

> @delippi Have you checked the distribution of the obs? I found that some of my observations are near the domain boundary, and since there are slight differences between the GSI and JEDI analysis grids, it might be OK to have a slight difference in obs number.

For my case, these are soundings from about 18Z assimilated in a 19Z cycle, so there are not many. There are actually only 4 profiles, and they are not near the boundary. Here is the plot after the offline domain check, for reference (ignore that it says MPAS domain; that one is just used by default and is a little larger than the FV3 domain).
[Figure: ioda_adpupa_dc, observation locations after the offline domain check]

@delippi
Collaborator Author

delippi commented Nov 26, 2024

> Is the timeOffset filter performed after the initial thresholding set under the time window header? If so, does that mean we always need to make sure that the latter is larger than or equal to the timeOffset checks? If so, this could be sort of a parallel to setting time_window_max in gsiparm.anl.

I'm not sure it matters whether the latter occurs before or after the timeOffset filter; either way it will clip the obs to 1 h as it is currently set. But yes, I think you are right that we always need to make sure that the latter is larger than or equal to the timeOffset checks.
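The constraint agreed on here, that the time window must be at least as wide as the timeOffset bounds, can be checked with a small script. This is only a sketch: `parse_duration` and `filter_fits_window` are hypothetical helpers, the duration parser handles only simple `PT…H…M…S` strings, and timeOffset is assumed to be measured in seconds relative to the analysis time.

```python
import re
from datetime import datetime, timedelta, timezone

def parse_duration(text):
    """Parse a simple ISO 8601 duration such as PT2H or PT1H30M (hypothetical helper)."""
    m = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", text)
    if m is None or not any(m.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in m.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def parse_utc(text):
    """Parse timestamps written like 2022-05-26T18:00:00Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def filter_fits_window(begin, length, analysis, min_offset_s, max_offset_s):
    """True if the timeOffset bounds lie inside the time window (so the window never clips)."""
    start = parse_utc(begin)
    end = start + parse_duration(length)
    t0 = parse_utc(analysis)
    earliest = t0 + timedelta(seconds=min_offset_s)
    latest = t0 + timedelta(seconds=max_offset_s)
    return start <= earliest and latest <= end

# Settings from the thread: a PT2H window starting at 18Z around a 19Z analysis.
ok_half_hour = filter_fits_window("2022-05-26T18:00:00Z", "PT2H", "2022-05-26T19:00:00Z", -1800, 1800)
ok_90_min = filter_fits_window("2022-05-26T18:00:00Z", "PT2H", "2022-05-26T19:00:00Z", -5400, 5400)
print(ok_half_hour, ok_90_min)
```

With the PT2H window, a +/-1800 s filter fits inside the window, while a +/-5400 s filter (the equivalent of 1.5 h) does not, so the window rather than the filter would set the effective limit.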

@TingLei-NOAA
Contributor

@delippi "I'm not sure why I get mismatching results again when I change to using convinfo time window = 1.0 and +/-3600 in JEDI... (96 vs 236 obs). Based on the previous results, 96 seems like the correct obs when using a time window of 1 hour." I think when we talk about data assimilation window (as PT2H in jedi), for gsi, we should see how nhr_assimilation is set.

@delippi
Collaborator Author

delippi commented Nov 26, 2024

> @delippi "I'm not sure why I get mismatching results again when I change to using convinfo time window = 1.0 and +/-3600 in JEDI... (96 vs 236 obs). Based on the previous results, 96 seems like the correct obs when using a time window of 1 hour." I think when we talk about data assimilation window (as PT2H in jedi), for gsi, we should see how nhr_assimilation is set.

These are the relevant settings in GSI. Not sure if there are any more to pay attention to.

 &SETUP
   nhr_assimilation=3,
   nhr_obsbin=3,
/
 &OBS_INPUT
   time_window_max=1.5,
/

There is also

 &OBS_INPUT
   ext_sonde=.true.,
/

When I turn off ext_sonde, I can match ob counts when setting the convinfo time window to 0.99 (not 1.0). I guess it is the "logical for extended forward model on sonde data", but I will need to learn more about this option to see exactly what it is doing, and whether functionality needs to be added to JEDI or whether it is just a matter of increasing the equivalent of time_window_max in JEDI.

@delippi
Collaborator Author

delippi commented Nov 27, 2024

I have to turn off ext_sonde and set time_window_max=0.99 to match obs. I'm not sure why using time_window_max=1.0 still gives the unexpected ob counts... by a lot.

We should turn off ext_sonde, since it doesn't make sense to be using it. It is currently turned on in the RRFS retros and likely in real-time. @SamuelDegelia-NOAA It is likely too late for RRFSv1, but for setting up our new FV3/GSI case I think we ought to make this change.

@SamuelDegelia-NOAA
Contributor

Understood, thanks for finding this! I'll make the change for our new FV3 case.

delippi closed this as completed Dec 2, 2024