Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RRFS-SD reader #154

Draft
wants to merge 10 commits into
base: develop
Choose a base branch
from

Conversation

bbakernoaa
Copy link
Member

create a separate reader for RRFS-SD. This is based on _rrfs_cmaq_mm but removes many of the functions as rrfs-sd is obviously much simpler.

@bbakernoaa bbakernoaa added the enhancement New feature or request label Dec 14, 2023
@bbakernoaa
Copy link
Member Author

this addresses #152

@bbakernoaa bbakernoaa linked an issue Dec 14, 2023 that may be closed by this pull request
Copy link
Member

@zmoon zmoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just few initial comments. I'll work on the pressure calculation once I have extracted some sample data.

monetio/models/_rrfs_sd_mm.py Outdated Show resolved Hide resolved
monetio/models/_rrfs_sd_mm.py Outdated Show resolved Hide resolved
monetio/models/_rrfs_sd_mm.py Outdated Show resolved Hide resolved
monetio/models/_rrfs_sd_mm.py Outdated Show resolved Hide resolved
monetio/models/_rrfs_sd_mm.py Outdated Show resolved Hide resolved
if "ug/kg" in dset[i].attrs["units"]:
# ug/kg -> ug/m3 using dry air density
dset[i] = dset[i] * dset["pres_pa_mid"] / dset["temperature_k"] / 287.05535
dset[i].attrs["units"] = r"$\mu g m^{-3}$"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we really want a pretty version here could use unicode, e.g. μg/m³ or μg m⁻³. But maybe better to use just ASCII ug/m3 or ug m-3.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s stick with ascii. It may cause issues down the pipeline if we save the raw files out. It might not be recognized as CF convention otherwise

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I think ug m-3 best for CF (though they would want kg m-3 for canonical).




# convert "ug/kg to ug/m3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this to be optional like Jordan's?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont see why

dset['pm25_sd'] = dset.smoke + dset.dust
dset['pm25_sd'].attrs['long_name'] = 'Particulate Matter < 2.5 microns'
dset['pm25_sd'].attrs['units'] = "ug/kg"
if {'smoke', 'dust', 'coarsepm'} <= dset.data_vars.keys():
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didn't know about this syntax. Really clean and useful.

phalf(k) = a(k) + surfpres * b(k)

Mid layer pressures are calculated by:
pfull(k) = (phalf(k+1)-phalf(k))/log(phalf(k+1)/phalf(k))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure where this formula came from

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zmoon still not sure where this formula came from

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rschwant do you know? I think that you originally did this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raffaele sent it to me to describe the vertical structure in the model when I was helping with the boundary conditions. I'll forward you all the email. This is how he described it to me.

Interface pressure levels are computed using the hybrid interface formula:
p(k) = a(k) + ps * b(k)

where ps is the (actual/reference) surface pressure. These pressure levels correspond to phalf in your output dyn*.nc files, while pfull are computed as:
pfull(k) = (phalf(k+1)-phalf(k))/log(phalf(k+1)/phalf(k))

If there is an official typical way of calculating this we can use that instead too. We haven't really tested this much since we have not used the aircraft evaluation in MELDODIES MONET much in the RRFS-CMAQ model.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll compare the two methods' results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are differences but they are all less than 0.2 hPa in the column I tested.

Differences in Pa:
image

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is how Raffaele calculated it for UFS-Aerosols in the exglobal_aero_init function https://github.com/NOAA-EMC/global-workflow/blob/develop/ush/merge_fv3_aerosol_tile.py#L99-L101

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should use that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is calculation of delta p, similar to dpres in the dataset but not the same:

image

Comment on lines +152 to +161
p = dset.dp_pa.copy().load() # Have to load into memory here so can assign levels.
srfpres = dset.surfpres_pa.copy().load()
for k in range(len(dset.z)):
if surf_only:
pres_2 = 0.0 + srfpres * 0.9978736
pres_1 = 0.0 + srfpres * 1.0
else:
pres_2 = dset.ak[k + 1] + srfpres * dset.bk[k + 1]
pres_1 = dset.ak[k] + srfpres * dset.bk[k]
p[:, k, :, :] = (pres_2 - pres_1) / np.log(pres_2 / pres_1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this

Suggested change
p = dset.dp_pa.copy().load() # Have to load into memory here so can assign levels.
srfpres = dset.surfpres_pa.copy().load()
for k in range(len(dset.z)):
if surf_only:
pres_2 = 0.0 + srfpres * 0.9978736
pres_1 = 0.0 + srfpres * 1.0
else:
pres_2 = dset.ak[k + 1] + srfpres * dset.bk[k + 1]
pres_1 = dset.ak[k] + srfpres * dset.bk[k]
p[:, k, :, :] = (pres_2 - pres_1) / np.log(pres_2 / pres_1)
surfpres = dset.surfpres_pa
a = dset.ak
b = dset.bk
if surf_only:
phalf_kp1 = 0.0 + surfpres * 0.9978736
phalf_k = 0.0 + surfpres * 1.0
else:
phalf_kp1 = a.shift(z=-1) + surfpres * b.shift(z=-1)
phalf_k = a + surfpres * b
p = (phalf_kp1 - phalf_k) / np.log(phalf_kp1 / phalf_k)

going for consistency with the docstring variable names

Copy link
Member

@zmoon zmoon Dec 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since ak and bk are dataset attrs, not variables

phalf_kp1 = a[1:] + surfpres * b[1:]

but may need to make them DataArrays for the dims to match up properly and such, maybe create

a_k = xr.DataArray(ds.ak[:-1], dims="z")
a_kp1 = xr.DataArray(ds.ak[1:], dims="z")
b_k = xr.DataArray(ds.bk[:-1], dims="z")
b_kp1 = xr.DataArray(ds.bk[1:], dims="z")

@zmoon
Copy link
Member

zmoon commented Dec 19, 2023

@bbakernoaa I made an extraction (forecast hour 6 of the run you shared, 5 levels closest to surface, selected variables, etc.), now available at https://csl.noaa.gov/groups/csl4/modeldata/melodies-monet/data/example_model_data/rrfssd_example/rrfs-sd_dynf006.nc (104M) for testing.

We can set it up like I did here, which also uses data stored in that location.

@bbakernoaa
Copy link
Member Author

@bbakernoaa I made an extraction (forecast hour 6 of the run you shared, 5 levels closest to surface, selected variables, etc.), now available at https://csl.noaa.gov/groups/csl4/modeldata/melodies-monet/data/example_model_data/rrfssd_example/rrfs-sd_dynf006.nc (104M) for testing.

We can set it up like I did here, which also uses data stored in that location.

maybe we can add some netcdf tricks to compress that even further using integers instead of floats and the add_offset and scale_factor netcdf attributes

@zmoon
Copy link
Member

zmoon commented Dec 19, 2023

maybe we can add some netcdf tricks to compress that even further using integers instead of floats and the add_offset and scale_factor netcdf attributes

Perhaps, but I think it is better to keep format closer to the original. And I used NCO lossy compression, which already decreases the size a lot due to quantization, I don't know if additionally doing the int packing transform would make it any more compressible.

Geoptential height with attributes.
"""
sfc = f.surfalt_m.load()
dz = f.dz_m.load() * -1.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (loading all dz) is what Maggie's run is failing on, specifically:

Unable to allocate 129. GiB for an array with shape (37, 65, 2961, 4881) and data type float32

Like the pressure calc we should be able to write this in a Dask-friendly way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also like pressure calc, for the surf-only case there is a short version.

# These are negative in RRFS-CMAQ, but you resorted and are adding from the surface,
# so make them positive.
dz[:, 0, :, :] = dz[:, 0, :, :] + sfc # Add the surface altitude to the first model level only
z = dz.rolling(z=len(f.z), min_periods=1).sum()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think .cumsum() could be used.

@zmoon zmoon changed the title add rrfs-sd reader Add RRFS-SD reader Jan 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RRFS-SD Reader
3 participants