2025‐01‐07 (CESM Project Meeting)
Michael Levy edited this page Jan 17, 2025 · 1 revision
Teagan and Mike led a discussion on how to organize observational datasets at a CESM Project Meeting. Lots of thoughts:
- No opposition to idea of keeping obs in a centralized location
- Some concern about datasets that we are not allowed to make available publicly.
- Most data could be stored in public place, but some in a private directory with similar structure
- Only public data in `key_metrics`, but other examples could rely on private data (may only be available on NCAR machines)
- Metadata is very important! Users need to know origin of data, any changes made, etc
- Lots of obs are already in the RDA
- Climate Data Guide, for example
- Keep data in current location, but link to data commons? (will this cause issues with repository?)
- Is inputdata a reasonable place to keep data?
- Some data is already in inputdata (seaice SST dataset is used in seaice notebooks, and is also forcing for F-compsets)
- Should we have a separate repository for processing scripts?
- This is a good idea, but comes with additional cost (someone needs to ensure those scripts continue to work as Python evolves)
- OMWG has repository of scripts used for `tx2_3v2` grid: datasets, validation, etc; good example of documentation needed for script repo?
- Data volume a concern?
- Monthly 1° datasets are trivial compared to daily (or high-resolution spatial) data
- There's a benefit to interpolating datasets to model output, but that would be in addition to native grid; CTSM uses `mksurfdata` for batch interpolation, can CUPiD use something similar?
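
To put rough numbers on the data volume point, here is a back-of-the-envelope sketch of the uncompressed size of a single 2D variable at different resolutions. Nothing here comes from the meeting itself: the helper name `dataset_size_bytes`, the 40-year record length, 4-byte (float32) values, and the 0.25° grid standing in for "high resolution" are all assumptions chosen for illustration, and leap days are ignored.

```python
def dataset_size_bytes(nlat, nlon, steps_per_year, years, bytes_per_value=4):
    """Uncompressed size of one 2D variable stored as a (time, lat, lon) array."""
    return nlat * nlon * steps_per_year * years * bytes_per_value


YEARS = 40  # assumed record length, for illustration only

# Assumed global grids: 1 degree is 180 x 360, 0.25 degree is 720 x 1440
monthly_1deg = dataset_size_bytes(180, 360, 12, YEARS)
daily_1deg = dataset_size_bytes(180, 360, 365, YEARS)
daily_quarter_deg = dataset_size_bytes(720, 1440, 365, YEARS)

for label, size in [
    ("monthly, 1 degree", monthly_1deg),
    ("daily, 1 degree", daily_1deg),
    ("daily, 0.25 degree", daily_quarter_deg),
]:
    print(f"{label}: {size / 1e9:.2f} GB")
```

Under these assumptions, going from monthly to daily multiplies the volume by about 30 (365 / 12), and going from 1° to 0.25° multiplies it by another 16. Storing an interpolated copy alongside the native grid adds the size of that copy on the target grid.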