Merge pull request #202 from NASA-IMPACT/dev
Add support for formats (`echo-g`, `umm-g`, `umm-c`) and custom cmr host
slesaad authored Jul 21, 2022
2 parents fe9eab7 + 0d6bcf1 commit a49734c
Showing 58 changed files with 15,362 additions and 3,284 deletions.
Binary file added .DS_Store
Binary file not shown.
141 changes: 141 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,147 @@
# CHANGELOG

## v1.2.0

- Added support for ECHO10 Granule, UMM-G (UMM-JSON Granule) and UMM-C (UMM-JSON Collection) metadata
- Added support for custom CMR host
- Added support for some UMM fields that look like the following:

```json
"ContactMechanisms": [
{
"Type": "Telephone",
"Value": "605-594-6116"
},
{
"Type": "U.S. toll free",
"Value": "866-573-3222"
},
{
"Type": "Email",
"Value": "[email protected]"
}
]
```

To select the entry whose `Type` is "Email", a user would specify `ContactMechanisms/Value?Type=Email` as the field in the `rule_mapping`.
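
The path syntax above can be illustrated with a minimal sketch. This is not pyQuARC's actual resolver: the function name `get_field_values`, the sample email address, and the assumption that the `?Key=Value` filter applies to the object holding the final path segment are all hypothetical.

```python
def get_field_values(metadata, field_path):
    """Return every value at a path such as 'ContactMechanisms/Value?Type=Email'.

    Hypothetical helper: the part after '?' is treated as a 'Key=Value'
    filter on the dict that holds the last path segment.
    """
    path, _, condition = field_path.partition("?")
    keys = path.split("/")
    filter_key, _, filter_value = condition.partition("=")

    def walk(node, remaining):
        # Lists are traversed transparently, collecting matches from each item
        if isinstance(node, list):
            results = []
            for item in node:
                results.extend(walk(item, remaining))
            return results
        if not remaining:
            return [node]
        if not isinstance(node, dict) or remaining[0] not in node:
            return []
        # Apply the optional filter on the dict that owns the final key
        if len(remaining) == 1 and condition and node.get(filter_key) != filter_value:
            return []
        return walk(node[remaining[0]], remaining[1:])

    return walk(metadata, keys)


# Sample metadata shaped like the changelog example (placeholder email)
metadata = {
    "ContactMechanisms": [
        {"Type": "Telephone", "Value": "605-594-6116"},
        {"Type": "U.S. toll free", "Value": "866-573-3222"},
        {"Type": "Email", "Value": "user@example.org"},
    ]
}

email_values = get_field_values(metadata, "ContactMechanisms/Value?Type=Email")
all_values = get_field_values(metadata, "ContactMechanisms/Value")
```

Without the `?Type=Email` suffix, the same path would return all three `Value` entries.
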
- Every field specified in a datetime check that involves a comparison must have a corresponding `datetime_format_check` entry; otherwise the check won't run
- Added support for `data` specific to format type. This will take precedence over the generic `data`. Example:

```json
"get_data_url_check": {
"rule_name": "GET DATA URL check",
"fields_to_apply": {
"dif10": [
{
"fields": [
"DIF/Related_URL"
],
"data": [
["URL_Content_Type", "Type"]
]
}
],
"umm-json": [
{
"fields": [
"RelatedUrls"
]
}
]
},
"data": [
["Type"]
],
"severity": "error",
"check_id": "get_data_url_check"
},
```
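
The precedence rule in the example above could be sketched as follows. This is an illustration only, not the library's implementation: `resolve_data` is a hypothetical name, and the rule dictionary simply mirrors the JSON shown in the changelog.

```python
def resolve_data(rule, metadata_format):
    """Return (fields, data) pairs for one rule and one metadata format.

    Hypothetical helper: format-specific `data` wins when present,
    otherwise the rule's generic `data` is used.
    """
    generic_data = rule.get("data", [])
    entries = rule.get("fields_to_apply", {}).get(metadata_format, [])
    return [
        (entry["fields"], entry.get("data", generic_data))
        for entry in entries
    ]


# Rule definition mirroring the changelog example
rule = {
    "rule_name": "GET DATA URL check",
    "fields_to_apply": {
        "dif10": [
            {"fields": ["DIF/Related_URL"], "data": [["URL_Content_Type", "Type"]]}
        ],
        "umm-json": [{"fields": ["RelatedUrls"]}],
    },
    "data": [["Type"]],
    "severity": "error",
    "check_id": "get_data_url_check",
}

dif10_args = resolve_data(rule, "dif10")
umm_args = resolve_data(rule, "umm-json")
```

Here the `dif10` entry would carry its own `data`, while the `umm-json` entry would fall back to the generic `[["Type"]]`.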

- Prioritized field-level dependencies when resolving check dependencies (dependencies from fields take precedence over dependencies from data)
- Added collection `version` to collection datetime validation with granules for accuracy
- Allowed DIF10 datetime fields to support ISO Date (not just ISO Datetime)
- Generalized and renamed `datetime_compare` check to `date_compare`
- Updated auto GCMD keywords downloader to use the new GCMD url
- Added `pyquarc_errors` to the response, which will contain any errors that were thrown as exceptions during validation
- Added checks that validate granule fields against the corresponding collection fields


### List of added and updated checks

- GET DATA URL Check
- Data Center Long Name Check
- URL Description Uniqueness Check
- Periodic Duration Unit Check
- Characteristic Name Uniqueness Check UMM
- Range Date Time Logic Check
- Range Date Time Logic Check
- Project Date Time Logic Check
- Project Date Time Logic Check
- Periodic Date Time Logic Check
- Datetime ISO Format Check
- URL Health and Status Check
- Delete Time Check
- DOI Missing Reason Enumeration Check
- Processing Level Description Length Check
- UMM Controlled Collection State List
- Ends at present flag logic check
- Ends at present flag presence check
- Data Contact Role Enumeration Check
- Controlled Contact Role Check
- Characteristic Description Length Check
- Organization Longname GCMD Check
- Instrument Short/Longname Consistency Check
- Instrument Shortname GCMD Check
- Instrument Long Name Check
- Platform Shortname GCMD Check
- Data Format GCMD Check
- Platform Longname GCMD Check
- Platform Type GCMD Check
- Campaign Short/Long name consistency Check
- Campaign Short Name GCMD Check
- Campaign Long Name GCMD Check
- Collection Data Type Enumeration Check
- Bounding Coordinates Logic Check
- Vertical Spatial Domain Type Check
- Spatial Coverage Type Check
- Campaign Name Presence Check
- Spatial Extent Requirement Fulfillment Check
- Collection Progress Related Fields Consistency Check
- Online Resource Type GCMD Check
- Characteristic Name Uniqueness Check
- Ending Datetime validation against granules
- Beginning Datetime validation against granules
- ISO Topic Category Vocabulary Check
- Temporal Extent Requirement Check
- FTP Protocol Check
- Citation Version Check
- Default Date Check
- Online Description Presence Check
- IDN Node Shortname GCMD Check
- Chrono Unit GCMD Check
- Platform Type Presence Check
- Horizontal Data Resolution Unit Controlled Vocabulary Check
- Sensor number check
- Data Center Shortname GCMD Check
- Characteristics Data Type Presence Check
- Platform Type Presence Check
- Platform Longname Presence Check
- Granule Platform Short Name Check
- Horizontal Data Resolution Unit Controlled Vocabulary Check
- Periodic Duration Unit Check
- URL Description Uniqueness Check
- Online Resource Description Uniqueness Check
- Online Access Description Uniqueness Check
- Metadata Update Time Logic Check
- Granule Single Date Time Check
- Granule Project Short Name Check
- Granule Sensor Short Name Check
- Validate Granule Data Format Against Collection Check
- Granule Data Format Presence Check


## v1.1.5

- Added reader for specific columns from GCMD csvs
- Fixed bug to handle cases when there are multiple entries for same shortname but the first entry has missing long name

14 changes: 11 additions & 3 deletions CHECKS.md
@@ -73,10 +73,18 @@ Checks if the field value is one of the controlled keywords; provide the control

Checks if the doi provided resolves to a valid document.

#### `presence_check`
#### `one_item_presence_check`

Checks if one of the given fields is populated.

#### `uniqueness_check`

Checks if the field values are unique.

#### `count_check`

Checks if the field value that is a count of fields matches the actual count of the fields.
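
The three generic checks described above can be read as simple predicates. The sketch below is a hedged illustration of that reading, not the signatures used in the repository: the function bodies, argument shapes, and the `is_populated` helper are assumptions.

```python
def is_populated(value):
    """Hypothetical helper: a value counts as populated if it is not None or blank."""
    return value is not None and str(value).strip() != ""


def one_item_presence_check(*field_values):
    """Pass if at least one of the given fields is populated."""
    return any(is_populated(value) for value in field_values)


def uniqueness_check(field_values):
    """Pass if no value appears more than once (values assumed hashable)."""
    return len(set(field_values)) == len(field_values)


def count_check(declared_count, items):
    """Pass if the declared count matches the actual number of items."""
    return int(declared_count) == len(items)


# Example inputs (made up for illustration)
presence_ok = one_item_presence_check(None, "", "MODIS")
unique_ok = uniqueness_check(["Get data", "Browse image", "Get data"])
count_ok = count_check("2", ["sensor-a", "sensor-b"])
```
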

### Miscellaneous Checks

#### `bounding_coordinate_logic_check`
@@ -144,7 +152,7 @@ Check to make sure that an OPeNDAP access URL is not provided in the `Online Acc

Check to make sure the fields aren't populated like this:

-```
+```plaintext
Collection/Contacts/Contact/ContactPersons/ContactPerson/FirstName: "User"
Collection/Contacts/Contact/ContactPersons/ContactPerson/MiddleName: "null"
Collection/Contacts/Contact/ContactPersons/ContactPerson/LastName: "Services"
@@ -204,7 +212,7 @@ Checks whether the value adheres to GCMD, specifically the project list short na

Checks whether the campaign (project) short name and long name GCMD keywords are consistent: basically that they belong to the same row.

-#### `data_center_short_name_gcmd_check`
+#### `organization_short_name_gcmd_check`

Checks whether the value adheres to GCMD, specifically the provider list short name column.
