Merge branch 'develop' into feature/acquisition
Showing 95 changed files with 33,021 additions and 29 deletions.
@@ -0,0 +1,67 @@
# DCAT-US - mdTranslator proposed mappings
## Quick references
- DCAT-US [element definitions](https://resources.data.gov/resources/dcat-us/)
- DCAT-US v1.1 [catalog.json schema](https://resources.data.gov/schemas/dcat-us/v1.1/schema/catalog.json)
- DCAT-US v1.1 [dataset.json schema](https://resources.data.gov/schemas/dcat-us/v1.1/schema/dataset.json)
- DCAT-US v1.1 [JSON-LD catalog.json schema](https://resources.data.gov/schemas/dcat-us/v1.1/schema/catalog.jsonld)
- [Element crosswalks](https://resources.data.gov/resources/podm-field-mapping/#field-mappings) to other standards

## DCAT-US - mdTranslator

### Always (always required)

| Field Name | DCAT Name | Condition | mdJson Source |
| --- | --- | --- | --- |
| Title | title | exists | citation.title |
| Description | description | exists | resourceInfo.abstract |
| Tags | keyword | exists | [resourceInfo.keyword.keyword[0, n] *flatten*] |
| Last Update | modified | if resourceInfo.citation.date[any].dateType = "lastUpdated" or "lastRevised" or "revision" | resourceInfo.citation.date[most recent] |
| Publisher | publisher{name} | if citation.responsibleParty[any].role = "publisher" | contactId -> contact.name where isOrganization IS TRUE |
| | | if exists resourceDistribution.distributor.contact | [first contact] contactId -> contact.name where isOrganization IS TRUE |
| Publisher Parent Organization | publisher{subOrganizationOf} | if citation.responsibleParty[any].role = "publisher" and exists contactId -> memberOfOrganization[0] and isOrganization IS TRUE | contactId -> contact.name |
| | | if exists resourceDistribution.distributor.contact and exists contactId -> memberOfOrganization[0] and isOrganization IS TRUE | contactId -> contact.name |
| Contact Name | contactPoint{fn} | exists | resourceInfo.pointOfContact.parties[0].contactId -> contact.name |
| Contact Email | contactPoint{email} | exists | resourceInfo.pointOfContact.parties[0].contactId -> contact.eMailList[0] |
| Unique Identifier | identifier | if resourceInfo.citation.identifier.namespace = "DOI" | resourceInfo.citation.onlineResource.uri |
| | | if "DOI" within resourceInfo.citation.onlineResource.uri | resourceInfo.citation.onlineResource.uri |
| Public Access Level | accessLevel | [*extend codelist MD_RestrictionCode to include "public", "restricted public", "non-public"*] <br> if resourceInfo.constraints.legal[any] one of {"public", "restricted public", "non-public"} | resourceInfo.constraints.legal[first]. Also resourceInfo.constraint.security.classification [[MD_ClassificationCode](https://mdtools.adiwg.org/#codes-page?c=iso_classification)] |
| Bureau Code | bureauCode | | [*extend role codelist to include "bureau", extend namespace codelist to include "bureauCode"*] <br> for each resourceInfo.citation.responsibleParty[any] role = "bureau" <br>contactId -> contact.identifier [*identifier must conform to https://resources.data.gov/schemas/dcat-us/v1.1/omb_bureau_codes.csv*] |
| Program Code | programCode | | [*add new element of program resourceInfo.programCode, add new codelist of programCode*] <br> resourceInfo.program[0,n] |

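The "Tags" row asks for the nested keyword groups to be flattened into a single list. A minimal sketch of that step follows, assuming a simplified, symbol-keyed hash shaped like the mdJson keyword section. The helper name `flatten_keywords` and the de-duplication of repeated terms are assumptions for illustration, not part of the proposal.

```ruby
# Hypothetical, simplified fragment shaped like resourceInfo.keyword.
resource_info = {
  keyword: [
    { keyword: [{ keyword: 'biota' }, { keyword: 'oceans' }] },
    { keyword: [{ keyword: 'biota' }, { keyword: 'Alaska' }] }
  ]
}

# Flatten every keyword group into one list of strings for DCAT-US "keyword".
# Dropping nils and duplicates is an assumption, not stated in the mapping table.
def flatten_keywords(resource_info)
  groups = resource_info[:keyword] || []
  groups.flat_map { |group| (group[:keyword] || []).map { |k| k[:keyword] } }
        .compact
        .uniq
end

keywords = flatten_keywords(resource_info)
```

A record with no keyword section yields an empty array, which a writer could treat as "element does not exist".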
### If-Applicable (required if it exists)

| Field Name | DCAT Name | Condition | mdJson Source |
| --- | --- | --- | --- |
| Distribution | distribution | if exists resourceDistribution[any] and if exists resourceDistribution.distributor[any].transferOption[any].onlineOption[any].uri | for each resourceDistribution[0, n] where exists resourceDistribution.distributor.transferOption.onlineOption.uri then <br> {description, accessURL, downloadURL, mediaType, title} |
| - Description | distribution.description | exists | resourceDistribution.description |
| - AccessURL | distribution.accessURL | if citation.onlineResources[first occurrence].uri [path ends in ".html"] [*required if applicable*] | resourceDistribution.distributor.transferOption.onlineOption.uri |
| - DownloadURL | distribution.downloadURL | if citation.onlineResources[first occurrence].uri [path does not end in ".html"] [*required if applicable*] | resourceDistribution.distributor.transferOption.onlineOption.uri |
| - MediaType | distribution.mediaType | | [*add codelist of "dataFormat"*] <br> transferOption.distributionFormat.formatSpecification.title [dataFormat] [*dataFormat should conform to: https://www.iana.org/assignments/media-types/media-types.xhtml*] |
| - Title | distribution.title | exists | resourceDistribution.distributor.transferOption.onlineOption.name |
| License | license | [*add resourceInfo.constraint.reference to mdEditor*] <br> if exists resourceInfo.constraint.reference[0] | resourceInfo.constraint.reference[0] <br> |
| | | else | https://creativecommons.org/publicdomain/zero/1.0/ <br> [*allows author to identify a license to use, or default to CC0 if none provided, CC0 would cover international usage as opposed to publicdomain*] <br> [*others: http://www.usa.gov/publicdomain/label/1.0/, http://opendatacommons.org/licenses/pddl/1.0*] |
| Rights | rights | if constraint.accessLevel in {"restricted public", "non-public"} | resourceInfo.constraint.releasability.statement + " " + each constraint.releasability.disseminationConstraint[0, n] |
| Endpoint | *removed* | *ignored* | *ignored* |
| Spatial | spatial | if exists resourceInfo.extents[0].geographicExtents[0].boundingBox | boundingBox.eastLongitude + "," + boundingBox.southLatitude + "," + boundingBox.westLongitude + "," + boundingBox.northLatitude [*decimal degrees*] |
| | | else | if exists resourceInfo.extents[0].geographicExtents[0].geographicElement[0].type = "point" then <br> geographicElement[0].coordinate[1] + "," + geographicElement[0].coordinate[0] [*lat, long decimal degrees*] |
| Temporal | temporal | if exists resourceInfo.extent[0].temporalExtent[0] then <br> if exists temporalExtent[0].timePeriod.startDate and exists temporalExtent[0].timePeriod.endDate | temporalExtent[0].timePeriod.startDate + "/" + temporalExtent[0].timePeriod.endDate |
| | | if exists temporalExtent[0].timePeriod.startDate and not exists temporalExtent[0].timePeriod.endDate | temporalExtent[0].timePeriod.startDate |
| | | if not exists temporalExtent[0].timePeriod.startDate and exists temporalExtent[0].timePeriod.endDate | temporalExtent[0].timePeriod.endDate <br> [*may need revisiting relative to decision on date only formatting*] |

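The Spatial and Temporal rows describe simple string assembly with fallbacks. The sketch below shows one way those rules could be written, assuming symbol-keyed hashes for the time period and bounding box. The helper names `temporal_string` and `spatial_string` are hypothetical, and the coordinate order is copied from the table as written rather than checked against the DCAT-US schema.

```ruby
# Build the DCAT-US "temporal" value from a time period hash.
# Both dates present: "start/end"; only one present: that date; neither: nil.
def temporal_string(time_period)
  start_date = time_period[:startDate]
  end_date = time_period[:endDate]
  return "#{start_date}/#{end_date}" if start_date && end_date

  start_date || end_date
end

# Build the "spatial" value from a bounding box, in the order the table lists:
# east, south, west, north (decimal degrees).
def spatial_string(bbox)
  [
    bbox[:eastLongitude],
    bbox[:southLatitude],
    bbox[:westLongitude],
    bbox[:northLatitude]
  ].join(',')
end

period = { startDate: '2020-01-01', endDate: '2020-12-31' }
bbox = { westLongitude: -170.0, eastLongitude: -140.0, southLatitude: 55.5, northLatitude: 71.25 }

temporal = temporal_string(period)
spatial = spatial_string(bbox)
```

Returning `nil` when neither date exists lets the caller omit the element, which matches the "required if it exists" heading of this table.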
### No (not required)

| Field Name | DCAT Name | Condition | mdJson Source |
| --- | --- | --- | --- |
| Release Date | issued | if resourceInfo.citation.date[any].dateType = "publication" or "distributed" | resourceInfo.citation.date[earliest] |
| Frequency | accrualPeriodicity | | [*ISO codelist MD_maintenanceFrequency can be used; several of its codes correspond, at least partially, to the accrualPeriodicity codelist. A column of ISO 8601 code equivalents could be added to MD_maintenanceFrequency to provide the coding expected by https://resources.data.gov/schemas/dcat-us/v1.1/iso8601_guidance/#accrualperiodicity, community valuation should be determined*] |
| Language | language | | [*language codelist could be used but needs to be bound with country corresponding to the RFC 5646 format https://datatracker.ietf.org/doc/html/rfc5646, such as "en-US", community valuation should be determined*] |
| Data Quality | dataQuality | | [*this is a boolean to indicate whether data "conforms" to agency standards, value seems negligible*] |
| Category | theme | where resourceInfo.keyword[any].thesaurus.title = "ISO Topic Category" | [resourceInfo.keyword.keyword[0, n] *flatten*] |
| Related Documents | references | | associatedResource[all].resourceCitation.onlineResource[all].uri + additionalDocumentation[all].citation[all].onlineResource[all].uri [*comma separated*] |
| Homepage URL | landingPage | [*Add code "landingPage" to CI_OnlineFunctionCode*] <br> if resourceInfo.citation.onlineResource[any].function = "landingPage" | resourceInfo.citation.onlineResource.uri |
| Collection | isPartOf | for each associatedResource[0, n].initiativeType = "collection" and associatedResource.associationType = "collectiveTitle" | associatedResource.resourceCitation[0].uri |
| System of Records | systemOfRecords | [*Add code "sorn" to DS_InitiativeTypeCode*] <br> for each associatedResource[0, n].initiativeType = "sorn" | associatedResource.resourceCitation[0].uri |
| Primary IT Investment | primaryITInvestmentUII | | [*Links data to an IT investment identifier relative to Exhibit 53 docs, community valuation should be determined*] |
| Data Dictionary | describedBy | if dataDictionary.dictionaryIncludedWithResource IS NOT TRUE and citation.onlineResource[0].uri exists | dataDictionary.citation.onlineResource[0].uri |
| Data Dictionary Type | describedByType | | [*For simplicity, leave blank implying html page, community decision needed whether to support other format types using mime type and in the form of "application/pdf"*] |
| Data Standard | conformsTo | | [*Currently not able to identify the schema standard the data conforms to, though this has been proposed. Should this be built and there is community decision to support it, then it can be mapped*] |
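
The "Last Update" and "Release Date" rows both pick one date out of citation.date by type: the most recent for `modified`, the earliest for `issued`. A minimal sketch of that selection follows, assuming each date is a symbol-keyed hash with ISO 8601 date strings. The helper `pick_date` is hypothetical, and restricting the choice to dates of the matching types is one reading of the table, not something it states outright.

```ruby
require 'date'

# Choose one citation date among those whose dateType is in `types`.
# pick is :latest or :earliest; returns an ISO 8601 string or nil.
def pick_date(dates, types, pick)
  candidates = dates.select { |d| types.include?(d[:dateType]) }
                    .map { |d| Date.parse(d[:date]) }
  return nil if candidates.empty?

  chosen = pick == :latest ? candidates.max : candidates.min
  chosen.iso8601
end

# Hypothetical citation dates for illustration.
dates = [
  { date: '2019-05-01', dateType: 'publication' },
  { date: '2021-03-15', dateType: 'revision' },
  { date: '2020-07-04', dateType: 'lastUpdated' },
  { date: '2018-01-10', dateType: 'distributed' }
]

modified = pick_date(dates, %w[lastUpdated lastRevised revision], :latest)
issued = pick_date(dates, %w[publication distributed], :earliest)
```

If the intended rule is instead "most recent of all citation dates once any matching type exists", only the `select` step would change.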
@@ -0,0 +1,98 @@
require 'jbuilder'
require_relative 'version'
require_relative 'sections/dcat_us_dcat_us'

module ADIWG
  module Mdtranslator
    module Writers
      module Dcat_us

        def self.startWriter(intObj, responseObj)
          # set the contact array for use by the writer
          @contacts = intObj[:contacts]

          # set output flag for null properties
          Jbuilder.ignore_nil(!responseObj[:writerShowTags])

          # set the format of the output file based on the writer specified
          responseObj[:writerOutputFormat] = 'json'
          responseObj[:writerVersion] = ADIWG::Mdtranslator::Writers::Dcat_us::VERSION

          # write the dcat_us metadata record
          metadata = Dcat_us.build(intObj, responseObj)

          # set writer pass to true if no messages
          # false or warning state will be set by writer code
          responseObj[:writerPass] = true if responseObj[:writerMessages].empty?

          # encode the metadata target as JSON
          metadata.target!
        end

        # return the contact hash at the given array index,
        # or an empty hash when the index is out of range
        def self.get_contact_by_index(contactIndex)
          @contacts[contactIndex] || {}
        end

        # find contact in contact array by contactId and return the contact hash,
        # or an empty hash when no contact matches
        def self.get_contact_by_id(contactId)
          @contacts.find { |hContact| hContact[:contactId] == contactId } || {}
        end

        # find contact in contact array by contactId and return the contact index,
        # or nil when no contact matches
        def self.get_contact_index_by_id(contactId)
          @contacts.index { |hContact| hContact[:contactId] == contactId }
        end

        # ignore jBuilder object mapping when array is empty
        def self.json_map(collection = [], _class)
          return nil if collection.nil? || collection.empty?

          collection.map { |item| _class.build(item).attributes! }
        end

        # find all nested objects in 'obj' that contain the element 'ele'
        # 'obj' may be a hash or an array; when it is an array, 'key' holds
        # each element and 'value' is nil
        def self.nested_objs_by_element(obj, ele, excludeList = [])
          aCollected = []
          obj.each do |key, value|
            # skip any key named in the exclude list
            next if excludeList.any? { |exclude| key == exclude.to_sym }

            if key == ele.to_sym
              aCollected << obj
            elsif obj.is_a?(Array)
              # array element: recurse into it when it is itself a collection
              if key.respond_to?(:each)
                aReturn = nested_objs_by_element(key, ele, excludeList)
                aCollected = aCollected.concat(aReturn) unless aReturn.empty?
              end
            elsif obj[key].respond_to?(:each)
              # hash value: recurse into nested hashes and arrays
              aReturn = nested_objs_by_element(value, ele, excludeList)
              aCollected = aCollected.concat(aReturn) unless aReturn.empty?
            end
          end
          aCollected
        end

      end
    end
  end
end
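
The recursive search in `nested_objs_by_element` can be hard to follow in diff form. The standalone sketch below shows the same general idea (walk nested hashes and arrays, collect every hash that has a given key, skip excluded branches). It is a simplified variant written for illustration, not the writer's code: the name `hashes_with_key` is hypothetical, and unlike the original it also descends into the value of a matching key.

```ruby
# Collect every hash nested anywhere in obj that has the given key.
# Keys listed in exclude are not descended into.
def hashes_with_key(obj, key, exclude = [])
  found = []
  case obj
  when Hash
    found << obj if obj.key?(key)
    obj.each do |k, v|
      next if exclude.include?(k)

      found.concat(hashes_with_key(v, key, exclude))
    end
  when Array
    obj.each { |item| found.concat(hashes_with_key(item, key, exclude)) }
  end
  found
end

# Hypothetical record fragment for illustration.
doc = {
  citation: { responsibleParty: [{ contactId: 'a' }, { contactId: 'b' }] },
  pointOfContact: { contactId: 'c' },
  skipped: { contactId: 'z' }
}

matches = hashes_with_key(doc, :contactId, [:skipped])
contact_ids = matches.map { |h| h[:contactId] }
```

Each returned hash could then be passed to a lookup such as `get_contact_by_id` to resolve the referenced contact.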
@@ -0,0 +1,10 @@

## dcat_us

### Supported versions

> 0.0.x (dcat_us is not currently versioned)

### Writer for DCAT-US v1.1 (U.S. profile of the Data Catalog Vocabulary, DCAT)

55 changes: 55 additions & 0 deletions lib/adiwg/mdtranslator/writers/dcat_us/sections/dcat_us_access_level.rb
@@ -0,0 +1,55 @@
require 'jbuilder'

module ADIWG
  module Mdtranslator
    module Writers
      module Dcat_us
        module AccessLevel

          def self.build(intObj)

            # codes treated as public; listed for reference only, since any code
            # not matched by the two arrays below falls through to 'public'
            publicArray = ['unclassified', 'unrestricted', 'licenseUnrestricted', 'licenseEndUser']
            nonPublicArray = ['restricted', 'confidential', 'secret', 'topSecret', 'forOfficialUseOnly', 'protected', 'intellectualPropertyRights', 'otherRestrictions', 'private', 'statutory', 'traditionalKnowledge', 'personallyIdentifiableInformation']
            restrictedPublicArray = ['sensitiveButUnclassified', 'limitedDistribution', 'copyright', 'patent', 'patentPending', 'trademark', 'license', 'licenseDistributor', 'in-confidence', 'threatenedOrEndangered']

            resourceInfo = intObj[:metadata][:resourceInfo]
            # default to an empty array so a record without constraints
            # returns 'public' instead of raising NoMethodError on nil
            constraints = resourceInfo[:constraints] || []
            legalConstraints = constraints.select { |constraint| constraint[:type] == 'legal' }
            securityConstraints = constraints.select { |constraint| constraint[:type] == 'security' }

            accessLevelCodes = []

            # Gather codes from security constraints and legal constraints
            securityConstraints.each do |securityConstraint|
              code = securityConstraint.dig(:securityConstraint, :classCode)
              accessLevelCodes << code unless code.nil?
            end
            legalConstraints.each do |legalConstraint|
              codes = legalConstraint.dig(:legalConstraint, :accessCodes)
              accessLevelCodes.push(*codes)
            end

            # return access level that is most restrictive
            accessLevelCodes.uniq!
            return 'non-public' if accessLevelCodes.any? { |code| nonPublicArray.include?(code) }
            return 'restricted public' if accessLevelCodes.any? { |code| restrictedPublicArray.include?(code) }

            'public'
          end

        end
      end
    end
  end
end
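
The access-level writer reduces a mix of security and legal codes to the single most restrictive DCAT-US value. The sketch below isolates that "most restrictive wins" rule as a standalone function. The name `most_restrictive_level` is hypothetical and the code lists are shortened subsets of the arrays in the file above, chosen only to keep the example small.

```ruby
# Map a list of constraint codes to one DCAT-US accessLevel value.
# Precedence: non-public, then restricted public, then public.
def most_restrictive_level(codes)
  # Shortened subsets of the writer's code lists, for illustration only.
  non_public = %w[restricted confidential secret topSecret]
  restricted_public = %w[sensitiveButUnclassified limitedDistribution copyright license]

  codes = codes.compact.uniq
  return 'non-public' if codes.any? { |code| non_public.include?(code) }
  return 'restricted public' if codes.any? { |code| restricted_public.include?(code) }

  'public'
end

level_mixed = most_restrictive_level(%w[unclassified copyright secret])
level_middle = most_restrictive_level(%w[unclassified copyright])
level_open = most_restrictive_level([])
```

Checking the non-public list first means a single restrictive code overrides any number of permissive ones, and an empty or unrecognized set of codes defaults to `public`.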