From e9065b9e3b8a8bd8365b1960eb3657ca410fb4d2 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 27 Nov 2023 11:13:13 -0600 Subject: [PATCH 01/72] `data_dictionary` to `fields` --- .../docs/assets/templates/jsontemplate.md | 2 +- .../jsonschema-csvtemplate-fields.html | 2 +- ...onschema-jsontemplate-data-dictionary.html | 58 +++++++++---------- ...jsonschema-jsontemplate-data-dictionary.md | 2 +- .../examples/valid/template_submission.json | 2 +- .../valid/template_submission_minimal.json | 2 +- .../schemas/dictionary/data-dictionary.yaml | 4 +- .../frictionless/csvtemplate/fields.json | 14 ++--- .../schemas/jsonschema/data-dictionary.json | 4 +- .../templates/template_submission.json | 2 +- 10 files changed, 46 insertions(+), 46 deletions(-) diff --git a/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md b/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md index b5e769a..56c706e 100644 --- a/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md @@ -5,7 +5,7 @@ {% for itemname,item in schema.properties.items() %} ### `{{ itemname }}` _({{ item.type }}{{ ',required' if itemname in schema.required }})_ {{ item.description }} -{% if itemname == 'data_dictionary' %} +{% if itemname == 'fields' %} {{ item['items']['description'] }} #### Properties for each record {% set schema = item['items'] %} diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 6d9bdda..f972718 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -28,4 +28,4 @@

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 20acb18..768da2c 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,20 +1,20 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
-
"PROMIS"
-
"Substance use"
-
"Medical History"
-
"Sleep questions"
-
"Physical activity"
-

Type: string

The name of a variable (i.e., field) as it appears in the data.

Type: string

The human-readable title or label of the variable.


Examples:

"My Variable"
-
"Gender identity"
-

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
-
"What is the highest grade or level of school you have completed or the highest degree you have received?"
-

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Definitions:

  • number (A numeric value with optional decimal places. (e.g., 3.14))
  • integer (A whole number without decimal places. (e.g., 42))
  • string (A sequence of characters. (e.g., \"test\"))
  • any (Any type of data is allowed. (e.g., true))
  • boolean (A binary value representing true or false. (e.g., true))
  • date (A specific calendar date. (e.g., \"2023-05-25\"))
  • datetime (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\"))
  • time (A specific time of day. (e.g., \"10:30:00\"))
  • year (A specific year. (e.g., 2023))
  • yearmonth (A specific year and month. (e.g., \"2023-05\"))
  • duration (A length of time. (e.g., \"PT1H\"))
  • geopoint (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
For example: If type is "string", then see the String formats.
If type is "date", "datetime", or "time", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for Date,
Datetime,
or Time) - If you want to specify a date-like variable using standard Python/C strptime syntax, see here for details.
See here for more information about appropriate format values by variable type.

[Additional information]

Date Formats (date, datetime, time type variable):

A format for a date variable (date,time,datetime).
default: An ISO8601 format string.
any: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.

{PATTERN}: The value can be parsed according to {PATTERN},
which MUST follow the date formatting syntax of
C / Python strftime such as:

  • "%Y-%m-%d (for date, e.g., 2023-05-25)"
  • "%Y%m%d (for date without dashes, e.g., 20230525)"
  • "%Y-%m-%dT%H:%M:%S (for datetime, e.g., 2023-05-25T10:30:45)"
  • "%Y-%m-%dT%H:%M:%SZ (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)"
  • "%Y-%m-%dT%H:%M:%S%z (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)"
  • "%Y-%m-%dT%H:%M (for datetime without seconds, e.g., 2023-05-25T10:30)"
  • "%Y-%m-%dT%H (for datetime without minutes and seconds, e.g., 2023-05-25T10)"
  • "%H:%M:%S (for time, e.g., 10:30:45)"
  • "%H:%M:%SZ (for time with UTC timezone, e.g., 10:30:45Z)"
  • "%H:%M:%S%z (for time with timezone offset, e.g., 10:30:45+0300)"

String formats:

  • "email if valid emails (e.g., test@gmail.com)"
  • "uri if valid uri addresses (e.g., https://example.com/resource123)"
  • "binary if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)"
  • "uuid if a universal unique identifier also known as a guid (e.g., f47ac10b-58cc-4372-a567-0e02b2c3d479)"

Geopoint formats:

The two types of formats for geopoint (describing a geographic point).

  • array (if 'lat,long' (e.g., 36.63,-90.20))
  • object (if {'lat':36.63,'lon':-90.20})

Type: object

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: array

Constrains possible values to a set of values.


Examples:

[
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
+
"PROMIS"
+
"Substance use"
+
"Medical History"
+
"Sleep questions"
+
"Physical activity"
+

Type: string

The name of a variable (i.e., field) as it appears in the data.

Type: string

The human-readable title or label of the variable.


Examples:

"My Variable"
+
"Gender identity"
+

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
+
"What is the highest grade or level of school you have completed or the highest degree you have received?"
+

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Definitions:

  • number (A numeric value with optional decimal places. (e.g., 3.14))
  • integer (A whole number without decimal places. (e.g., 42))
  • string (A sequence of characters. (e.g., \"test\"))
  • any (Any type of data is allowed. (e.g., true))
  • boolean (A binary value representing true or false. (e.g., true))
  • date (A specific calendar date. (e.g., \"2023-05-25\"))
  • datetime (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\"))
  • time (A specific time of day. (e.g., \"10:30:00\"))
  • year (A specific year. (e.g., 2023))
  • yearmonth (A specific year and month. (e.g., \"2023-05\"))
  • duration (A length of time. (e.g., \"PT1H\"))
  • geopoint (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
For example: If type is "string", then see the String formats.
If type is "date", "datetime", or "time", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for Date,
Datetime,
or Time) - If you want to specify a date-like variable using standard Python/C strptime syntax, see here for details.
See here for more information about appropriate format values by variable type.

[Additional information]

Date Formats (date, datetime, time type variable):

A format for a date variable (date,time,datetime).
default: An ISO8601 format string.
any: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.

{PATTERN}: The value can be parsed according to {PATTERN},
which MUST follow the date formatting syntax of
C / Python strftime such as:

  • "%Y-%m-%d (for date, e.g., 2023-05-25)"
  • "%Y%m%d (for date without dashes, e.g., 20230525)"
  • "%Y-%m-%dT%H:%M:%S (for datetime, e.g., 2023-05-25T10:30:45)"
  • "%Y-%m-%dT%H:%M:%SZ (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)"
  • "%Y-%m-%dT%H:%M:%S%z (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)"
  • "%Y-%m-%dT%H:%M (for datetime without seconds, e.g., 2023-05-25T10:30)"
  • "%Y-%m-%dT%H (for datetime without minutes and seconds, e.g., 2023-05-25T10)"
  • "%H:%M:%S (for time, e.g., 10:30:45)"
  • "%H:%M:%SZ (for time with UTC timezone, e.g., 10:30:45Z)"
  • "%H:%M:%S%z (for time with timezone offset, e.g., 10:30:45+0300)"

String formats:

  • "email if valid emails (e.g., test@gmail.com)"
  • "uri if valid uri addresses (e.g., https://example.com/resource123)"
  • "binary if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)"
  • "uuid if a universal unique identifier also known as a guid (e.g., f47ac10b-58cc-4372-a567-0e02b2c3d479)"

Geopoint formats:

The two types of formats for geopoint (describing a geographic point).

  • array (if 'lat,long' (e.g., 36.63,-90.20))
  • object (if {'lat':36.63,'lon':-90.20})

Type: object

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: array

Constrains possible values to a set of values.


Examples:

[
     1,
     2,
     3,
     4
 ]
-
[
+
[
     "White",
     "Black or African American",
     "American Indian or Alaska Native",
@@ -23,39 +23,39 @@
     "Some other race",
     "Multiracial"
 ]
-

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note that this is different from the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: object

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).


Examples:

{
+

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note that this is different from the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: object

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).


Examples:

{
     "0": "No",
     "1": "Yes"
 }
-
{
+
{
     "HW": "Hello world",
     "GBW": "Good bye world",
     "HM": "Hi, Mike"
 }
-

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: array

A list of missing values specific to a variable.


Examples:

[
+

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: array

A list of missing values specific to a variable.


Examples:

[
     "Missing",
     "Skipped",
     "No preference"
 ]
-
[
+
[
     "Missing"
 ]
-

Type: array of string

For boolean (true) variable (as defined in type field), this field allows
a physical string representation to be cast as true (increasing
readability of the field). It can include one or more values.

Each item of this array must be:


Examples:

[
+

Type: array of string

For boolean (true) variable (as defined in type field), this field allows
a physical string representation to be cast as true (increasing
readability of the field). It can include one or more values.

Each item of this array must be:


Examples:

[
     "required",
     "Yes",
     "Checked"
 ]
-
[
+
[
     "required"
 ]
-

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

A published set of standard variables such as the NIH Common Data Elements program.

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The type of mapping linked to a published set of standard variables such as the NIH Common Data Elements program


Examples:

"cde"
-
"ontology"
-
"reference_list"
-

Type: string

A free text label of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.


Examples:

"substance use"
-
"chemical compound"
-
"promis"
-

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (e.g., NCI thesaurus, bioportal, etc.)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

A published set of standard variables such as the NIH Common Data Elements program.

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: string

The type of mapping linked to a published set of standard variables such as the NIH Common Data Elements program


Examples:

"cde"
+
"ontology"
+
"reference_list"
+

Type: string

A free text label of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.


Examples:

"substance use"
+
"chemical compound"
+
"promis"
+

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (e.g., NCI thesaurus, bioportal, etc.)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 0727a97..b538ec0 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -6,7 +6,7 @@ This schema defines the variable level metadata for one data dictionary for a gi ### `description` _(string)_ -### `data_dictionary` _(array,required)_ +### `fields` _(array,required)_ Variable level metadata individual fields integrated into the variable level metadata object within the HEAL platform metadata service. diff --git a/variable-level-metadata-schema/examples/valid/template_submission.json b/variable-level-metadata-schema/examples/valid/template_submission.json index 3aa31e5..5ae5b19 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission.json +++ b/variable-level-metadata-schema/examples/valid/template_submission.json @@ -1,7 +1,7 @@ { "title": "Example VLMD", "description": "This is an example description", - "data_dictionary": [ + "fields": [ { "module": "Enrollment", "name": "participant_id", diff --git a/variable-level-metadata-schema/examples/valid/template_submission_minimal.json b/variable-level-metadata-schema/examples/valid/template_submission_minimal.json index 21b993d..62c4f4f 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission_minimal.json +++ b/variable-level-metadata-schema/examples/valid/template_submission_minimal.json @@ -1,7 +1,7 @@ { "title": "Minimal Example VLMD", "description": "This is an minimally filled out template", - "data_dictionary": [ + "fields": [ { "name": "participant_id", "description": "Unique identifier for participant", diff --git 
a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index cc83799..ef69927 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -6,13 +6,13 @@ description: This schema defines the variable level metadata for one data dictio type: object required: - title -- data_dictionary +- fields properties: title: type: string description: type: string - data_dictionary: + fields: type: array items: $ref: "#/fields" \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 68a981f..df7e4fb 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ + "duration", "number", - "datetime", + "any", "date", + "integer", + "boolean", "string", - "any", - "year", "geopoint", + "year", "time", - "integer", - "yearmonth", - "duration", - "boolean" + "datetime", + "yearmonth" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 19d6a51..d9bbe10 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -7,7 +7,7 @@ "type": "object", "required": [ "title", - "data_dictionary" + "fields" ], "properties": { "title": { @@ -16,7 +16,7 @@ "description": { "type": "string" }, - "data_dictionary": { + "fields": { "type": "array", "items": { "$schema": "http://json-schema.org/draft-04/schema#", diff --git 
a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index fe4c050..c68fb2e 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -2,7 +2,7 @@ { "title": null, "description": null, - "data_dictionary": [ + "fields": [ { "module": null, "name": null, From f2fbcdeba8688dea364bac33938df8dca61d2aa0 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 27 Nov 2023 11:23:52 -0600 Subject: [PATCH 02/72] del `repo_link` but see issues #28 --- .../jsonschema-csvtemplate-fields.html | 4 ++-- ...onschema-jsontemplate-data-dictionary.html | 4 ++-- .../jsonschema-csvtemplate-fields.md | 4 ---- ...jsonschema-jsontemplate-data-dictionary.md | 4 ---- .../schemas/dictionary/fields.yaml | 5 ----- .../frictionless/csvtemplate/fields.json | 20 +++++++------------ .../jsonschema/csvtemplate/fields.json | 5 ----- .../schemas/jsonschema/data-dictionary.json | 5 ----- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 1 - 10 files changed, 12 insertions(+), 42 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index f972718..223030d 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -18,7 +18,7 @@
"required|Yes|Y|Checked"
 
"Checked"
 
"Required"
-

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping linked to a published set of standard variables such as the NIH Common Data Elements program


Examples:

"cde"
 
"ontology"
 
"reference_list"
@@ -28,4 +28,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 768da2c..428b8ae 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -48,7 +48,7 @@
[
     "required"
 ]
-

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

A published set of standard variables such as the NIH Common Data Elements program.

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

A published set of standard variables such as the NIH Common Data Elements program.

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping linked to a published set of standard variables such as the NIH Common Data Elements program


Examples:

"cde"
 
"ontology"
 
"reference_list"
@@ -58,4 +58,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file
diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md
index 67aed35..403739f 100644
--- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md
+++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md
@@ -337,10 +337,6 @@ a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values.
-**`repo_link`** _(string)_
- A link to the variable as it exists on the home repository, if applicable
-
-
 **`standardsMappings.url`** _(string)_ The url that links out to the published, standardized mapping.
diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md
index b538ec0..e42131d 100644
--- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md
+++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md
@@ -350,10 +350,6 @@ a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values.
-**`repo_link`** _(string)_
- A link to the variable as it exists on the home repository, if applicable
-
-
 **`standardsMappings`** _(array)_ A published set of standard variables such as the NIH Common Data Elements program.
diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 6053edd..b496399 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -298,11 +298,6 @@ properties: a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values. $ref: "#/definitions/csvArray" - repo_link: - type: string - title: Variable Repository Link - description: | - A link to the variable as it exists on the home repository, if applicable standardsMappings: title: Standards Mappings description: A published set of standard variables such as the NIH Common Data Elements program. diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index df7e4fb..7620ae5 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "duration", + "date", + "time", "number", "any", - "date", - "integer", - "boolean", + "duration", "string", + "datetime", + "yearmonth", "geopoint", + "integer", "year", - "time", - "datetime", - "yearmonth" + "boolean" ] } }, @@ -169,12 +169,6 @@ "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" } }, - { - "name": "repo_link", - "description": "A link to the variable as it exists on the home repository, if applicable\n", - "title": "Variable Repository Link", - "type": "string" - }, { "name": "standardsMappings.url", "description": "The url that links out to the published, standardized mapping.\n", diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json 
b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index ac350ad..9a44fdc 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -142,11 +142,6 @@ "type": "string", "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, - "repo_link": { - "type": "string", - "title": "Variable Repository Link", - "description": "A link to the variable as it exists on the home repository, if applicable\n" - }, "standardsMappings.url": { "title": "Standards Mapping - Url", "description": "The url that links out to the published, standardized mapping.\n", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index d9bbe10..f49e60e 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -196,11 +196,6 @@ "description": "For boolean (false) variable (as defined in type field), this field allows\na physical string representation to be cast as false (increasing\nreadability of the field) that is not a standard false value. 
It can include one or more values.\n", "type": "array" }, - "repo_link": { - "type": "string", - "title": "Variable Repository Link", - "description": "A link to the variable as it exists on the home repository, if applicable\n" - }, "standardsMappings": { "title": "Standards Mappings", "description": "A published set of standard variables such as the NIH Common Data Elements program.", diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index f4a25d6..9bad35b 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -module,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count \ No newline at end of file 
+module,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index c68fb2e..43804e7 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -24,7 +24,6 @@ {} ], "falseValues": [], - "repo_link": null, "standardsMappings": [ { "url": null, From 68ab3620db5bb46c1eb7259d89f20362a8a264cb Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 27 Nov 2023 11:33:32 -0600 Subject: [PATCH 03/72] `module` to `section` as addressed in #40 --- .../jsonschema-csvtemplate-fields.html | 14 +++++++------- ...sonschema-jsontemplate-data-dictionary.html | 14 +++++++------- .../jsonschema-csvtemplate-fields.md | 2 +- .../jsonschema-jsontemplate-data-dictionary.md | 2 +- .../schemas/dictionary/fields.yaml | 2 +- .../frictionless/csvtemplate/fields.json | 18 +++++++++--------- .../schemas/jsonschema/csvtemplate/fields.json | 2 +- .../schemas/jsonschema/data-dictionary.json | 2 +- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 2 +- 10 files changed, 30 insertions(+), 30 deletions(-) diff --git 
a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 223030d..1f29367 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,9 +1,9 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
-
"PROMIS"
-
"Substance use"
-
"Medical History"
-
"Sleep questions"
-
"Physical activity"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
+
"PROMIS"
+
"Substance use"
+
"Medical History"
+
"Sleep questions"
+
"Physical activity"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.

Type: string

The human-readable title or label of the variable.


Examples:

"My Variable"
 
"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
@@ -28,4 +28,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file
diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
index 428b8ae..55890b8 100644
--- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
+++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
@@ -1,9 +1,9 @@
- Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
-
"PROMIS"
-
"Substance use"
-
"Medical History"
-
"Sleep questions"
-
"Physical activity"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
+
"PROMIS"
+
"Substance use"
+
"Medical History"
+
"Sleep questions"
+
"Physical activity"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.

Type: string

The human-readable title or label of the variable.


Examples:

"My Variable"
 
"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
@@ -58,4 +58,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file
diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md
index 403739f..964d0b4 100644
--- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md
+++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md
@@ -14,7 +14,7 @@ metadata object within the HEAL platform metadata service. ## Properties
-**`module`** _(string)_
+**`section`** _(string)_
 The section, form, survey instrument, set of measures or other broad category used to group variables.
diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md
index e42131d..402f700 100644
--- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md
+++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md
@@ -20,7 +20,7 @@ metadata object within the HEAL platform metadata service. #### Properties for each record
-**`module`** _(string)_
+**`section`** _(string)_
 The section, form, survey instrument, set of measures or other broad category used to group variables.
diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index b496399..be591f8 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -18,7 +18,7 @@ required: - name - description properties: - module: + section: type: string title: Module description: | diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 7620ae5..8f844d6 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -4,7 +4,7 @@ "title": "HEAL Variable Level Metadata Fields", "fields": [ { - "name": "module", + "name": "section", "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables.\n", "title": "Module", "examples": [ @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "date", "time", - "number", - "any", - "duration", + "date", "string", - "datetime", + "any", + "boolean", + "year", + "integer", "yearmonth", + "number", "geopoint", - "integer", - "year", - "boolean" + "datetime", + "duration" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 9a44fdc..020c51d 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -10,7 +10,7 @@ "description" ], "properties": { - "module": { + "section": { "type": "string", "title": "Module", "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables.\n", diff --git 
a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index f49e60e..602b643 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -30,7 +30,7 @@ "description" ], "properties": { - "module": { + "section": { "type": "string", "title": "Module", "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables.\n", diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index 9bad35b..69767dc 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -module,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count \ No newline at end of file 
+section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 43804e7..9993f6e 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -4,7 +4,7 @@ "description": null, "fields": [ { - "module": null, + "section": null, "name": null, "title": null, "description": null, From 201e6ab9b3a4c67937d564965c5368e5216837ff Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 27 Nov 2023 11:38:39 -0600 Subject: [PATCH 04/72] Del additional properties to enable validation of property names --- .../jsonschema-csvtemplate-fields.html | 2 +- .../jsonschema-jsontemplate-data-dictionary.html | 2 +- .../schemas/dictionary/fields.yaml | 1 - .../schemas/frictionless/csvtemplate/fields.json | 14 +++++++------- .../schemas/jsonschema/csvtemplate/fields.json | 1 - .../schemas/jsonschema/data-dictionary.json | 1 - 6 files changed, 9 insertions(+), 12 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 1f29367..5c22359 100644 --- 
a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -28,4 +28,4 @@

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

\ No newline at end of file
diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
index 55890b8..cd473dc 100644
--- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
+++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
@@ -58,4 +58,4 @@

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

Additional Properties of any type are allowed.

Type: object
\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

\ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index be591f8..761f705 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -13,7 +13,6 @@ description: | `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) type: object -additionalProperties: true required: - name - description diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 8f844d6..84f907e 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "time", "date", - "string", + "year", + "geopoint", "any", "boolean", - "year", - "integer", "yearmonth", + "string", "number", - "geopoint", - "datetime", - "duration" + "time", + "integer", + "duration", + "datetime" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 020c51d..b6d2aa5 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -4,7 +4,6 @@ "title": "HEAL Variable Level Metadata Fields", "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"NOTE\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. 
\n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", - "additionalProperties": true, "required": [ "name", "description" diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 602b643..617da13 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -24,7 +24,6 @@ "title": "HEAL Variable Level Metadata Fields", "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"NOTE\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. 
date-like variables)\n", "type": "object", - "additionalProperties": true, "required": [ "name", "description" From 0a2e930f46af99353f206787ac844121321c2b87 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 30 Nov 2023 12:05:42 -0600 Subject: [PATCH 05/72] update section title --- .../jsonschema-csvtemplate-fields.html | 4 ++-- ...sonschema-jsontemplate-data-dictionary.html | 4 ++-- .../jsonschema-csvtemplate-fields.md | 2 +- .../jsonschema-jsontemplate-data-dictionary.md | 2 +- .../schemas/dictionary/fields.yaml | 4 ++-- .../frictionless/csvtemplate/fields.json | 18 +++++++++--------- .../schemas/jsonschema/csvtemplate/fields.json | 4 ++-- .../schemas/jsonschema/data-dictionary.json | 4 ++-- 8 files changed, 21 insertions(+), 21 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 5c22359..f0c5ac6 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,4 +1,4 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Substance use"
 
"Medical History"
@@ -28,4 +28,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index cd473dc..9096cf9 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,4 +1,4 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables.


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Substance use"
 
"Medical History"
@@ -58,4 +58,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 964d0b4..d882d75 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -16,7 +16,7 @@ metadata object within the HEAL platform metadata service. **`section`** _(string)_ The section, form, survey instrument, set of measures or other broad category used -to group variables. +to group variables. Previously called "module." Examples: diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 402f700..b9ac8b9 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -22,7 +22,7 @@ metadata object within the HEAL platform metadata service. **`section`** _(string)_ The section, form, survey instrument, set of measures or other broad category used -to group variables. +to group variables. Previously called "module." Examples: diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 761f705..46b32d7 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -19,10 +19,10 @@ required: properties: section: type: string - title: Module + title: Section description: | The section, form, survey instrument, set of measures or other broad category used - to group variables. + to group variables. Previously called "module." 
examples: - Demographics diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 84f907e..504ff84 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -5,8 +5,8 @@ "fields": [ { "name": "section", - "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables.\n", - "title": "Module", + "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables. Previously called \"module.\"\n", + "title": "Section", "examples": [ "Demographics", "PROMIS", @@ -56,17 +56,17 @@ "type": "string", "constraints": { "enum": [ - "date", - "year", - "geopoint", - "any", - "boolean", "yearmonth", + "any", + "date", "string", - "number", - "time", "integer", + "geopoint", "duration", + "time", + "year", + "boolean", + "number", "datetime" ] } diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index b6d2aa5..0091256 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -11,8 +11,8 @@ "properties": { "section": { "type": "string", - "title": "Module", - "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables.\n", + "title": "Section", + "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables. 
Previously called \"module.\"\n", "examples": [ "Demographics", "PROMIS", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 617da13..130ede8 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -31,8 +31,8 @@ "properties": { "section": { "type": "string", - "title": "Module", - "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables.\n", + "title": "Section", + "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables. Previously called \"module.\"\n", "examples": [ "Demographics", "PROMIS", From b158581338ac5e1ece43653e9db6e1769cacfb82 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 1 Dec 2023 13:50:09 -0600 Subject: [PATCH 06/72] updated examples with section prop --- .../examples/valid/template_submission.csv | 2 +- .../examples/valid/template_submission.json | 14 +++++++------- .../examples/valid/template_submission_minimal.csv | 2 +- 3 files changed, 9 insertions(+), 9 deletions(-) diff --git a/variable-level-metadata-schema/examples/valid/template_submission.csv b/variable-level-metadata-schema/examples/valid/template_submission.csv index 3e27439..6aa2ee5 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission.csv +++ b/variable-level-metadata-schema/examples/valid/template_submission.csv @@ -1,4 +1,4 @@ 
-module,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.type,standardsMappings.label,standardsMappings.url,standardsMappings.source,standardsMappings.id,relatedConcepts.type,relatedConcepts.label,relatedConcepts.url,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count +section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.type,standardsMappings.label,standardsMappings.url,standardsMappings.source,standardsMappings.id,relatedConcepts.type,relatedConcepts.label,relatedConcepts.url,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count Enrollment,participant_id,Participant Id,Unique identifier for participant,string,,,,[A-Z][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9],,,,,,,,,,,,,,,,,,,,,,,,,,,,, Demographics,race,Race,Self-reported race,integer,,,1|2|3|4|5|6|7|8,,,,1=White|2=Black or African American|3=American Indian or Alaska Native|4=Native| 5=Hawaiian or Other Pacific Islander|6=Asian|7=Some other race|8=Multiracial|99=Not reported,,99,,,,cde|cde,NLM race,,NLM|NLM,Fakc6Jy2x|m1_atF7L7U,,,,,,,,,,,,,,,, Demographics,age,Age,What is your age? 
(age at enrollment),integer,,,,,90,0,,,,,,,,,,,,,,,,,,,,,,,,,,, diff --git a/variable-level-metadata-schema/examples/valid/template_submission.json b/variable-level-metadata-schema/examples/valid/template_submission.json index 5ae5b19..380428f 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission.json +++ b/variable-level-metadata-schema/examples/valid/template_submission.json @@ -3,7 +3,7 @@ "description": "This is an example description", "fields": [ { - "module": "Enrollment", + "section": "Enrollment", "name": "participant_id", "title": "Participant Id", "description": "Unique identifier for participant", @@ -13,7 +13,7 @@ } }, { - "module": "Demographics", + "section": "Demographics", "name": "race", "title": "Race", "description": "Self-reported race", @@ -59,7 +59,7 @@ ] }, { - "module": "Demographics", + "section": "Demographics", "name": "age", "title": "Age", "description": "What is your age? (age at enrollment)", @@ -70,7 +70,7 @@ } }, { - "module": "Demographics", + "section": "Demographics", "name": "hispanic", "title": "Hispanic, Latino, or Spanish Origin", "description": "Are you of Hispanic, Latino, or Spanish origin?", @@ -86,7 +86,7 @@ ] }, { - "module": "Demographics", + "section": "Demographics", "name": "sex_at_birth", "title": "Sex at Birth", "description": "The self-reported sex of the participant/subject at birth", @@ -107,7 +107,7 @@ ] }, { - "module": "Substance Use", + "section": "Substance Use", "name": "SU4", "title": "Heroin Days Used", "description": "During the past 30 days how many days did you use heroin (alone or mixed with other drugs)? 
] [Write 0 days if no use]", @@ -128,7 +128,7 @@ ] }, { - "module": "Biomeasures", + "section": "Biomeasures", "name": "pulse_rate", "title": "Pulse Rate", "description": "Heart rate measured at systemic artery", diff --git a/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv b/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv index 6815bd3..e7fc476 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv +++ b/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv @@ -1,4 +1,4 @@ -module,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.type,standardsMappings.label,standardsMappings.url,standardsMappings.source,standardsMappings.id,relatedConcepts.type,relatedConcepts.label,relatedConcepts.url,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count +section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.type,standardsMappings.label,standardsMappings.url,standardsMappings.source,standardsMappings.id,relatedConcepts.type,relatedConcepts.label,relatedConcepts.url,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count 
,participant_id,,Unique identifier for participant,string,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,race,,Self-reported race,integer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,age,,What is your age? (age at enrollment),integer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, From 533a8ca600eb7a08cd9578f1b12b68801ac9e968 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 1 Dec 2023 13:52:47 -0600 Subject: [PATCH 07/72] add gitignore --- .gitignore | 9 +++++++++ 1 file changed, 9 insertions(+) create mode 100644 .gitignore diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..27dfd90 --- /dev/null +++ b/.gitignore @@ -0,0 +1,9 @@ + +#hidden libs and cache dirs +.vscode +.pytest_cache +*/pytest_cache/ +*__pycache__ + +# word docs +*.docx \ No newline at end of file From b3a6e7feb419b9d006a0a648b3975361ec61ac32 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 1 Dec 2023 14:17:08 -0600 Subject: [PATCH 08/72] del `univarStats` (#49) --- .../jsonschema-csvtemplate-fields.html | 2 +- ...onschema-jsontemplate-data-dictionary.html | 4 +- .../jsonschema-csvtemplate-fields.md | 34 ---------- ...jsonschema-jsontemplate-data-dictionary.md | 47 +------------- .../schemas/dictionary/fields.yaml | 38 +---------- .../frictionless/csvtemplate/fields.json | 64 +++---------------- .../jsonschema/csvtemplate/fields.json | 36 +---------- .../schemas/jsonschema/data-dictionary.json | 53 ++------------- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 20 +----- 10 files changed, 24 insertions(+), 276 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index f0c5ac6..c80dfb9 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -28,4 +28,4 @@

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: number

Type: number

Type: number

Type: number

Type: number

Type: number

Type: integer

Value must be greater or equal to 0

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 9096cf9..7684eb2 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,4 +1,4 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Substance use"
 
"Medical History"
@@ -58,4 +58,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: object

Univariate statistics inferred from the data about the given variable

Type: integer

Value must be greater or equal to 0

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index d882d75..b7d39f4 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -439,37 +439,3 @@ Examples: **`relatedConcepts.id`** _(string)_ The id locating the individual mapping within the given source. - - -**`univarStats.median`** _(number)_ - - -**`univarStats.mean`** _(number)_ - - -**`univarStats.std`** _(number)_ - - -**`univarStats.min`** _(number)_ - - -**`univarStats.max`** _(number)_ - - -**`univarStats.mode`** _(number)_ - - -**`univarStats.count`** _(integer)_ - - -**`univarStats.twentyFifthPercentile`** _(number)_ - - -**`univarStats.seventyFifthPercentile`** _(number)_ - - -**`univarStats.categoricalMarginals.name`** _(string)_ - - -**`univarStats.categoricalMarginals.count`** _(integer)_ - diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index b9ac8b9..133509f 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -6,6 +6,8 @@ This schema defines the variable level metadata for one data dictionary for a gi ### `description` _(string)_ +### `version` _(string)_ + ### `fields` _(array,required)_ Variable level metadata individual fields integrated into the variable level @@ -355,48 +357,3 @@ readability of the field) that is not a standard false value. 
It can include one **`relatedConcepts`** _(array)_ Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc) - -**`univarStats`** _(object)_ - Univariate statistics inferred from the data about the given variable - - - -- **`median`** _(number)_ - - - -- **`mean`** _(number)_ - - - -- **`std`** _(number)_ - - - -- **`min`** _(number)_ - - - -- **`max`** _(number)_ - - - -- **`mode`** _(number)_ - - - -- **`count`** _(integer)_ - - - -- **`twentyFifthPercentile`** _(number)_ - - - -- **`seventyFifthPercentile`** _(number)_ - - - -- **`categoricalMarginals`** _(array)_ - - diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 46b32d7..746d546 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -348,8 +348,6 @@ properties: title: Related Concepts description: Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc) - - type: array items: type: object @@ -386,38 +384,4 @@ properties: title: Related Concepts - Id type: string description: | - The id locating the individual mapping within the given source. - univarStats: - type: object - description: | - Univariate statistics inferred from the data about the given variable - - properties: - median: - type: number - mean: - type: number - std: - type: number - min: - type: number - max: - type: number - mode: - type: number - count: - type: integer - minimum: 0 - twentyFifthPercentile: - type: number - seventyFifthPercentile: - type: number - categoricalMarginals: - type: array - items: - type: object - properties: - name: - type: string - count: - type: integer + The id locating the individual mapping within the given source. 
\ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 504ff84..2ed2d4e 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "yearmonth", - "any", - "date", - "string", - "integer", "geopoint", - "duration", - "time", - "year", + "any", "boolean", "number", - "datetime" + "year", + "string", + "date", + "time", + "integer", + "datetime", + "yearmonth", + "duration" ] } }, @@ -247,53 +247,9 @@ }, { "name": "relatedConcepts.id", - "description": "The id locating the individual mapping within the given source.\n", + "description": "The id locating the individual mapping within the given source.", "title": "Related Concepts - Id", "type": "string" - }, - { - "name": "univarStats.median", - "type": "number" - }, - { - "name": "univarStats.mean", - "type": "number" - }, - { - "name": "univarStats.std", - "type": "number" - }, - { - "name": "univarStats.min", - "type": "number" - }, - { - "name": "univarStats.max", - "type": "number" - }, - { - "name": "univarStats.mode", - "type": "number" - }, - { - "name": "univarStats.count", - "type": "integer" - }, - { - "name": "univarStats.twentyFifthPercentile", - "type": "number" - }, - { - "name": "univarStats.seventyFifthPercentile", - "type": "number" - }, - { - "name": "univarStats.categoricalMarginals.name", - "type": "string" - }, - { - "name": "univarStats.categoricalMarginals.count", - "type": "integer" } ], "missingValues": [ diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 0091256..44d89fd 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json 
+++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -213,41 +213,7 @@ "relatedConcepts.id": { "title": "Related Concepts - Id", "type": "string", - "description": "The id locating the individual mapping within the given source.\n" - }, - "univarStats.median": { - "type": "number" - }, - "univarStats.mean": { - "type": "number" - }, - "univarStats.std": { - "type": "number" - }, - "univarStats.min": { - "type": "number" - }, - "univarStats.max": { - "type": "number" - }, - "univarStats.mode": { - "type": "number" - }, - "univarStats.count": { - "type": "integer", - "minimum": 0 - }, - "univarStats.twentyFifthPercentile": { - "type": "number" - }, - "univarStats.seventyFifthPercentile": { - "type": "number" - }, - "univarStats.categoricalMarginals.name": { - "type": "string" - }, - "univarStats.categoricalMarginals.count": { - "type": "integer" + "description": "The id locating the individual mapping within the given source." } } } \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 130ede8..afd1dc3 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -16,6 +16,9 @@ "description": { "type": "string" }, + "version": { + "type": "string" + }, "fields": { "type": "array", "items": { @@ -284,55 +287,7 @@ "id": { "title": "Related Concepts - Id", "type": "string", - "description": "The id locating the individual mapping within the given source.\n" - } - } - } - }, - "univarStats": { - "type": "object", - "description": "Univariate statistics inferred from the data about the given variable \n", - "properties": { - "median": { - "type": "number" - }, - "mean": { - "type": "number" - }, - "std": { - "type": "number" - }, - "min": { - "type": "number" - }, - "max": { - "type": "number" - }, - "mode": 
{ - "type": "number" - }, - "count": { - "type": "integer", - "minimum": 0 - }, - "twentyFifthPercentile": { - "type": "number" - }, - "seventyFifthPercentile": { - "type": "number" - }, - "categoricalMarginals": { - "type": "array", - "items": { - "type": "object", - "properties": { - "name": { - "type": "string" - }, - "count": { - "type": "integer" - } - } + "description": "The id locating the individual mapping within the given source." } } } diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index 69767dc..fe14ab4 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count \ No newline at end of file +section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id \ No newline at end of file diff --git 
a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 9993f6e..bead674 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -2,6 +2,7 @@ { "title": null, "description": null, + "version": null, "fields": [ { "section": null, @@ -41,24 +42,7 @@ "source": null, "id": null } - ], - "univarStats": { - "median": null, - "mean": null, - "std": null, - "min": null, - "max": null, - "mode": null, - "count": null, - "twentyFifthPercentile": null, - "seventyFifthPercentile": null, - "categoricalMarginals": [ - { - "name": null, - "count": null - } - ] - } + ] } ] } From 065b63e92806c884abc401135692b9bbcdf28bc5 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 1 Dec 2023 14:18:01 -0600 Subject: [PATCH 09/72] Add `version` prop in vlmd root json schema (#47) --- .../schemas/dictionary/data-dictionary.yaml | 2 ++ 1 file changed, 2 insertions(+) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index ef69927..fcdb14a 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -12,6 +12,8 @@ properties: type: string description: type: string + version: # TODO: think about having a version text/message and id (akin to a git commit) + type: string fields: type: array items: From d0a5af47a5c9170c00cc44f2aedd1f0217bdd496 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 4 Dec 2023 09:54:21 -0600 Subject: [PATCH 10/72] encodings to enumLabels and ordered to enumOrdered #44 --- .../jsonschema-csvtemplate-fields.html | 8 ++++---- ...onschema-jsontemplate-data-dictionary.html | 8 ++++---- .../jsonschema-csvtemplate-fields.md | 4 ++-- ...jsonschema-jsontemplate-data-dictionary.md | 4 
++-- .../schemas/dictionary/fields.yaml | 6 +++--- .../frictionless/csvtemplate/fields.json | 20 +++++++++---------- .../jsonschema/csvtemplate/fields.json | 4 ++-- .../schemas/jsonschema/data-dictionary.json | 4 ++-- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 4 ++-- 10 files changed, 32 insertions(+), 32 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index c80dfb9..2b29a1b 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -10,9 +10,9 @@
"What is the highest grade or level of school you have completed or the highest degree you have received?"
 

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Definitions:

  • number (A numeric value with optional decimal places. (e.g., 3.14))
  • integer (A whole number without decimal places. (e.g., 42))
  • string (A sequence of characters. (e.g., "test"))
  • any (Any type of data is allowed. (e.g., true))
  • boolean (A binary value representing true or false. (e.g., true))
  • date (A specific calendar date. (e.g., "2023-05-25"))
  • datetime (A specific date and time, including timezone information. (e.g., "2023-05-25T10:30:00Z"))
  • time (A specific time of day. (e.g., "10:30:00"))
  • year (A specific year. (e.g., 2023))
  • yearmonth (A specific year and month. (e.g., "2023-05"))
  • duration (A length of time. (e.g., "PT1H"))
  • geopoint (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
For example: If type is "string", then see the String formats.
If type is "date", "datetime", or "time", the default format is ISO8601 formatting for those respective types (see details on ISO8601 format for Date, Datetime, or Time).
If you want to specify a date-like variable using standard Python/C strptime syntax, see here for details.
See here for more information about appropriate format values by variable type.

[Additional information]

Date Formats (date, datetime, time type variable):

A format for a date variable (date,time,datetime).
default: An ISO8601 format string.
any: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.

{PATTERN}: The value can be parsed according to {PATTERN},
which MUST follow the date formatting syntax of
C / Python strftime such as:

  • "%Y-%m-%d (for date, e.g., 2023-05-25)"
  • "%Y%m%d (for date without dashes, e.g., 20230525)"
  • "%Y-%m-%dT%H:%M:%S (for datetime, e.g., 2023-05-25T10:30:45)"
  • "%Y-%m-%dT%H:%M:%SZ (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)"
  • "%Y-%m-%dT%H:%M:%S%z (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)"
  • "%Y-%m-%dT%H:%M (for datetime without seconds, e.g., 2023-05-25T10:30)"
  • "%Y-%m-%dT%H (for datetime without minutes and seconds, e.g., 2023-05-25T10)"
  • "%H:%M:%S (for time, e.g., 10:30:45)"
  • "%H:%M:%SZ (for time with UTC timezone, e.g., 10:30:45Z)"
  • "%H:%M:%S%z (for time with timezone offset, e.g., 10:30:45+0300)"
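
A minimal sketch of how a {PATTERN} format could be checked, assuming Python's standard `datetime.strptime` as the parser. The helper name `matches_pattern` and the sample values are hypothetical and are not part of the schema; the dash-free date pattern is written here as `%Y%m%d`.

```python
from datetime import datetime

# Hypothetical sample values paired with strptime-style patterns.
SAMPLES = [
    ("2023-05-25", "%Y-%m-%d"),
    ("20230525", "%Y%m%d"),
    ("2023-05-25T10:30:45", "%Y-%m-%dT%H:%M:%S"),
    ("2023-05-25T10:30:45+0300", "%Y-%m-%dT%H:%M:%S%z"),
    ("10:30:45", "%H:%M:%S"),
]


def matches_pattern(value: str, pattern: str) -> bool:
    """Return True if `value` parses under the strptime `pattern`."""
    try:
        datetime.strptime(value, pattern)
    except ValueError:
        return False
    return True


for value, pattern in SAMPLES:
    print(f"{value!r} with {pattern!r}: {matches_pattern(value, pattern)}")
```

A validating library might apply the same check to every value in a column and report the values that fail, rather than stopping at the first mismatch.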

String formats:

  • "email if valid emails (e.g., test@gmail.com)"
  • "uri if valid uri addresses (e.g., https://example.com/resource123)"
  • "binary if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)"
  • "uuid if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)"

Geopoint formats:

The two types of formats for geopoint (describing a geographic point).

  • array (if 'lat,long' (e.g., 36.63,-90.20))
  • object (if {'lat':36.63,'lon':-90.20})
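
The two geopoint formats could be normalized to a single (lat, lon) pair along these lines. This is only a sketch: the function name `parse_geopoint` is hypothetical, and the 'lat,long' ordering for the array format is taken from the description above rather than from any other specification.

```python
def parse_geopoint(value, fmt):
    """Return (lat, lon) as floats for the two geopoint formats described above."""
    if fmt == "array":
        # 'lat,long' text such as "36.63,-90.20" (ordering assumed from this document).
        lat_text, lon_text = value.split(",")
        return float(lat_text), float(lon_text)
    if fmt == "object":
        # Mapping such as {'lat': 36.63, 'lon': -90.20}.
        return float(value["lat"]), float(value["lon"])
    raise ValueError(f"unsupported geopoint format: {fmt!r}")


print(parse_geopoint("36.63,-90.20", "array"))
print(parse_geopoint({"lat": 36.63, "lon": -90.20}, "object"))
```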

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.
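
As a small illustration of the 'Hello World' example, maxLength for a categorical variable could be derived from its observed values. The list `values` below is made up for the illustration.

```python
# Hypothetical observed values of a categorical variable.
values = ["Hello World", "Hi", "Goodbye"]

# maxLength is the length of the longest value.
max_length = max(len(v) for v in values)
print(max_length)  # 11, the length of 'Hello World'
```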

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5|6|7|8"
 
"White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial"
-

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer etc). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: string

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
Examples:

"0=No|1=Yes"
-
"HW=Hello world|GBW=Good bye world|HM=Hi,Mike"
-

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: string

A list of missing values specific to a variable.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Missing|Skipped|No preference"
+

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer etc). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: string

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
Examples:

"0=No|1=Yes"
+
"HW=Hello world|GBW=Good bye world|HM=Hi,Mike"
+

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: string

A list of missing values specific to a variable.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Missing|Skipped|No preference"
 
"Missing"
 

Type: string

For boolean (true) variable (as defined in type field), this field allows
a physical string representation to be cast as true (increasing
readability of the field). It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Required|REQUIRED"
 
"required|Yes|Y|Checked"
@@ -28,4 +28,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 7684eb2..76f1263 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -23,16 +23,16 @@ "Some other race", "Multiracial" ] -

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer etc). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: object

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).


Examples:

{
+

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer etc). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: object

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).


Examples:

{
     "0": "No",
     "1": "Yes"
 }
-
{
+
{
     "HW": "Hello world",
     "GBW": "Good bye world",
     "HM": "Hi, Mike"
 }
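
The encodings appear in two shapes in this patch series: an object in the JSON template (as in the examples above) and a pipe-delimited `value=label` string in the CSV template (e.g., "0=No|1=Yes"). A rough sketch of converting the string form into the object form and applying it to data follows; the function name `parse_enum_labels` and the sample column `raw` are hypothetical.

```python
def parse_enum_labels(text: str) -> dict:
    """Convert a 'value=label|value=label' string into a {value: label} mapping."""
    labels = {}
    for pair in text.split("|"):
        if not pair:
            continue  # tolerate a trailing '|'
        # Split on the first '=' only, so labels may themselves contain '='.
        value, _, label = pair.partition("=")
        labels[value] = label
    return labels


labels = parse_enum_labels("0=No|1=Yes")
print(labels)

# Apply the labels to a hypothetical column of stored values.
raw = ["1", "0", "1"]
print([labels.get(v, v) for v in raw])
```

Values without an entry fall back to the stored value, which keeps unlabeled codes visible instead of silently dropping them.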
-

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: array

A list of missing values specific to a variable.


Examples:

[
+

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: array

A list of missing values specific to a variable.


Examples:

[
     "Missing",
     "Skipped",
     "No preference"
@@ -58,4 +58,4 @@
 

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
 

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index b7d39f4..aa1a08a 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -254,7 +254,7 @@ maxLength property. Specifies the minimum value of a field. -**`encodings`** _(string)_ +**`enumLabels`** _(string)_ Variable value encodings provide a way to further annotate any value within a any variable type, making values easier to understand. @@ -280,7 +280,7 @@ Examples: ``` -**`ordered`** _(boolean)_ +**`enumOrdered`** _(boolean)_ Indicates whether a categorical variable is ordered. This variable is relevant for variables that have an ordered relationship but not necessarily a numerical relationship (e.g., Strongly disagree < Disagree diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 133509f..1f01623 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -273,7 +273,7 @@ The two types of formats for `geopoint` (describing a geographic point). -**`encodings`** _(object)_ +**`enumLabels`** _(object)_ Variable value encodings provide a way to further annotate any value within a any variable type, making values easier to understand. @@ -301,7 +301,7 @@ Examples: ``` -**`ordered`** _(boolean)_ +**`enumOrdered`** _(boolean)_ Indicates whether a categorical variable is ordered. 
This variable is relevant for variables that have an ordered relationship but not necessarily a numerical relationship (e.g., Strongly disagree < Disagree diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 746d546..624a3ba 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -189,7 +189,7 @@ properties: description: | Specifies the minimum value of a field. - encodingsJsonSpec: + enumLabelsJsonSpec: title: 'Variable Value Encodings (i.e., mappings; value labels)' description: | Variable value encodings provide a way to further annotate any value within a any variable type, @@ -208,7 +208,7 @@ properties: examples: - {"0":"No","1":"Yes"} - {"HW":"Hello world","GBW":"Good bye world","HM":"Hi, Mike"} - encodingsCsvSpec: + enumLabelsCsvSpec: title: 'Variable Value Encodings (i.e., mappings; value labels)' description: | Variable value encodings provide a way to further annotate any value within a any variable type, @@ -227,7 +227,7 @@ properties: examples: - '0=No|1=Yes' - 'HW=Hello world|GBW=Good bye world|HM=Hi,Mike' - ordered: + enumOrdered: title: An ordered variable description: | Indicates whether a categorical variable is ordered. 
This variable is diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 2ed2d4e..4d1b979 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "geopoint", - "any", - "boolean", - "number", "year", - "string", - "date", + "yearmonth", + "boolean", "time", + "date", "integer", + "string", + "geopoint", "datetime", - "yearmonth", - "duration" + "number", + "duration", + "any" ] } }, @@ -115,7 +115,7 @@ "type": "integer" }, { - "name": "encodings", + "name": "enumLabels", "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n", "title": "Variable Value Encodings (i.e., mappings; value labels)", "examples": [ @@ -128,7 +128,7 @@ } }, { - "name": "ordered", + "name": "enumOrdered", "description": "Indicates whether a categorical variable is ordered. 
This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n", "title": "An ordered variable", "type": "boolean" diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 44d89fd..bb7ae5f 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -99,7 +99,7 @@ "title": "Minimum Value", "description": "Specifies the minimum value of a field.\n" }, - "encodings": { + "enumLabels": { "title": "Variable Value Encodings (i.e., mappings; value labels)", "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n", "type": "string", @@ -109,7 +109,7 @@ "HW=Hello world|GBW=Good bye world|HM=Hi,Mike" ] }, - "ordered": { + "enumOrdered": { "title": "An ordered variable", "description": "Indicates whether a categorical variable is ordered. 
This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n", "type": "boolean" diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index afd1dc3..63aa23d 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -139,7 +139,7 @@ } } }, - "encodings": { + "enumLabels": { "title": "Variable Value Encodings (i.e., mappings; value labels)", "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n", "type": "object", @@ -155,7 +155,7 @@ } ] }, - "ordered": { + "enumOrdered": { "title": "An ordered variable", "description": "Indicates whether a categorical variable is ordered. 
This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n", "type": "boolean" diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index fe14ab4..edfb3f7 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id \ No newline at end of file +section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index bead674..bb64186 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -18,8 +18,8 @@ "maximum": null, "minimum": null }, - "encodings": {}, - "ordered": null, + "enumLabels": {}, + "enumOrdered": null, "missingValues": [], "trueValues": [ {} From 3d084a320d8c6fc597e8db99406dbb0dc50dc912 Mon Sep 17 00:00:00 2001 From: 
Michael Kranz Date: Mon, 4 Dec 2023 11:41:03 -0600 Subject: [PATCH 11/72] Added field standard mapping object with examples from issue #39 --- .../schemas/dictionary/definitions.yaml | 149 ++++++++++++++++++ .../schemas/dictionary/fields.yaml | 54 +------ 2 files changed, 153 insertions(+), 50 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 8ac0d07..7d42f21 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -67,3 +67,152 @@ geopointFormat: - `object` (if {'lat':36.63,'lon':-90.20}) enum: [array,object] + +fieldStandardsMappingObject: + description: | + + A set of instrument and item references to standardized data elements designed to document + the [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements) + and other standardized/common element sources to facilitate cross-study comparison and interoperability + of data. One can either map an individual data element or an instrument in which the field is + a part of. + + __**All Fields Mapped (Both Instrument and Item)**__ + + ```json + "standardsMappings": [ + { + "instrument": { + "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx", + "source": "heal-cde", + "title": "adult-demographics", + "id": + }, + "item": { + "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE", + "source": "CDISC", + "id": "C74457" + } + } + ] + ``` + + __**Only Instrument Title of Form CDE File Mapped**__ + + In this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given. 
+ + ```json + "standardsMappings": [ + { + "instrument": { + "source": "heal-cde", + "title": "adult-demographics" + } + } + ] + ``` + + __**Only Instrument ID of HEAL CDE Mapped**__ + + ```json + "standardsMappings": [ + { + "instrument": { + "source": "heal-cde", + "id": + } + } + ] + ``` + + __**Other Non-HEAL CDE Use Cases**__ + + Only item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the "Identifier" section. Similar to the above, they could also just enter the "url". + + ```json + "standardsMappings": [ + { + "item": { + "source": "NLM", + "id": "Fakc6Jy2x" + } + } + ] + ``` + + __**Multiple CDE Mappings**__ + + Two separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list. + + ```json + "standardsMappings": [ + { + "instrument": { + "source": "heal-cde", + "title": "adult-demographics" + }, + "item": { + "source": "CDISC", + "id": "C74457" + }, + }, + { + "item": { + "source": "NLM", + "id": "Fakc6Jy2x" + } + } + ] + ``` + + instrument: + type: object + title: Standard mapping - instrument + description: | + A standardized set of items which encompass + a variable in this variable level metadata document (if at the root level or the document level) + or the individual variable (if at the field level). + + + !!! note "NOTE" + + If information is present at both the root and the field level, + then the information at the field level would take precedence (i.e., it would cascade). + + properties: + url: + type: string + format: uri + source: + type: string + enum: ["heal-cde"] + title: + type: string + id: + type: string + + item: + type: object + title: Standard mapping - item + description: | + A standardized item (ie field, variable etc) mapped to this individual variable. 
+ properties: + url: + title: Standards Mapping - Url + description: | + The url that links out to the published, standardized mapping. + type: string + format: uri + examples: + - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI + source: + title: Standard Mapping - Source + description: | + The source of the standardized variable. + type: string + id: + title: Standard Mapping - Id + type: string + description: | + The id locating the individual mapping within the given source. + diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 624a3ba..dfdb818 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -298,56 +298,12 @@ properties: readability of the field) that is not a standard false value. It can include one or more values. $ref: "#/definitions/csvArray" standardsMappings: - title: Standards Mappings - description: A published set of standard variables such as the NIH Common Data Elements program. - type: array - items: - type: object - properties: - url: - title: Standards Mapping - Url - description: | - The url that links out to the published, standardized mapping. - type: string - format: uri - examples: - - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI - type: - title: Standards Mapping - Title - description: | - The **type** of mapping linked to a published set of standard variables such as the NIH Common Data Elements program - - examples: - - cde - - ontology - - reference_list - type: string - label: - title: Standards Mapping - Label - description: | - A free text **label** of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program. 
- - type: string - examples: - - substance use - - chemical compound - - promis - source: - title: Standard Mapping - Source - description: | - The source of the standardized variable. - type: string - examples: - - TBD (will have controlled vocabulary) - id: - title: Standard Mapping - Id - type: string - description: | - The id locating the individual mapping within the given source. + $ref: "#/definitions/fieldStandardsMappingObject" relatedConcepts: title: Related Concepts - description: Mappings to a published set of concepts related to the given field such as - ontological information (eg., NCI thesaurus, bioportal etc) + description: | + __**[Under development]**__ Mappings to a published set of concepts related to the given field such as + ontological information (eg., NCI thesaurus, bioportal etc) type: array items: type: object @@ -358,8 +314,6 @@ properties: The url that links out to the published, standardized concept. type: string format: uri - examples: - - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI type: title: Related concepts - Type description: | From ebdc8d3b9027c24a701e4c70c84b521269f7766a Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 4 Dec 2023 14:15:26 -0600 Subject: [PATCH 12/72] Added proposed standardsMappings for root and fields with examples from issue #39 --- .../jsonschema-csvtemplate-fields.html | 14 +- ...onschema-jsontemplate-data-dictionary.html | 73 ++++++++-- .../jsonschema-csvtemplate-fields.md | 70 +++------- ...jsonschema-jsontemplate-data-dictionary.md | 8 +- .../schemas/dictionary/data-dictionary.yaml | 4 + .../schemas/dictionary/definitions.yaml | 128 ++++++++++------- .../schemas/dictionary/fields.yaml | 4 +- .../frictionless/csvtemplate/fields.json | 65 ++++----- .../jsonschema/csvtemplate/fields.json | 50 +++---- .../schemas/jsonschema/data-dictionary.json | 129 +++++++++++------- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 26 +++- 12 files changed, 323 
insertions(+), 250 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 2b29a1b..725d841 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -18,14 +18,6 @@
"required|Yes|Y|Checked"
 
"Checked"
 
"Required"
-

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The type of mapping linked to a published set of standard variables such as the NIH Common Data Elements program


Examples:

"cde"
-
"ontology"
-
"reference_list"
-

Type: string

A free text label of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.


Examples:

"substance use"
-
"chemical compound"
-
"promis"
-

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: stringFormat: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
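
The pipe-delimited convention described above can be sketched in Python as follows. This is only an illustration, not part of the schema tooling: the function name `split_pipe_delimited` and the sample value are hypothetical, and the pattern is copied verbatim from the rendered schema.

```python
import re
from typing import List

# Pattern copied verbatim from the csvtemplate schema for pipe-delimited values.
PIPE_DELIMITED = re.compile(r"^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$")


def split_pipe_delimited(raw: str) -> List[str]:
    """Split a pipe-delimited cell into its individual values (hypothetical helper)."""
    if PIPE_DELIMITED.match(raw) is None:
        raise ValueError(f"not a pipe-delimited value: {raw!r}")
    # Values are returned exactly as written; no trimming is applied here.
    return raw.split("|")


# Hypothetical cell listing strings that should be cast as false.
false_values = split_pipe_delimited("No|N|Unchecked")
```

A consumer could then compare each raw cell against `false_values` when casting a boolean column.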

Type: stringFormat: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: stringFormat: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 76f1263..6721417 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,4 +1,4 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only instrument can be mapped to this property, unlike the fields standardsMappings.
This property otherwise has the same specification as the fields standardsMappings, which makes the cascading logic
easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).
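
The precedence rule in the note above can be sketched in Python. This is a minimal illustration under one assumption that the schema text does not settle: a field-level `standardsMappings` list replaces the root-level list wholesale rather than being merged with it. The helper name `effective_standards_mappings` and the field names `race` and `age` are hypothetical.

```python
from typing import Any, Dict, List


def effective_standards_mappings(
    dictionary: Dict[str, Any], field: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Resolve the mappings that apply to one field (hypothetical helper).

    Assumption: a non-empty field-level list fully overrides the root-level
    list; otherwise the root-level list (if any) cascades down to the field.
    """
    field_mappings = field.get("standardsMappings")
    if field_mappings:
        return field_mappings
    return dictionary.get("standardsMappings", [])


# Hypothetical data dictionary: one root-level instrument, two fields.
dictionary = {
    "standardsMappings": [
        {"instrument": {"source": "heal-cde", "title": "adult-demographics"}}
    ],
    "fields": [
        {
            "name": "race",
            "description": "Self-reported race",
            "standardsMappings": [{"item": {"source": "CDISC", "id": "C74457"}}],
        },
        {"name": "age", "description": "Age in years"},
    ],
}

resolved = {
    field["name"]: effective_standards_mappings(dictionary, field)
    for field in dictionary["fields"]
}
```

Here `race` keeps its own item mapping, while `age` inherits the root-level instrument.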

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Substance use"
 
"Medical History"
@@ -48,14 +48,63 @@
 
[
     "required"
 ]
-

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

A published set of standard variables such as the NIH Common Data Elements program.

Each item of this array must be:

Type: object

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The type of mapping linked to a published set of standard variables such as the NIH Common Data Elements program


Examples:

"cde"
-
"ontology"
-
"reference_list"
-

Type: string

A free text label of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.


Examples:

"substance use"
-
"chemical compound"
-
"promis"
-

Type: string

The source of the standardized variable.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

Type: array of object

Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: stringFormat: uri

The url that links out to the published, standardized concept.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

Each item of this array must be:

Type: object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or the instrument of which the field is
a part.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
+    {
+        "instrument": {
+            "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
+            "source": "heal-cde",
+            "title": "adult-demographics",
+            "id": <drupal id here>
+        },
+        "item": {
+            "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE",
+            "source": "CDISC",
+            "id": "C74457"
+        }
+    }
+]
+

*Only Instrument Title of Form CDE File Mapped*

In this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.

"standardsMappings": [
+    {
+        "instrument": {
+            "source": "heal-cde",
+            "title": "adult-demographics"
+        }
+    }
+]
+

*Only Instrument ID of HEAL CDE Mapped*

"standardsMappings": [
+    {
+        "instrument": {
+            "source": "heal-cde",
+            "id": <drupal id here>
+        }
+    }
+]
+

*Other Non-HEAL CDE Use Cases*

Only item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the "Identifier" section. Similar to the above, they could also just enter the "url".

"standardsMappings": [
+    {
+        "item": {
+            "source": "NLM",
+            "id": "Fakc6Jy2x"
+        }
+    }
+]
+

*Multiple CDE Mappings*

Two separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.

"standardsMappings": [
+    {
+        "instrument": {
+            "source": "heal-cde",
+            "title": "adult-demographics"
+        },
+        "item": {
+            "source": "CDISC",
+            "id": "C74457"
+        }
+    },
+    {
+        "item": {
+            "source": "NLM",
+            "id": "Fakc6Jy2x"
+        }
+    }
+]
+
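
The record shapes shown in the examples above can be checked with a few lines of Python. This is a rough sketch rather than a substitute for validating against the JSON Schema itself: it only covers the two constraints visible in the schema (string-typed properties and the `heal-cde` enum on `instrument.source`), and the helper name `check_standards_mappings` is hypothetical.

```python
from typing import Any, Dict, List

# Allowed values for instrument.source, per the enum in the schema.
INSTRUMENT_SOURCES = {"heal-cde"}


def check_standards_mappings(mappings: List[Dict[str, Any]]) -> List[str]:
    """Return human-readable problems found in a standardsMappings list."""
    problems = []
    for index, record in enumerate(mappings):
        for section in ("instrument", "item"):
            # Every property of instrument and item is typed as a string.
            for key, value in record.get(section, {}).items():
                if not isinstance(value, str):
                    problems.append(f"[{index}].{section}.{key} must be a string")
        source = record.get("instrument", {}).get("source")
        if source is not None and source not in INSTRUMENT_SOURCES:
            problems.append(f"[{index}].instrument.source must be one of {sorted(INSTRUMENT_SOURCES)}")
    return problems


# The "Multiple CDE Mappings" example, written as valid Python data.
multiple = [
    {
        "instrument": {"source": "heal-cde", "title": "adult-demographics"},
        "item": {"source": "CDISC", "id": "C74457"},
    },
    {"item": {"source": "NLM", "id": "Fakc6Jy2x"}},
]
problems = check_standards_mappings(multiple)
```

An instrument whose `source` is anything other than `heal-cde` would be reported, whereas `item.source` is free text in the schema and is not restricted here.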

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Each item of this array must be:

Type: object

Type: stringFormat: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index aa1a08a..4057717 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -337,84 +337,48 @@ a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values. -**`standardsMappings.url`** _(string)_ - The url that links out to the published, standardized mapping. - -Examples: - - -``` - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI - -``` - -**`standardsMappings.type`** _(string)_ - The **type** of mapping linked to a published set of standard variables such as the NIH Common Data Elements program - -Examples: +**`standardsMappings.instrument.url`** _(string)_ + +**`standardsMappings.instrument.source`** _(string)_ + +Possible values: -``` - cde +- ``` -``` + heal-cde -``` - ontology + ``` -``` -``` - reference_list +**`standardsMappings.instrument.title`** _(string)_ + -``` +**`standardsMappings.instrument.id`** _(string)_ + -**`standardsMappings.label`** _(string)_ - A free text **label** of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program. +**`standardsMappings.item.url`** _(string)_ + The url that links out to the published, standardized mapping. Examples: ``` - substance use - -``` - -``` - chemical compound - -``` - -``` - promis + https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI ``` -**`standardsMappings.source`** _(string)_ +**`standardsMappings.item.source`** _(string)_ The source of the standardized variable. 
-Examples: - - -``` - TBD (will have controlled vocabulary) - -``` -**`standardsMappings.id`** _(string)_ +**`standardsMappings.item.id`** _(string)_ The id locating the individual mapping within the given source. **`relatedConcepts.url`** _(string)_ The url that links out to the published, standardized concept. -Examples: - - -``` - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI - -``` **`relatedConcepts.type`** _(string)_ The **type** of mapping to a published set of concepts related to the given field such as diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 1f01623..53e356e 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -8,6 +8,8 @@ This schema defines the variable level metadata for one data dictionary for a gi ### `version` _(string)_ +### `standardsMappings` _(array)_ + ### `fields` _(array,required)_ Variable level metadata individual fields integrated into the variable level @@ -353,7 +355,9 @@ readability of the field) that is not a standard false value. It can include one **`standardsMappings`** _(array)_ - A published set of standard variables such as the NIH Common Data Elements program. 
+ **`relatedConcepts`** _(array)_ - Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc) + __**[Under development]**__ Mappings to a published set of concepts related to the given field such as +ontological information (eg., NCI thesaurus, bioportal etc) + diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index fcdb14a..0a40cd3 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -14,6 +14,10 @@ properties: type: string version: # TODO: think about having a version text/message and id (akin to a git commit) type: string + standardsMappings: + type: array + items: + $ref: "#/definitions/rootStandardsMappingsItem" fields: type: array items: diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 7d42f21..93c877b 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -67,8 +67,54 @@ geopointFormat: - `object` (if {'lat':36.63,'lon':-90.20}) enum: [array,object] +standardsMappingsInstrumentObject: + type: object + title: Standard mapping - instrument + description: | + A standardized set of items which encompass + a variable in this variable level metadata document (if at the root level or the document level) + or the individual variable (if at the field level). + + + !!! note "NOTE" + + If information is present at both the root and the field level, + then the information at the field level would take precedence (i.e., it would cascade). 
+ + properties: + url: + type: string + format: uri + source: + type: string + enum: ["heal-cde"] + title: + type: string + id: + type: string -fieldStandardsMappingObject: +rootStandardsMappingsItem: + type: object + description: | + A set of standardized instruments linked to all variables within the `fields` property (but see note). + + !!! note "NOTE" + + If `standardsMappings` is present at both the root (this property) and within `fields`, + then the `fields` `standardsMappings` property takes precedence. + + Note, only instrument can be mapped to this property as opposed to the `fields` `standardsMappings` + This property has the same specification as the `fields` `standardsMappings` to make the cascading logic + easier to understand in the same way other standards implement cascading + (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field)) + + properties: + instrument: + $ref: "#/definitions/standardsMappingsInstrumentObject" + + +fieldStandardsMappingsItem: + type: object description: | A set of instrument and item references to standardized data elements designed to document @@ -164,55 +210,33 @@ fieldStandardsMappingObject: } ] ``` - - instrument: - type: object - title: Standard mapping - instrument - description: | - A standardized set of items which encompass - a variable in this variable level metadata document (if at the root level or the document level) - or the individual variable (if at the field level). - - - !!! note "NOTE" - - If information is present at both the root and the field level, - then the information at the field level would take precedence (i.e., it would cascade). 
- - properties: - url: - type: string - format: uri - source: - type: string - enum: ["heal-cde"] - title: - type: string - id: - type: string - - item: - type: object - title: Standard mapping - item - description: | - A standardized item (ie field, variable etc) mapped to this individual variable. - properties: - url: - title: Standards Mapping - Url - description: | - The url that links out to the published, standardized mapping. - type: string - format: uri - examples: - - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI - source: - title: Standard Mapping - Source - description: | - The source of the standardized variable. - type: string - id: - title: Standard Mapping - Id - type: string - description: | - The id locating the individual mapping within the given source. + properties: + instrument: + $ref: "#/definitions/standardsMappingsInstrumentObject" + + + item: + type: object + title: Standard mapping - item + description: | + A standardized item (ie field, variable etc) mapped to this individual variable. + properties: + url: + title: Standards Mapping - Url + description: | + The url that links out to the published, standardized mapping. + type: string + format: uri + examples: + - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI + source: + title: Standard Mapping - Source + description: | + The source of the standardized variable. + type: string + id: + title: Standard Mapping - Id + type: string + description: | + The id locating the individual mapping within the given source. diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index dfdb818..ad52081 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -298,7 +298,9 @@ properties: readability of the field) that is not a standard false value. It can include one or more values. 
$ref: "#/definitions/csvArray" standardsMappings: - $ref: "#/definitions/fieldStandardsMappingObject" + type: array + items: + $ref: "#/definitions/fieldStandardsMappingsItem" relatedConcepts: title: Related Concepts description: | diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 4d1b979..52b58aa 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ + "geopoint", "year", - "yearmonth", - "boolean", - "time", - "date", - "integer", + "number", "string", - "geopoint", + "integer", "datetime", - "number", + "boolean", + "date", + "yearmonth", + "any", "duration", - "any" + "time" ] } }, @@ -170,47 +170,43 @@ } }, { - "name": "standardsMappings.url", - "description": "The url that links out to the published, standardized mapping.\n", - "title": "Standards Mapping - Url", - "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" - ], + "name": "standardsMappings.instrument.url", "type": "string" }, { - "name": "standardsMappings.type", - "description": "The **type** of mapping linked to a published set of standard variables such as the NIH Common Data Elements program\n", - "title": "Standards Mapping - Title", - "examples": [ - "cde", - "ontology", - "reference_list" - ], + "name": "standardsMappings.instrument.source", + "type": "string", + "constraints": { + "enum": [ + "heal-cde" + ] + } + }, + { + "name": "standardsMappings.instrument.title", + "type": "string" + }, + { + "name": "standardsMappings.instrument.id", "type": "string" }, { - "name": "standardsMappings.label", - "description": "A free text **label** of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.\n", - "title": 
"Standards Mapping - Label", + "name": "standardsMappings.item.url", + "description": "The url that links out to the published, standardized mapping.\n", + "title": "Standards Mapping - Url", "examples": [ - "substance use", - "chemical compound", - "promis" + "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" ], "type": "string" }, { - "name": "standardsMappings.source", + "name": "standardsMappings.item.source", "description": "The source of the standardized variable.\n", "title": "Standard Mapping - Source", - "examples": [ - "TBD (will have controlled vocabulary)" - ], "type": "string" }, { - "name": "standardsMappings.id", + "name": "standardsMappings.item.id", "description": "The id locating the individual mapping within the given source.\n", "title": "Standard Mapping - Id", "type": "string" @@ -219,9 +215,6 @@ "name": "relatedConcepts.url", "description": "The url that links out to the published, standardized concept.\n", "title": "Related Concepts - Url", - "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" - ], "type": "string" }, { diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index bb7ae5f..8474d6b 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -141,44 +141,37 @@ "type": "string", "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, - "standardsMappings.url": { - "title": "Standards Mapping - Url", - "description": "The url that links out to the published, standardized mapping.\n", + "standardsMappings.instrument.url": { "type": "string", - "format": "uri", - "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" + "format": "uri" + }, + "standardsMappings.instrument.source": { + "type": "string", + "enum": [ + "heal-cde" ] }, - "standardsMappings.type": { - "title": "Standards Mapping - 
Title", - "description": "The **type** of mapping linked to a published set of standard variables such as the NIH Common Data Elements program\n", - "examples": [ - "cde", - "ontology", - "reference_list" - ], + "standardsMappings.instrument.title": { "type": "string" }, - "standardsMappings.label": { - "title": "Standards Mapping - Label", - "description": "A free text **label** of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.\n", + "standardsMappings.instrument.id": { + "type": "string" + }, + "standardsMappings.item.url": { + "title": "Standards Mapping - Url", + "description": "The url that links out to the published, standardized mapping.\n", "type": "string", + "format": "uri", "examples": [ - "substance use", - "chemical compound", - "promis" + "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" ] }, - "standardsMappings.source": { + "standardsMappings.item.source": { "title": "Standard Mapping - Source", "description": "The source of the standardized variable.\n", - "type": "string", - "examples": [ - "TBD (will have controlled vocabulary)" - ] + "type": "string" }, - "standardsMappings.id": { + "standardsMappings.item.id": { "title": "Standard Mapping - Id", "type": "string", "description": "The id locating the individual mapping within the given source.\n" @@ -187,10 +180,7 @@ "title": "Related Concepts - Url", "description": "The url that links out to the published, standardized concept.\n", "type": "string", - "format": "uri", - "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" - ] + "format": "uri" }, "relatedConcepts.type": { "title": "Related concepts - Type", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 63aa23d..05965fb 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ 
b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -19,6 +19,38 @@ "version": { "type": "string" }, + "standardsMappings": { + "type": "array", + "items": { + "type": "object", + "description": "A set of standardized instruments linked to all variables within the `fields` property (but see note).\n\n!!! note \"NOTE\"\n\n If `standardsMappings` is present at both the root (this property) and within `fields`, \n then the `fields` `standardsMappings` property takes precedence.\n\n Note, only instrument can be mapped to this property as opposed to the `fields` `standardsMappings`\n This property has the same specification as the `fields` `standardsMappings` to make the cascading logic\n easier to understand in the same way other standards implement cascading \n (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field))\n", + "properties": { + "instrument": { + "type": "object", + "title": "Standard mapping - instrument", + "description": "A standardized set of items which encompass \na variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n\n\n!!! 
note \"NOTE\"\n\n If information is present at both the root and the field level, \n then the information at the field level would take precedence (i.e., it would cascade).\n", + "properties": { + "url": { + "type": "string", + "format": "uri" + }, + "source": { + "type": "string", + "enum": [ + "heal-cde" + ] + }, + "title": { + "type": "string" + }, + "id": { + "type": "string" + } + } + } + } + } + }, "fields": { "type": "array", "items": { @@ -199,60 +231,66 @@ "type": "array" }, "standardsMappings": { - "title": "Standards Mappings", - "description": "A published set of standard variables such as the NIH Common Data Elements program.", "type": "array", "items": { "type": "object", + "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. 
One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "properties": { - "url": { - "title": "Standards Mapping - Url", - "description": "The url that links out to the published, standardized mapping.\n", - "type": "string", - "format": "uri", - "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" - ] - }, - "type": { - "title": "Standards Mapping - Title", - "description": "The **type** of mapping linked to a published set of standard variables such as the NIH Common Data Elements program\n", - "examples": [ - "cde", - "ontology", - "reference_list" - ], - "type": "string" - }, - "label": { - "title": "Standards Mapping - Label", - "description": "A free text **label** of a mapping indicating a mapping(s) to a published set of standard variables such as the NIH Common Data Elements program.\n", - "type": "string", - "examples": [ - "substance use", - "chemical compound", - "promis" - ] + "instrument": { + "type": "object", + "title": "Standard mapping - instrument", + "description": "A standardized set of items which encompass \na variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n\n\n!!! 
note \"NOTE\"\n\n If information is present at both the root and the field level, \n then the information at the field level would take precedence (i.e., it would cascade).\n", + "properties": { + "url": { + "type": "string", + "format": "uri" + }, + "source": { + "type": "string", + "enum": [ + "heal-cde" + ] + }, + "title": { + "type": "string" + }, + "id": { + "type": "string" + } + } }, - "source": { - "title": "Standard Mapping - Source", - "description": "The source of the standardized variable.\n", - "type": "string", - "examples": [ - "TBD (will have controlled vocabulary)" - ] - }, - "id": { - "title": "Standard Mapping - Id", - "type": "string", - "description": "The id locating the individual mapping within the given source.\n" + "item": { + "type": "object", + "title": "Standard mapping - item", + "description": "A standardized item (ie field, variable etc) mapped to this individual variable.\n", + "properties": { + "url": { + "title": "Standards Mapping - Url", + "description": "The url that links out to the published, standardized mapping.\n", + "type": "string", + "format": "uri", + "examples": [ + "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" + ] + }, + "source": { + "title": "Standard Mapping - Source", + "description": "The source of the standardized variable.\n", + "type": "string" + }, + "id": { + "title": "Standard Mapping - Id", + "type": "string", + "description": "The id locating the individual mapping within the given source.\n" + } + } } } } }, "relatedConcepts": { "title": "Related Concepts", - "description": "Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc)", + "description": "__**[Under development]**__ Mappings to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", "type": "array", "items": { "type": "object", @@ -261,10 +299,7 @@ "title": "Related Concepts - Url", 
"description": "The url that links out to the published, standardized concept.\n", "type": "string", - "format": "uri", - "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" - ] + "format": "uri" }, "type": { "title": "Related concepts - Type", diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index edfb3f7..dd76b39 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings.url,standardsMappings.type,standardsMappings.label,standardsMappings.source,standardsMappings.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id \ No newline at end of file +section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings.instrument.url,standardsMappings.instrument.source,standardsMappings.instrument.title,standardsMappings.instrument.id,standardsMappings.item.url,standardsMappings.item.source,standardsMappings.item.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index bb64186..59df3a8 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -3,6 +3,16 @@ "title": null, "description": null, "version": null, + 
"standardsMappings": [ + { + "instrument": { + "url": null, + "source": null, + "title": null, + "id": null + } + } + ], "fields": [ { "section": null, @@ -27,11 +37,17 @@ "falseValues": [], "standardsMappings": [ { - "url": null, - "type": null, - "label": null, - "source": null, - "id": null + "instrument": { + "url": null, + "source": null, + "title": null, + "id": null + }, + "item": { + "url": null, + "source": null, + "id": null + } } ], "relatedConcepts": [ From 1802cba89fecf466ac0b385f7f498e6bc4e07ed6 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 4 Dec 2023 14:27:39 -0600 Subject: [PATCH 13/72] Update vlmd version to 0.2.0 --- VERSIONS.json | 2 +- .../jsonschema-csvtemplate-fields.html | 2 +- ...onschema-jsontemplate-data-dictionary.html | 2 +- .../frictionless/csvtemplate/fields.json | 20 +++++++++---------- .../schemas/jsonschema/data-dictionary.json | 2 +- 5 files changed, 14 insertions(+), 14 deletions(-) diff --git a/VERSIONS.json b/VERSIONS.json index 3fb0226..5427d04 100644 --- a/VERSIONS.json +++ b/VERSIONS.json @@ -1,4 +1,4 @@ { "slmd":"1.0.0", - "vlmd":"0.1.0" + "vlmd":"0.2.0" } \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 725d841..01798f6 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -20,4 +20,4 @@
"Required"
 

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
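
The pipe-delimited convention described above can be sketched as follows. This is a minimal illustration, not part of the schema tooling: the helper name `split_pipe_values` and the sample cell value are assumptions, and only the regular expression is copied from the schema.

```python
import re

# Pattern copied verbatim from the csvtemplate schema for pipe-delimited cells.
PIPE_DELIMITED = re.compile(r"^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$")


def split_pipe_values(cell: str) -> list[str]:
    """Hypothetical helper: check a CSV cell against the schema pattern,
    then split it on '|' into individual values."""
    if PIPE_DELIMITED.match(cell) is None:
        raise ValueError(f"not a pipe-delimited value: {cell!r}")
    # Drop surrounding whitespace and empty segments (an assumption; the
    # schema itself does not say how empty segments should be treated).
    return [part.strip() for part in cell.split("|") if part.strip()]


# Sample value is illustrative only.
false_values = split_pipe_values("No|N|Unchecked")
```

A single value with no pipe is handled the same way and yields a one-element list.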

Type: string, Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 6721417..ac153be 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -107,4 +107,4 @@ ]

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).
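
The precedence rule in the note above can be sketched as below. This is one possible reading, not the project's implementation: the function name `resolve_standards_mappings` is hypothetical, and it assumes a field-level `standardsMappings` list replaces the root-level list wholesale rather than being merged key by key, which the schema text does not specify.

```python
def resolve_standards_mappings(data_dictionary: dict) -> dict:
    """Hypothetical helper: return {field name: effective standardsMappings}.

    Assumption: a non-empty field-level list fully overrides the root-level
    list; otherwise the field inherits the root-level list.
    """
    root = data_dictionary.get("standardsMappings", [])
    resolved = {}
    for field in data_dictionary.get("fields", []):
        resolved[field["name"]] = field.get("standardsMappings") or root
    return resolved


# Illustrative document; values are taken from the schema's own examples.
doc = {
    "standardsMappings": [
        {"instrument": {"source": "heal-cde", "title": "adult-demographics"}}
    ],
    "fields": [
        {
            "name": "race",
            "standardsMappings": [{"item": {"source": "CDISC", "id": "C74457"}}],
        },
        {"name": "age"},  # no field-level mapping, so it inherits the root one
    ],
}
effective = resolve_standards_mappings(doc)
```

Here `race` keeps its own item mapping, while `age` falls back to the root-level instrument mapping.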

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 52b58aa..879fc64 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -1,5 +1,5 @@ { - "version": "0.1.0", + "version": "0.2.0", "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"NOTE\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. 
date-like variables)\n", "title": "HEAL Variable Level Metadata Fields", "fields": [ @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "geopoint", - "year", - "number", - "string", - "integer", "datetime", + "geopoint", "boolean", - "date", - "yearmonth", - "any", + "string", + "time", "duration", - "time" + "yearmonth", + "date", + "number", + "integer", + "year", + "any" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 05965fb..5ec95da 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -1,5 +1,5 @@ { - "version": "0.1.0", + "version": "0.2.0", "$schema": "http://json-schema.org/draft-07/schema#", "$id": "vlmd", "title": "Variable Level Metadata (Data Dictionaries)", From 6f5c0c00ea347d2b8c89b9ad837aa11b75d2b7b7 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 6 Dec 2023 15:34:27 -0600 Subject: [PATCH 14/72] Updated flattened object within array naming convention --- variable-level-metadata-schema/build.py | 4 +- .../jsonschema-csvtemplate-fields.html | 6 +-- ...onschema-jsontemplate-data-dictionary.html | 2 +- .../jsonschema-csvtemplate-fields.md | 24 +++++------ .../frictionless/csvtemplate/fields.json | 40 +++++++++---------- .../jsonschema/csvtemplate/fields.json | 24 +++++------ .../templates/template_submission.csv | 2 +- 7 files changed, 51 insertions(+), 51 deletions(-) diff --git a/variable-level-metadata-schema/build.py b/variable-level-metadata-schema/build.py index 7e271c7..ee0836b 100644 --- a/variable-level-metadata-schema/build.py +++ b/variable-level-metadata-schema/build.py @@ -110,7 +110,7 @@ def resolve_refs(items, schema, parentkey=False): return schema_resolved -def flatten_properties(properties, parentkey="", sep="."): +def flatten_properties(properties, parentkey="", 
sep=".",itemsep="[0]"): """ flatten schema properties """ @@ -130,7 +130,7 @@ def flatten_properties(properties, parentkey="", sep="."): properties_flattened.update(newprops) elif items: - newprops = flatten_properties(items,parentkey=flattenedkey) + newprops = flatten_properties(items,parentkey=flattenedkey+itemsep) properties_flattened.update(newprops) else: properties_flattened[flattenedkey] = item diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 01798f6..6611058 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -18,6 +18,6 @@
"required|Yes|Y|Checked"
 
"Checked"
 
"Required"
-

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$

Type: string, Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$

Type: string, Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index ac153be..28e0319 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -107,4 +107,4 @@ ]

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 4057717..dcec851 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -337,10 +337,10 @@ a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values. -**`standardsMappings.instrument.url`** _(string)_ +**`standardsMappings[0].instrument.url`** _(string)_ -**`standardsMappings.instrument.source`** _(string)_ +**`standardsMappings[0].instrument.source`** _(string)_ Possible values: @@ -351,13 +351,13 @@ Possible values: ``` -**`standardsMappings.instrument.title`** _(string)_ +**`standardsMappings[0].instrument.title`** _(string)_ -**`standardsMappings.instrument.id`** _(string)_ +**`standardsMappings[0].instrument.id`** _(string)_ -**`standardsMappings.item.url`** _(string)_ +**`standardsMappings[0].item.url`** _(string)_ The url that links out to the published, standardized mapping. Examples: @@ -368,29 +368,29 @@ Examples: ``` -**`standardsMappings.item.source`** _(string)_ +**`standardsMappings[0].item.source`** _(string)_ The source of the standardized variable. -**`standardsMappings.item.id`** _(string)_ +**`standardsMappings[0].item.id`** _(string)_ The id locating the individual mapping within the given source. -**`relatedConcepts.url`** _(string)_ +**`relatedConcepts[0].url`** _(string)_ The url that links out to the published, standardized concept. 
-**`relatedConcepts.type`** _(string)_ +**`relatedConcepts[0].type`** _(string)_ The **type** of mapping to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc) -**`relatedConcepts.label`** _(string)_ +**`relatedConcepts[0].label`** _(string)_ A free text **label** of mapping to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc) -**`relatedConcepts.source`** _(string)_ +**`relatedConcepts[0].source`** _(string)_ The source of the related concept. Examples: @@ -401,5 +401,5 @@ Examples: ``` -**`relatedConcepts.id`** _(string)_ +**`relatedConcepts[0].id`** _(string)_ The id locating the individual mapping within the given source. diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 879fc64..1956982 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -56,18 +56,18 @@ "type": "string", "constraints": { "enum": [ - "datetime", "geopoint", - "boolean", - "string", - "time", - "duration", + "number", + "datetime", "yearmonth", + "boolean", "date", - "number", - "integer", + "any", + "duration", "year", - "any" + "integer", + "time", + "string" ] } }, @@ -170,11 +170,11 @@ } }, { - "name": "standardsMappings.instrument.url", + "name": "standardsMappings[0].instrument.url", "type": "string" }, { - "name": "standardsMappings.instrument.source", + "name": "standardsMappings[0].instrument.source", "type": "string", "constraints": { "enum": [ @@ -183,15 +183,15 @@ } }, { - "name": "standardsMappings.instrument.title", + "name": "standardsMappings[0].instrument.title", "type": "string" }, { - "name": "standardsMappings.instrument.id", + "name": "standardsMappings[0].instrument.id", "type": "string" }, 
{ - "name": "standardsMappings.item.url", + "name": "standardsMappings[0].item.url", "description": "The url that links out to the published, standardized mapping.\n", "title": "Standards Mapping - Url", "examples": [ @@ -200,37 +200,37 @@ "type": "string" }, { - "name": "standardsMappings.item.source", + "name": "standardsMappings[0].item.source", "description": "The source of the standardized variable.\n", "title": "Standard Mapping - Source", "type": "string" }, { - "name": "standardsMappings.item.id", + "name": "standardsMappings[0].item.id", "description": "The id locating the individual mapping within the given source.\n", "title": "Standard Mapping - Id", "type": "string" }, { - "name": "relatedConcepts.url", + "name": "relatedConcepts[0].url", "description": "The url that links out to the published, standardized concept.\n", "title": "Related Concepts - Url", "type": "string" }, { - "name": "relatedConcepts.type", + "name": "relatedConcepts[0].type", "description": "The **type** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", "title": "Related concepts - Type", "type": "string" }, { - "name": "relatedConcepts.label", + "name": "relatedConcepts[0].label", "description": "A free text **label** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", "title": "Related Concepts - Label", "type": "string" }, { - "name": "relatedConcepts.source", + "name": "relatedConcepts[0].source", "description": "The source of the related concept.\n", "title": "Related Concepts - Source", "examples": [ @@ -239,7 +239,7 @@ "type": "string" }, { - "name": "relatedConcepts.id", + "name": "relatedConcepts[0].id", "description": "The id locating the individual mapping within the given source.", "title": "Related Concepts - Id", "type": "string" diff --git 
a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 8474d6b..fe627ba 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -141,23 +141,23 @@ "type": "string", "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, - "standardsMappings.instrument.url": { + "standardsMappings[0].instrument.url": { "type": "string", "format": "uri" }, - "standardsMappings.instrument.source": { + "standardsMappings[0].instrument.source": { "type": "string", "enum": [ "heal-cde" ] }, - "standardsMappings.instrument.title": { + "standardsMappings[0].instrument.title": { "type": "string" }, - "standardsMappings.instrument.id": { + "standardsMappings[0].instrument.id": { "type": "string" }, - "standardsMappings.item.url": { + "standardsMappings[0].item.url": { "title": "Standards Mapping - Url", "description": "The url that links out to the published, standardized mapping.\n", "type": "string", @@ -166,33 +166,33 @@ "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" ] }, - "standardsMappings.item.source": { + "standardsMappings[0].item.source": { "title": "Standard Mapping - Source", "description": "The source of the standardized variable.\n", "type": "string" }, - "standardsMappings.item.id": { + "standardsMappings[0].item.id": { "title": "Standard Mapping - Id", "type": "string", "description": "The id locating the individual mapping within the given source.\n" }, - "relatedConcepts.url": { + "relatedConcepts[0].url": { "title": "Related Concepts - Url", "description": "The url that links out to the published, standardized concept.\n", "type": "string", "format": "uri" }, - "relatedConcepts.type": { + "relatedConcepts[0].type": { "title": "Related concepts - Type", "description": "The **type** of mapping to a published set of concepts related to the given field such as 
\nontological information (eg., NCI thesaurus, bioportal etc)\n", "type": "string" }, - "relatedConcepts.label": { + "relatedConcepts[0].label": { "type": "string", "title": "Related Concepts - Label", "description": "A free text **label** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n" }, - "relatedConcepts.source": { + "relatedConcepts[0].source": { "title": "Related Concepts - Source", "description": "The source of the related concept.\n", "type": "string", @@ -200,7 +200,7 @@ "TBD (will have controlled vocabulary)" ] }, - "relatedConcepts.id": { + "relatedConcepts[0].id": { "title": "Related Concepts - Id", "type": "string", "description": "The id locating the individual mapping within the given source." diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index dd76b39..d9f78d9 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings.instrument.url,standardsMappings.instrument.source,standardsMappings.instrument.title,standardsMappings.instrument.id,standardsMappings.item.url,standardsMappings.item.source,standardsMappings.item.id,relatedConcepts.url,relatedConcepts.type,relatedConcepts.label,relatedConcepts.source,relatedConcepts.id \ No newline at end of file 
+section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].type,relatedConcepts[0].label,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file From bbcab92a25f2e954f37fe13f82110babb9d07297 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 8 Dec 2023 11:15:55 -0600 Subject: [PATCH 15/72] Fix: vlmd description not showing for object of arrays --- .../jsonschema-csvtemplate-fields.html | 2 +- ...onschema-jsontemplate-data-dictionary.html | 8 +- ...jsonschema-jsontemplate-data-dictionary.md | 105 ++++++++++++++++++ .../schemas/dictionary/data-dictionary.yaml | 4 +- .../schemas/dictionary/definitions.yaml | 73 ++++++------ .../schemas/dictionary/fields.yaml | 4 +- .../schemas/jsonschema/data-dictionary.json | 6 +- 7 files changed, 153 insertions(+), 49 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 6611058..e8872b5 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -20,4 +20,4 @@
"Required"
 

Type: string

For boolean variables (as defined in the type field), this field allows
a physical string representation that is not a standard false value to be cast as false,
which can make the field more readable. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
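
As a rough illustration of the pipe-delimited convention described above, the following Python sketch checks a cell against the pattern and splits it into individual labels. The helper name `parse_pipe_list` and the whitespace trimming are assumptions made for illustration; they are not defined by the schema.

```python
import re
from typing import List

# Pattern copied verbatim from the schema text above.
PIPE_LIST = re.compile(r"^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$")


def parse_pipe_list(raw: str) -> List[str]:
    """Split a pipe-delimited cell into labels (hypothetical helper).

    Assumption: surrounding whitespace is not significant, so each
    label is stripped and empty segments are dropped.
    """
    if not PIPE_LIST.match(raw):
        raise ValueError(f"not a pipe-delimited list: {raw!r}")
    return [part.strip() for part in raw.split("|") if part.strip()]


labels = parse_pipe_list("Not required| NOT REQUIRED")
```

The sample value mirrors the falseValues examples used elsewhere in this patch series; a single value with no pipe, such as "No", yields a one-element list.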

Type: string, Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 28e0319..f7474d2 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,4 +1,4 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

Type: array of object

Each item of this array must be:

Type: object

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only instrument can be mapped to this property, unlike the fields standardsMappings.
This property has the same specification as the fields standardsMappings so that the cascading logic
is easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only instrument can be mapped to this property, unlike the fields standardsMappings.
This property has the same specification as the fields standardsMappings so that the cascading logic
is easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).
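
The precedence rule in the note above can be sketched in Python as follows. This is a minimal sketch under one assumption: that "takes precedence" means a field-level standardsMappings list replaces the root-level list entirely rather than being merged with it. The function name `effective_standards_mappings` is hypothetical.

```python
from typing import Any, Dict, List


def effective_standards_mappings(
    dictionary: Dict[str, Any], field: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Resolve the mappings that apply to one field (hypothetical helper).

    Assumption: a non-empty field-level list overrides the root-level
    list as a whole; no per-key merging is attempted.
    """
    field_level = field.get("standardsMappings")
    if field_level:
        return field_level
    # Fall back to the root-level (instrument-only) mappings, if any.
    return dictionary.get("standardsMappings", [])


data_dictionary = {
    "standardsMappings": [
        {"instrument": {"source": "heal-cde", "title": "adult-demographics"}}
    ],
    "fields": [
        {"name": "gender_id", "description": "Gender identity"},
        {
            "name": "race",
            "description": "Race",
            "standardsMappings": [{"item": {"source": "NLM", "id": "Fakc6Jy2x"}}],
        },
    ],
}

resolved = [
    effective_standards_mappings(data_dictionary, f)
    for f in data_dictionary["fields"]
]
```

Here the first field inherits the root-level instrument mapping and the second keeps its own item mapping. The `race` field and its description are illustrative values, not taken from the schema.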

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)
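
As a small sketch of the requirement stated in the note above, the following Python helper reports which required properties a field record lacks. The helper name `missing_required` and the choice to treat empty strings as missing are assumptions for illustration only; actual validation would normally go through the JSON schema itself.

```python
from typing import Any, Dict, List

# Per the note above, only these two properties are required for each field.
REQUIRED_PROPERTIES = ("name", "description")


def missing_required(field: Dict[str, Any]) -> List[str]:
    """Return the required properties absent from a field (hypothetical helper).

    Assumption: an empty string counts as missing.
    """
    return [key for key in REQUIRED_PROPERTIES if not field.get(key)]


problems = missing_required({"name": "gender_id"})
```

For the record above, `problems` lists `description` as the only missing property.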

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Substance use"
 
"Medical History"
@@ -48,7 +48,7 @@
 
[
     "required"
 ]
-

Type: array

For boolean variables (as defined in the type field), this field allows
a physical string representation that is not a standard false value to be cast as false,
which can make the field more readable. It can include one or more values.

Type: array of object

Each item of this array must be:

Type: object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or an instrument of which the field is
a part.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
+

Type: array

For boolean variables (as defined in the type field), this field allows
a physical string representation that is not a standard false value to be cast as false,
which can make the field more readable. It can include one or more values.

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or an instrument of which the field is
a part.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
     {
         "instrument": {
             "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
@@ -105,6 +105,6 @@
         }
     }
 ]
-

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 53e356e..fd4bf47 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -9,6 +9,17 @@ This schema defines the variable level metadata for one data dictionary for a gi ### `version` _(string)_ ### `standardsMappings` _(array)_ +A set of standardized instruments linked to all variables within the `fields` property (but see note). + +!!! note "NOTE" + + If `standardsMappings` is present at both the root (this property) and within `fields`, + then the `fields` `standardsMappings` property takes precedence. + + Note, only instrument can be mapped to this property as opposed to the `fields` `standardsMappings` + This property has the same specification as the `fields` `standardsMappings` to make the cascading logic + easier to understand in the same way other standards implement cascading + (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field)) ### `fields` _(array,required)_ @@ -356,6 +367,100 @@ readability of the field) that is not a standard false value. It can include one **`standardsMappings`** _(array)_ +A set of instrument and item references to standardized data elements designed to document +the [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements) +and other standardized/common element sources to facilitate cross-study comparison and interoperability +of data. One can either map an individual data element or an instrument in which the field is +a part of. 
+ +__**All Fields Mapped (Both Instrument and Item)**__ + +```json +"standardsMappings": [ + { + "instrument": { + "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx", + "source": "heal-cde", + "title": "adult-demographics", + "id": + }, + "item": { + "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE", + "source": "CDISC", + "id": "C74457" + } + } +] +``` + +__**Only Instrument Title of Form CDE File Mapped**__ + +In this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given. + +```json +"standardsMappings": [ + { + "instrument": { + "source": "heal-cde", + "title": "adult-demographics" + } + } +] +``` + +__**Only Instrument ID of HEAL CDE Mapped**__ + +```json +"standardsMappings": [ + { + "instrument": { + "source": "heal-cde", + "id": + } + } +] +``` + +__**Other Non-HEAL CDE Use Cases**__ + +Only item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the "Identifier" section. Similar to the above, they could also just enter the "url". + +```json +"standardsMappings": [ + { + "item": { + "source": "NLM", + "id": "Fakc6Jy2x" + } + } +] +``` + +__**Multiple CDE Mappings**__ + +Two separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list. 
+ +```json +"standardsMappings": [ + { + "instrument": { + "source": "heal-cde", + "title": "adult-demographics" + }, + "item": { + "source": "CDISC", + "id": "C74457" + }, + }, + { + "item": { + "source": "NLM", + "id": "Fakc6Jy2x" + } + } +] +``` + **`relatedConcepts`** _(array)_ __**[Under development]**__ Mappings to a published set of concepts related to the given field such as diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index 0a40cd3..982d4df 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -15,9 +15,7 @@ properties: version: # TODO: think about having a version text/message and id (akin to a git commit) type: string standardsMappings: - type: array - items: - $ref: "#/definitions/rootStandardsMappingsItem" + $ref: "#/definitions/rootStandardsMappingsItem" fields: type: array items: diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 93c877b..28c9db4 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -94,7 +94,7 @@ standardsMappingsInstrumentObject: type: string rootStandardsMappingsItem: - type: object + type: array description: | A set of standardized instruments linked to all variables within the `fields` property (but see note). 
@@ -107,14 +107,15 @@ rootStandardsMappingsItem: This property has the same specification as the `fields` `standardsMappings` to make the cascading logic easier to understand in the same way other standards implement cascading (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field)) - - properties: - instrument: - $ref: "#/definitions/standardsMappingsInstrumentObject" + items: + properties: + type: object + instrument: + $ref: "#/definitions/standardsMappingsInstrumentObject" fieldStandardsMappingsItem: - type: object + type: array description: | A set of instrument and item references to standardized data elements designed to document @@ -210,33 +211,35 @@ fieldStandardsMappingsItem: } ] ``` - properties: - instrument: - $ref: "#/definitions/standardsMappingsInstrumentObject" - - - item: - type: object - title: Standard mapping - item - description: | - A standardized item (ie field, variable etc) mapped to this individual variable. - properties: - url: - title: Standards Mapping - Url - description: | - The url that links out to the published, standardized mapping. - type: string - format: uri - examples: - - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI - source: - title: Standard Mapping - Source - description: | - The source of the standardized variable. - type: string - id: - title: Standard Mapping - Id - type: string - description: | - The id locating the individual mapping within the given source. + items: + type: object + properties: + instrument: + $ref: "#/definitions/standardsMappingsInstrumentObject" + + + item: + type: object + title: Standard mapping - item + description: | + A standardized item (ie field, variable etc) mapped to this individual variable. + properties: + url: + title: Standards Mapping - Url + description: | + The url that links out to the published, standardized mapping. 
+ type: string + format: uri + examples: + - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI + source: + title: Standard Mapping - Source + description: | + The source of the standardized variable. + type: string + id: + title: Standard Mapping - Id + type: string + description: | + The id locating the individual mapping within the given source. diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index ad52081..4bc50b2 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -298,9 +298,7 @@ properties: readability of the field) that is not a standard false value. It can include one or more values. $ref: "#/definitions/csvArray" standardsMappings: - type: array - items: - $ref: "#/definitions/fieldStandardsMappingsItem" + $ref: "#/definitions/fieldStandardsMappingsItem" relatedConcepts: title: Related Concepts description: | diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 5ec95da..72a080e 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -21,10 +21,10 @@ }, "standardsMappings": { "type": "array", + "description": "A set of standardized instruments linked to all variables within the `fields` property (but see note).\n\n!!! 
note \"NOTE\"\n\n If `standardsMappings` is present at both the root (this property) and within `fields`, \n then the `fields` `standardsMappings` property takes precedence.\n\n Note, only instrument can be mapped to this property as opposed to the `fields` `standardsMappings`\n This property has the same specification as the `fields` `standardsMappings` to make the cascading logic\n easier to understand in the same way other standards implement cascading \n (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field))\n", "items": { - "type": "object", - "description": "A set of standardized instruments linked to all variables within the `fields` property (but see note).\n\n!!! note \"NOTE\"\n\n If `standardsMappings` is present at both the root (this property) and within `fields`, \n then the `fields` `standardsMappings` property takes precedence.\n\n Note, only instrument can be mapped to this property as opposed to the `fields` `standardsMappings`\n This property has the same specification as the `fields` `standardsMappings` to make the cascading logic\n easier to understand in the same way other standards implement cascading \n (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field))\n", "properties": { + "type": "object", "instrument": { "type": "object", "title": "Standard mapping - instrument", @@ -232,9 +232,9 @@ }, "standardsMappings": { "type": "array", + "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. 
One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "items": { "type": "object", - "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). 
Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "properties": { "instrument": { "type": "object", From 1dd06b2e936cd54d3796f1f4b12d2ccf5be990e4 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 11:31:52 -0600 Subject: [PATCH 16/72] improve human readable definitions --- .../schemas/dictionary/definitions.yaml | 62 +------------------ .../schemas/dictionary/fields.yaml | 38 ++++-------- 2 files changed, 13 insertions(+), 87 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 28c9db4..7854d9c 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -6,67 +6,7 @@ csvObject: type: string pattern: ^(?:.*?=.*?(?:\||$))+$ - -# for frictionless types and formats see: -# https://specs.frictionlessdata.io/table-schema/#types-and-formats - -# NOTE: The below was excluded from schema to simplify (10/6/2023) and formats is now just type string isntead of anyOf -stringFormat: - title: String Formats - description: | - A format for a specialized type of string of: - - - "`email` if valid emails (e.g., test@gmail.com)" - - "`uri` if valid uri addresses (e.g., 
https://example.com/resource123)" - - "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" - - "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" - - enum: - - uri - - email - - binary - - uuid - -dateFormat: - title: Date Formats - type: string - description: | - A format for a date variable (`date`,`time`,`datetime`). - **default**: An ISO8601 format string. - **any**: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies. - **{PATTERN}**: The value can be parsed according to `{PATTERN}`, - which `MUST` follow the date formatting syntax of - C / Python [strftime](http://strftime.org/) such as: - - - "`%Y-%m-%d` (for date, e.g., 2023-05-25)" - - "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" - - "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" - - "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" - - "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" - - "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" - - "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" - - "`%H:%M:%S` (for time, e.g., 10:30:45)" - - "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" - - "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" - - -geojsonFormat: - title: Geojson Formats - type: string - description: The JSON object according to the geojson spec. - enum: [topojson,default] - -geopointFormat: - title: Geopoint Format - type: string - description: | - The two types of formats for `geopoint` (describing a geographic point). 
- - - `array` (if 'lat,long' (e.g., 36.63,-90.20)) - - `object` (if {'lat':36.63,'lon':-90.20}) - enum: [array,object] - standardsMappingsInstrumentObject: type: object title: Standard mapping - instrument @@ -166,7 +106,7 @@ fieldStandardsMappingsItem: { "instrument": { "source": "heal-cde", - "id": + "id": \1020 } } ] diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 4bc50b2..6067e87 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -27,25 +27,22 @@ properties: examples: - Demographics - PROMIS - - Substance use - Medical History - - Sleep questions - - Physical activity name: type: string title: Variable Name description: | The name of a variable (i.e., field) as it appears in the data. + examples: + - gender_id title: type: string title: Variable Label (ie Title) description: | - The human-readable title or label of the variable. - - examples: - - My Variable + The human-readable title or label of the variable. + examples: - Gender identity description: type: string @@ -62,8 +59,8 @@ properties: type: string description: | A classification or category of a particular data element or property expected or allowed in the dataset. - - Definitions: + additionalDescription: | + enum definitions: - `number` (A numeric value with optional decimal places. (e.g., 3.14)) - `integer` (A whole number without decimal places. (e.g., 42)) @@ -96,23 +93,12 @@ properties: description: | Indicates the format of the type specified in the `type` property. Each format is dependent on the `type` specified. - For example: If `type` is "string", then see the [String formats](https://specs.frictionlessdata.io/table-schema/#string). 
- If `type` is "date", "datetime", or "time", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for [Date](https://specs.frictionlessdata.io/table-schema/#date), - [Datetime](https://specs.frictionlessdata.io/table-schema/#datetime), - or [Time](https://specs.frictionlessdata.io/table-schema/#time)) - If you want to specify a date-like variable using standard Python/C strptime syntax, see [here](#format-details-for-date-datetime-time-type-variables) for details. - See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. - - [Additional information] + See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) + for more information about appropriate `format` values by variable `type`. - Date Formats (date, datetime, time `type` variable): - - A format for a date variable (`date`,`time`,`datetime`). - **default**: An ISO8601 format string. - **any**: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies. 
+ additionalDescription: | - **{PATTERN}**: The value can be parsed according to `{PATTERN}`, - which `MUST` follow the date formatting syntax of - C / Python [strftime](http://strftime.org/) such as: + Examples of date time pattern formats - "`%Y-%m-%d` (for date, e.g., 2023-05-25)" - "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" @@ -125,7 +111,7 @@ properties: - "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" - "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" - String formats: + Examples of string formats - "`email` if valid emails (e.g., test@gmail.com)" - "`uri` if valid uri addresses (e.g., https://example.com/resource123)" @@ -133,7 +119,7 @@ properties: - "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" - Geopoint formats: + Examples of geopoint formats The two types of formats for `geopoint` (describing a geographic point). From bceffa765e9ef2bfdd2f6aa3c0ded15048182cd0 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 12:16:05 -0600 Subject: [PATCH 17/72] additional dictionary yaml definition updates --- .../schemas/dictionary/fields.yaml | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 6067e87..56b50a6 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -1,5 +1,3 @@ -"$schema": http://json-schema.org/draft-04/schema# -"$id": vlmd-fields title: HEAL Variable Level Metadata Fields description: | Variable level metadata individual fields integrated into the variable level @@ -266,9 +264,7 @@ properties: examples: - Required|REQUIRED - - required|Yes|Y|Checked - - Checked - - Required + - "Yes" falseValuesJsonSpec: title: Boolean False Value Labels description: | @@ -276,6 +272,9 @@ 
properties: a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values. type: array + examples: + - ["Not required","NOT REQUIRED"] + - ["No"] falseValuesCsvSpec: title: Boolean False Value Labels description: | @@ -283,6 +282,9 @@ properties: a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values. $ref: "#/definitions/csvArray" + examples: + - Not required| NOT REQUIRED + - "No" standardsMappings: $ref: "#/definitions/fieldStandardsMappingsItem" relatedConcepts: From 5ca32480df80ce218a82528eb1ea91d25166390f Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 15:19:05 -0600 Subject: [PATCH 18/72] paired back num examples --- .../frictionless/csvtemplate/fields.json | 37 ++++++++++--------- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 1956982..7bab207 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -10,10 +10,7 @@ "examples": [ "Demographics", "PROMIS", - "Substance use", - "Medical History", - "Sleep questions", - "Physical activity" + "Medical History" ], "type": "string" }, @@ -21,6 +18,9 @@ "name": "name", "description": "The name of a variable (i.e., field) as it appears in the data. \n", "title": "Variable Name", + "examples": [ + "gender_id" + ], "type": "string", "constraints": { "required": true @@ -28,10 +28,9 @@ }, { "name": "title", - "description": "The human-readable title or label of the variable. 
\n", + "description": "The human-readable title or label of the variable.\n", "title": "Variable Label (ie Title)", "examples": [ - "My Variable", "Gender identity" ], "type": "string" @@ -51,29 +50,29 @@ }, { "name": "type", - "description": "A classification or category of a particular data element or property expected or allowed in the dataset.\n\nDefinitions:\n\n- `number` (A numeric value with optional decimal places. (e.g., 3.14))\n- `integer` (A whole number without decimal places. (e.g., 42))\n- `string` (A sequence of characters. (e.g., \\\"test\\\"))\n- `any` (Any type of data is allowed. (e.g., true))\n- `boolean` (A binary value representing true or false. (e.g., true))\n- `date` (A specific calendar date. (e.g., \\\"2023-05-25\\\"))\n- `datetime` (A specific date and time, including timezone information. (e.g., \\\"2023-05-25T10:30:00Z\\\"))\n- `time` (A specific time of day. (e.g., \\\"10:30:00\\\"))\n- `year` (A specific year. (e.g., 2023)\n- `yearmonth` (A specific year and month. (e.g., \\\"2023-05\\\"))\n- `duration` (A length of time. (e.g., \\\"PT1H\\\")\n- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))\n", + "description": "A classification or category of a particular data element or property expected or allowed in the dataset.\n", "title": "Variable Type", "type": "string", "constraints": { "enum": [ "geopoint", - "number", + "year", + "integer", + "any", + "string", "datetime", - "yearmonth", + "number", "boolean", "date", - "any", + "yearmonth", "duration", - "year", - "integer", - "time", - "string" + "time" ] } }, { "name": "format", - "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. \nFor example: If `type` is \"string\", then see the [String formats](https://specs.frictionlessdata.io/table-schema/#string). 
\nIf `type` is \"date\", \"datetime\", or \"time\", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for [Date](https://specs.frictionlessdata.io/table-schema/#date),\n[Datetime](https://specs.frictionlessdata.io/table-schema/#datetime), \nor [Time](https://specs.frictionlessdata.io/table-schema/#time)) - If you want to specify a date-like variable using standard Python/C strptime syntax, see [here](#format-details-for-date-datetime-time-type-variables) for details. \nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. \n\n[Additional information]\n\nDate Formats (date, datetime, time `type` variable):\n\nA format for a date variable (`date`,`time`,`datetime`). \n**default**: An ISO8601 format string.\n**any**: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.\n\n**{PATTERN}**: The value can be parsed according to `{PATTERN}`,\nwhich `MUST` follow the date formatting syntax of \nC / Python [strftime](http://strftime.org/) such as:\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nString formats:\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri 
addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nGeopoint formats:\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n", + "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. \nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) \nfor more information about appropriate `format` values by variable `type`.\n", "title": "Variable Format", "type": "string" }, @@ -151,9 +150,7 @@ "description": "For boolean (true) variable (as defined in type field), this field allows\na physical string representation to be cast as true (increasing\nreadability of the field). It can include one or more values.\n", "examples": [ "Required|REQUIRED", - "required|Yes|Y|Checked", - "Checked", - "Required" + "Yes" ], "type": "string", "constraints": { @@ -164,6 +161,10 @@ "name": "falseValues", "description": "For boolean (false) variable (as defined in type field), this field allows\na physical string representation to be cast as false (increasing\nreadability of the field) that is not a standard false value. 
It can include one or more values.\n", "title": "Boolean False Value Labels", + "examples": [ + "Not required| NOT REQUIRED", + "No" + ], "type": "string", "constraints": { "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" From b6b2a60dee5ca3bdc79105277674292ae285e5bc Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 17:20:43 -0600 Subject: [PATCH 19/72] add schema version --- .../schemas/dictionary/data-dictionary.yaml | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index 982d4df..f9b368c 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -12,8 +12,19 @@ properties: type: string description: type: string + schemaVersion: + type: string + description: | + The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) + + NOTE: This is NOT for versioning of each indiviual data dictionary instance. + Rather, it is the + version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance + version. + pattern: \d+\.\d+\.\d+ version: # TODO: think about having a version text/message and id (akin to a git commit) type: string + description: The specified individual data dictionary instance version. 
standardsMappings: $ref: "#/definitions/rootStandardsMappingsItem" fields: From 00f6f6263a21beb51db71e728a327ad5085c1834 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 17:21:00 -0600 Subject: [PATCH 20/72] minor formatting --- .../schemas/dictionary/fields.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 56b50a6..ffb304a 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -3,13 +3,12 @@ description: | Variable level metadata individual fields integrated into the variable level metadata object within the HEAL platform metadata service. - !!! note "NOTE" + !!! note "Highly encouraged" Only `name` and `description` properties are required. - For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. + For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) - type: object required: - name @@ -95,6 +94,7 @@ properties: for more information about appropriate `format` values by variable `type`. 
additionalDescription: | + examples/definitions of patterns and possible values: Examples of date time pattern formats From f96bc320b346b7d286720dd81953f6b4b441156d Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 17:21:21 -0600 Subject: [PATCH 21/72] build with new dictionary updates --- .../docs/assets/templates/csvtemplate.md | 11 +- .../docs/assets/templates/jsontemplate.md | 16 +- .../docs/assets/templates/properties.md | 2 +- .../jsonschema-csvtemplate-fields.html | 23 +- ...onschema-jsontemplate-data-dictionary.html | 26 +- .../jsonschema-csvtemplate-fields.md | 233 ++++++----------- ...jsonschema-jsontemplate-data-dictionary.md | 239 +++++++----------- .../frictionless/csvtemplate/fields.json | 18 +- .../jsonschema/csvtemplate/fields.json | 33 +-- .../schemas/jsonschema/data-dictionary.json | 44 ++-- .../templates/template_submission.json | 1 + 11 files changed, 266 insertions(+), 380 deletions(-) diff --git a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md index 306253c..087bdeb 100644 --- a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md @@ -1,4 +1,4 @@ -# {{ schema.title }} +# {{ schema.title }} _version {{ schema.version }}_ {{ schema.description }} @@ -6,4 +6,13 @@ {% for itemname,item in schema.properties.items() %} {% include 'properties.md' %} +{% endfor %} + + +# End of schema - Additional Property information + +{% for itemname,item in schema['properties'].items() %} +{% if 'additionalDescription' in item %} +## `{{ itemname }}` {{ item.additionalDescription }} +{% endif %} {% endfor %} \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md b/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md index 56c706e..f9ba9a9 100644 --- 
a/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md @@ -1,16 +1,24 @@ -# {{ schema.title }} +# {{ schema.title }} _version {{ schema.version }}_ {{ schema.description }} {% for itemname,item in schema.properties.items() %} -### `{{ itemname }}` _({{ item.type }}{{ ',required' if itemname in schema.required }})_ +## `{{ itemname }}` _({{ item.type }}{{ ',required' if itemname in schema.required }})_ {{ item.description }} {% if itemname == 'fields' %} {{ item['items']['description'] }} -#### Properties for each record +### Properties for each `fields` record {% set schema = item['items'] %} {% for itemname,item in item['items']['properties'].items() %} {% include 'properties.md' %} {% endfor %} {% endif %} -{% endfor %} \ No newline at end of file +{% endfor %} + +### Additional `fields` property information + +{% for itemname,item in schema["properties"]["fields"]["items"]["properties"].items() %} +{% if 'additionalDescription' in item %} +#### `{{ itemname }}` {{ item.additionalDescription }} +{% endif %} +{% endfor %} diff --git a/variable-level-metadata-schema/docs/assets/templates/properties.md b/variable-level-metadata-schema/docs/assets/templates/properties.md index 0ee682c..f3ca54c 100644 --- a/variable-level-metadata-schema/docs/assets/templates/properties.md +++ b/variable-level-metadata-schema/docs/assets/templates/properties.md @@ -34,7 +34,7 @@ __{{ item.title }}__ {{ itemtype }} {{ item.description }} {# #} {# #} {% if item.enum is defined %} -{{ render_type_item('Possible values',item.enum)}} +Must be one of: {{ "`" + "`, `".join(item.enum) + "`" }} {% endif %} {# #} {# #} diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index e8872b5..66b0424 100644 --- 
a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,23 +1,20 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
-
"Substance use"
-
"Medical History"
-
"Sleep questions"
-
"Physical activity"
-

Type: string

The name of a variable (i.e., field) as it appears in the data.

Type: string

The human-readable title or label of the variable.


Examples:

"My Variable"
-
"Gender identity"
+
"Medical History"
+

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
+

Type: string

The human-readable title or label of the variable.


Example:

"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
 
"What is the highest grade or level of school you have completed or the highest degree you have received?"
-

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Definitions:

  • number (A numeric value with optional decimal places. (e.g., 3.14))
  • integer (A whole number without decimal places. (e.g., 42))
  • string (A sequence of characters. (e.g., \"test\"))
  • any (Any type of data is allowed. (e.g., true))
  • boolean (A binary value representing true or false. (e.g., true))
  • date (A specific calendar date. (e.g., \"2023-05-25\"))
  • datetime (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\"))
  • time (A specific time of day. (e.g., \"10:30:00\"))
  • year (A specific year. (e.g., 2023))
  • yearmonth (A specific year and month. (e.g., \"2023-05\"))
  • duration (A length of time. (e.g., \"PT1H\"))
  • geopoint (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
For example: If type is "string", then see the String formats.
If type is "date", "datetime", or "time", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for Date,
Datetime,
or Time) - If you want to specify a date-like variable using standard Python/C strptime syntax, see here for details.
See here for more information about appropriate format values by variable type.

[Additional information]

Date Formats (date, datetime, time type variable):

A format for a date variable (date,time,datetime).
default: An ISO8601 format string.
any: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.

{PATTERN}: The value can be parsed according to {PATTERN},
which MUST follow the date formatting syntax of
C / Python strftime such as:

  • "%Y-%m-%d (for date, e.g., 2023-05-25)"
  • "%Y%m%d (for date without dashes, e.g., 20230525)"
  • "%Y-%m-%dT%H:%M:%S (for datetime, e.g., 2023-05-25T10:30:45)"
  • "%Y-%m-%dT%H:%M:%SZ (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)"
  • "%Y-%m-%dT%H:%M:%S%z (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)"
  • "%Y-%m-%dT%H:%M (for datetime without seconds, e.g., 2023-05-25T10:30)"
  • "%Y-%m-%dT%H (for datetime without minutes and seconds, e.g., 2023-05-25T10)"
  • "%H:%M:%S (for time, e.g., 10:30:45)"
  • "%H:%M:%SZ (for time with UTC timezone, e.g., 10:30:45Z)"
  • "%H:%M:%S%z (for time with timezone offset, e.g., 10:30:45+0300)"
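
As a rough illustration of how the `{PATTERN}` formats listed above behave, the sketch below checks a few sample values with Python's standard `datetime.strptime`. The helper name `matches_pattern` and the `SAMPLES` list are hypothetical and not part of the schema; the undashed date is written here as `%Y%m%d`, the directive sequence that `strptime` accepts for a value such as 20230525.

```python
from datetime import datetime

# Hypothetical (pattern, value) pairs taken from the examples in the list above.
SAMPLES = [
    ("%Y-%m-%d", "2023-05-25"),
    ("%Y%m%d", "20230525"),
    ("%Y-%m-%dT%H:%M:%S", "2023-05-25T10:30:45"),
    ("%Y-%m-%dT%H:%M:%SZ", "2023-05-25T10:30:45Z"),
    ("%Y-%m-%dT%H:%M:%S%z", "2023-05-25T10:30:45+0300"),
    ("%H:%M:%S", "10:30:45"),
]


def matches_pattern(value: str, pattern: str) -> bool:
    """Return True if `value` can be parsed with the strptime `pattern`."""
    try:
        datetime.strptime(value, pattern)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for pattern, value in SAMPLES:
        print(pattern, value, matches_pattern(value, pattern))
```

A check like this could be run over a column before declaring a `format`, so that a pattern which does not fit the data is caught early.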

String formats:

  • "email if valid emails (e.g., test@gmail.com)"
  • "uri if valid uri addresses (e.g., https://example.com/resource123)"
  • "binary if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)"
  • "uuid if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)"

Geopoint formats:

The two types of formats for geopoint (describing a geographic point).

  • array (if 'lat,long' (e.g., 36.63,-90.20))
  • object (if {'lat':36.63,'lon':-90.20})

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5|6|7|8"
+

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
See here
for more information about appropriate format values by variable type.

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5|6|7|8"
 
"White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial"
 
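
The pipe-delimited convention used by this property can be sketched in Python as follows. This is a minimal illustration, assuming a consumer simply splits the cell on `|`; the helper name `split_pipe_delimited` is hypothetical, and the regular expression is copied from the rendered schema above.

```python
import re

# Pattern copied from the schema for pipe-delimited (CSV array) values.
CSV_ARRAY_PATTERN = re.compile(r"^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$")


def split_pipe_delimited(value: str) -> list[str]:
    """Split a pipe-delimited cell into its individual items."""
    if CSV_ARRAY_PATTERN.match(value) is None:
        raise ValueError(f"not a pipe-delimited value: {value!r}")
    return value.split("|")


if __name__ == "__main__":
    print(split_pipe_delimited("1|2|3|4|5|6|7|8"))
    print(split_pipe_delimited("Missing"))
```

Note that a value without any `|` is treated as a single-item list, which mirrors single-value examples elsewhere in the schema such as "Missing".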

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note that this is different from the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: string

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
Examples:

"0=No|1=Yes"
 
"HW=Hello world|GBW=Good bye world|HM=Hi,Mike"
 
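
To make the `value=label` convention concrete, here is a minimal Python sketch that turns an encodings string into a dictionary. The function name `parse_encodings` is hypothetical, and the choice to split only on the first `=` of each item (so labels may themselves contain `=`) is an assumption, not something the schema states.

```python
import re

# Pattern copied from the schema for encodings strings.
ENCODINGS_PATTERN = re.compile(r"^(?:.*?=.*?(?:\||$))+$")


def parse_encodings(value: str) -> dict[str, str]:
    """Parse 'code=label|code=label' into a {code: label} mapping."""
    if ENCODINGS_PATTERN.match(value) is None:
        raise ValueError(f"not an encodings string: {value!r}")
    encodings = {}
    for item in value.split("|"):
        if not item:
            continue  # tolerate a trailing pipe
        # Assumption: only the first '=' separates the code from its label.
        code, _, label = item.partition("=")
        encodings[code] = label
    return encodings


if __name__ == "__main__":
    print(parse_encodings("0=No|1=Yes"))
    print(parse_encodings("HW=Hello world|GBW=Good bye world|HM=Hi,Mike"))
```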

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: string

A list of missing values specific to a variable.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Missing|Skipped|No preference"
 
"Missing"
 

Type: string

For boolean (true) variable (as defined in type field), this field allows
a physical string representation to be cast as true (increasing
readability of the field). It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Required|REQUIRED"
-
"required|Yes|Y|Checked"
-
"Checked"
-
"Required"
-

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$

Type: stringFormat: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
+
"Yes"
+

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required| NOT REQUIRED"
+
"No"
+
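
One way a consumer might apply the true and false value labels is sketched below in Python. The function name `cast_boolean` is hypothetical, and trimming surrounding whitespace is an assumption made here because the example "Not required| NOT REQUIRED" contains a space after the pipe; the schema itself does not say how whitespace is handled.

```python
def cast_boolean(raw: str, true_values: str, false_values: str) -> bool:
    """Cast a physical string to a boolean using pipe-delimited label lists."""
    # Assumption: whitespace around each label and around the raw value is ignored.
    true_set = {label.strip() for label in true_values.split("|")}
    false_set = {label.strip() for label in false_values.split("|")}
    value = raw.strip()
    if value in true_set:
        return True
    if value in false_set:
        return False
    raise ValueError(f"{raw!r} is not a declared true or false value")


if __name__ == "__main__":
    true_labels = "Required|REQUIRED"
    false_labels = "Not required| NOT REQUIRED"
    print(cast_boolean("REQUIRED", true_labels, false_labels))
    print(cast_boolean("NOT REQUIRED", true_labels, false_labels))
```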

Type: stringFormat: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: stringFormat: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index f7474d2..b24cff2 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,14 +1,11 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "NOTE"

Only name and description properties are required.
For categorical variables, constraints.enum and encodings (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

The version of the schema, following the agreed-upon major.minor.patch convention (e.g., 1.0.2).

NOTE: This is NOT for versioning each individual data dictionary instance.
Rather, it is the version of THIS schema document. See the version property (below) to specify the
individual data dictionary instance version.

Must match regular expression: \d+\.\d+\.\d+
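
A small Python sketch of how this pattern behaves may be useful. JSON Schema's `pattern` keyword is not implicitly anchored, so the expression as written accepts any string that contains a major.minor.patch sequence. The function name `is_valid_schema_version` and its `strict` flag are hypothetical additions for illustration only.

```python
import re

# Pattern copied from the schemaVersion property.
SCHEMA_VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_valid_schema_version(value: str, strict: bool = False) -> bool:
    """Check a schemaVersion string against the pattern.

    strict=False mimics an unanchored JSON Schema `pattern` (search anywhere);
    strict=True requires the whole string to be major.minor.patch.
    """
    if strict:
        return SCHEMA_VERSION_PATTERN.fullmatch(value) is not None
    return SCHEMA_VERSION_PATTERN.search(value) is not None


if __name__ == "__main__":
    for candidate in ("1.0.2", "v1.0.2-beta", "1.0"):
        print(candidate, is_valid_schema_version(candidate),
              is_valid_schema_version(candidate, strict=True))
```

If only exact versions should pass, anchoring the schema pattern (for example `^\d+\.\d+\.\d+$`) would give the strict behavior; whether that is desired is a design choice for the schema authors.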

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)
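
A minimal sketch of the "only name and description are required" rule, assuming each field record is handled as a plain Python dict. The helper name `missing_required_properties` is hypothetical.

```python
from typing import Any, Dict, List

# The two properties the note above states are required.
REQUIRED_FIELD_PROPERTIES = ("name", "description")


def missing_required_properties(field: Dict[str, Any]) -> List[str]:
    """List the required properties that are absent from a field record."""
    return [key for key in REQUIRED_FIELD_PROPERTIES if key not in field]
```

An empty result means the record meets the minimum; the encouraged properties (such as `type` or `standardsMappings`) are deliberately not checked here.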

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
-
"Substance use"
-
"Medical History"
-
"Sleep questions"
-
"Physical activity"
-

Type: string

The name of a variable (i.e., field) as it appears in the data.

Type: string

The human-readable title or label of the variable.


Examples:

"My Variable"
-
"Gender identity"
+
"Medical History"
+

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
+

Type: string

The human-readable title or label of the variable.


Example:

"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
 
"What is the highest grade or level of school you have completed or the highest degree you have received?"
-

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Definitions:

  • number (A numeric value with optional decimal places. (e.g., 3.14))
  • integer (A whole number without decimal places. (e.g., 42))
  • string (A sequence of characters. (e.g., \"test\"))
  • any (Any type of data is allowed. (e.g., true))
  • boolean (A binary value representing true or false. (e.g., true))
  • date (A specific calendar date. (e.g., \"2023-05-25\"))
  • datetime (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\"))
  • time (A specific time of day. (e.g., \"10:30:00\"))
  • year (A specific year. (e.g., 2023))
  • yearmonth (A specific year and month. (e.g., \"2023-05\"))
  • duration (A length of time. (e.g., \"PT1H\"))
  • geopoint (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
For example: If type is "string", then see the String formats.
If type is "date", "datetime", or "time", the default format is ISO8601 formatting for those respective types (see details on ISO8601 format for Date, Datetime, or Time).
If you want to specify a date-like variable using standard Python/C strptime syntax, see here for details.
See here for more information about appropriate format values by variable type.

[Additional information]

Date Formats (date, datetime, time type variable):

A format for a date variable (date,time,datetime).
default: An ISO8601 format string.
any: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.

{PATTERN}: The value can be parsed according to {PATTERN},
which MUST follow the date formatting syntax of
C / Python strftime such as:

  • "%Y-%m-%d (for date, e.g., 2023-05-25)"
  • "%Y%m%d (for date without dashes, e.g., 20230525)"
  • "%Y-%m-%dT%H:%M:%S (for datetime, e.g., 2023-05-25T10:30:45)"
  • "%Y-%m-%dT%H:%M:%SZ (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)"
  • "%Y-%m-%dT%H:%M:%S%z (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)"
  • "%Y-%m-%dT%H:%M (for datetime without seconds, e.g., 2023-05-25T10:30)"
  • "%Y-%m-%dT%H (for datetime without minutes and seconds, e.g., 2023-05-25T10)"
  • "%H:%M:%S (for time, e.g., 10:30:45)"
  • "%H:%M:%SZ (for time with UTC timezone, e.g., 10:30:45Z)"
  • "%H:%M:%S%z (for time with timezone offset, e.g., 10:30:45+0300)"
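
To illustrate how such patterns behave, here is a small sketch that parses a few of the sample values above with Python's `datetime.strptime`. The helper name `parse_with_pattern` and the `SAMPLES` list are hypothetical; an implementing library may parse these values differently.

```python
from datetime import datetime, timedelta

# (pattern, sample value) pairs taken from a subset of the list above.
SAMPLES = [
    ("%Y-%m-%d", "2023-05-25"),
    ("%Y-%m-%dT%H:%M:%S", "2023-05-25T10:30:45"),
    ("%Y-%m-%dT%H:%M:%S%z", "2023-05-25T10:30:45+0300"),
    ("%H:%M:%S", "10:30:45"),
]


def parse_with_pattern(value: str, pattern: str) -> datetime:
    """Parse `value` with a strptime-style pattern; raises ValueError on mismatch."""
    return datetime.strptime(value, pattern)


# Parse every sample; only the %z pattern yields a timezone-aware datetime.
parsed = {pattern: parse_with_pattern(value, pattern) for pattern, value in SAMPLES}
```

Note that `strptime` always returns a `datetime`, so a time-only pattern such as `%H:%M:%S` is filled in with a default date.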

String formats:

  • "email if valid emails (e.g., test@gmail.com)"
  • "uri if valid uri addresses (e.g., https://example.com/resource123)"
  • "binary if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)"
  • "uuid if a universal unique identifier also known as a guid (e.g., f47ac10b-58cc-4372-a567-0e02b2c3d479)"

Geopoint formats:

The two types of formats for geopoint (describing a geographic point).

  • array (if 'lat,long' (e.g., 36.63,-90.20))
  • object (if {'lat':36.63,'lon':-90.20})

Type: object

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: array

Constrains possible values to a set of values.


Examples:

[
+

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
See here
for more information about appropriate format values by variable type.

Type: object

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: array

Constrains possible values to a set of values.


Examples:

[
     1,
     2,
     3,
@@ -48,7 +45,14 @@
 
[
     "required"
 ]
-

Type: array

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can either map an individual data element or an instrument in which the field is
a part of.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
+

Type: array

For boolean variables (as defined in the type field), this field lists
physical string representations, other than the standard false values, that are cast as false
(improving readability of the data). It can include one or more values.


Examples:

[
+    "Not required",
+    "NOT REQUIRED"
+]
+
[
+    "No"
+]
+
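
As a hedged sketch of how `falseValues` might be applied when reading raw data: the helper name `is_false` is hypothetical, the "standard" false strings below are an assumption (they follow the Frictionless Table Schema defaults and are not listed in this schema text), and treating `falseValues` as additions to those defaults rather than replacements is also an assumption.

```python
from typing import Optional, Sequence

# Assumption: the "standard" false strings, taken from the Frictionless
# Table Schema defaults; the schema text above does not list them.
STANDARD_FALSE_VALUES = ("false", "False", "FALSE", "0")


def is_false(raw: str, false_values: Optional[Sequence[str]] = None) -> bool:
    """Return True if `raw` should be cast to boolean false.

    Assumption: custom falseValues extend the standard values instead of
    replacing them.
    """
    candidates = set(STANDARD_FALSE_VALUES) | set(false_values or ())
    return raw in candidates
```

The comparison is exact and case-sensitive, which is why the first example above lists both "Not required" and "NOT REQUIRED".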

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can either map an individual data element or an instrument in which the field is
a part of.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
     {
         "instrument": {
             "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
@@ -75,7 +79,7 @@
     {
         "instrument": {
             "source": "heal-cde",
-            "id": <drupal id here>
+            "id": \1020
         }
     }
 ]
@@ -107,4 +111,4 @@
 ]
 

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: stringFormat: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI thesaurus, bioportal, etc.)

Each item of this array must be:

Type: object

Type: stringFormat: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index dcec851..7cbd151 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -1,12 +1,12 @@ -# HEAL Variable Level Metadata Fields +# HEAL Variable Level Metadata Fields _version _ Variable level metadata individual fields integrated into the variable level metadata object within the HEAL platform metadata service. -!!! note "NOTE" +!!! note "Highly encouraged" Only `name` and `description` properties are required. - For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. + For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) @@ -31,41 +31,28 @@ Examples: ``` -``` - Substance use - -``` - ``` Medical History ``` -``` - Sleep questions +**`name`** _(string,required)_ + The name of a variable (i.e., field) as it appears in the data. -``` +Examples: -``` - Physical activity ``` + gender_id -**`name`** _(string,required)_ - The name of a variable (i.e., field) as it appears in the data. - +``` **`title`** _(string)_ - The human-readable title or label of the variable. + The human-readable title or label of the variable. Examples: -``` - My Variable - -``` - ``` Gender identity @@ -91,131 +78,13 @@ Examples: **`type`** _(string)_ A classification or category of a particular data element or property expected or allowed in the dataset. 
-Definitions: - -- `number` (A numeric value with optional decimal places. (e.g., 3.14)) -- `integer` (A whole number without decimal places. (e.g., 42)) -- `string` (A sequence of characters. (e.g., \"test\")) -- `any` (Any type of data is allowed. (e.g., true)) -- `boolean` (A binary value representing true or false. (e.g., true)) -- `date` (A specific calendar date. (e.g., \"2023-05-25\")) -- `datetime` (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\")) -- `time` (A specific time of day. (e.g., \"10:30:00\")) -- `year` (A specific year. (e.g., 2023) -- `yearmonth` (A specific year and month. (e.g., \"2023-05\")) -- `duration` (A length of time. (e.g., \"PT1H\") -- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278])) - -Possible values: - -- ``` - - number - - ``` -- ``` - - integer - - ``` -- ``` - - string - - ``` -- ``` - - any - - ``` -- ``` - - boolean - - ``` -- ``` - - date - - ``` -- ``` - - datetime - - ``` -- ``` - - time - - ``` -- ``` - - year - - ``` -- ``` - - yearmonth - - ``` -- ``` - - duration - - ``` -- ``` - - geopoint - - ``` - +Must be one of: `number`, `integer`, `string`, `any`, `boolean`, `date`, `datetime`, `time`, `year`, `yearmonth`, `duration`, `geopoint` **`format`** _(string)_ Indicates the format of the type specified in the `type` property. Each format is dependent on the `type` specified. -For example: If `type` is "string", then see the [String formats](https://specs.frictionlessdata.io/table-schema/#string). 
-If `type` is "date", "datetime", or "time", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for [Date](https://specs.frictionlessdata.io/table-schema/#date), -[Datetime](https://specs.frictionlessdata.io/table-schema/#datetime), -or [Time](https://specs.frictionlessdata.io/table-schema/#time)) - If you want to specify a date-like variable using standard Python/C strptime syntax, see [here](#format-details-for-date-datetime-time-type-variables) for details. -See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. - -[Additional information] - -Date Formats (date, datetime, time `type` variable): - -A format for a date variable (`date`,`time`,`datetime`). -**default**: An ISO8601 format string. -**any**: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies. - -**{PATTERN}**: The value can be parsed according to `{PATTERN}`, -which `MUST` follow the date formatting syntax of -C / Python [strftime](http://strftime.org/) such as: - -- "`%Y-%m-%d` (for date, e.g., 2023-05-25)" -- "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" -- "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" -- "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" -- "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" -- "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" -- "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" -- "`%H:%M:%S` (for time, e.g., 10:30:45)" -- "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" -- "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" - -String formats: - -- "`email` if valid emails (e.g., test@gmail.com)" -- "`uri` if valid uri addresses (e.g., 
https://example.com/resource123)" -- "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" -- "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" - - -Geopoint formats: - -The two types of formats for `geopoint` (describing a geographic point). - -- `array` (if 'lat,long' (e.g., 36.63,-90.20)) -- `object` (if {'lat':36.63,'lon':-90.20}) +See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) +for more information about appropriate `format` values by variable `type`. **`constraints.maxLength`** _(integer)_ @@ -317,39 +186,34 @@ Examples: ``` ``` - required|Yes|Y|Checked + Yes ``` -``` - Checked +**`falseValues`** _(string)_ + For boolean (false) variable (as defined in type field), this field allows +a physical string representation to be cast as false (increasing +readability of the field) that is not a standard false value. It can include one or more values. + +Examples: -``` ``` - Required + Not required| NOT REQUIRED ``` -**`falseValues`** _(string)_ - For boolean (false) variable (as defined in type field), this field allows -a physical string representation to be cast as false (increasing -readability of the field) that is not a standard false value. It can include one or more values. +``` + No +``` **`standardsMappings[0].instrument.url`** _(string)_ **`standardsMappings[0].instrument.source`** _(string)_ -Possible values: - -- ``` - - heal-cde - - ``` - +Must be one of: `heal-cde` **`standardsMappings[0].instrument.title`** _(string)_ @@ -403,3 +267,52 @@ Examples: **`relatedConcepts[0].id`** _(string)_ The id locating the individual mapping within the given source. + + +# End of schema - Additional Property information + +## `type` enum definitions: + +- `number` (A numeric value with optional decimal places. (e.g., 3.14)) +- `integer` (A whole number without decimal places. (e.g., 42)) +- `string` (A sequence of characters. 
(e.g., \"test\")) +- `any` (Any type of data is allowed. (e.g., true)) +- `boolean` (A binary value representing true or false. (e.g., true)) +- `date` (A specific calendar date. (e.g., \"2023-05-25\")) +- `datetime` (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\")) +- `time` (A specific time of day. (e.g., \"10:30:00\")) +- `year` (A specific year. (e.g., 2023) +- `yearmonth` (A specific year and month. (e.g., \"2023-05\")) +- `duration` (A length of time. (e.g., \"PT1H\") +- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278])) + +## `format` examples/definitions of patterns and possible values: + +Examples of date time pattern formats + +- "`%Y-%m-%d` (for date, e.g., 2023-05-25)" +- "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" +- "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" +- "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" +- "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" +- "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" +- "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" +- "`%H:%M:%S` (for time, e.g., 10:30:45)" +- "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" +- "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" + +Examples of string formats + +- "`email` if valid emails (e.g., test@gmail.com)" +- "`uri` if valid uri addresses (e.g., https://example.com/resource123)" +- "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" +- "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" + + +Examples of geopoint formats + +The two types of formats for `geopoint` (describing a geographic point). 
+ +- `array` (if 'lat,long' (e.g., 36.63,-90.20)) +- `object` (if {'lat':36.63,'lon':-90.20}) + diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index fd4bf47..455be18 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -1,14 +1,22 @@ -# Variable Level Metadata (Data Dictionaries) +# Variable Level Metadata (Data Dictionaries) _version 0.2.0_ This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries -### `title` _(string,required)_ +## `title` _(string,required)_ -### `description` _(string)_ +## `description` _(string)_ -### `version` _(string)_ +## `schemaVersion` _(string)_ +The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) -### `standardsMappings` _(array)_ +NOTE: This is NOT for versioning of each indiviual data dictionary instance. +Rather, it is the +version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance +version. + +## `version` _(string)_ +The specified individual data dictionary instance version. +## `standardsMappings` _(array)_ A set of standardized instruments linked to all variables within the `fields` property (but see note). !!! 
note "NOTE" @@ -21,19 +29,19 @@ A set of standardized instruments linked to all variables within the `fields` pr easier to understand in the same way other standards implement cascading (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field)) -### `fields` _(array,required)_ +## `fields` _(array,required)_ Variable level metadata individual fields integrated into the variable level metadata object within the HEAL platform metadata service. -!!! note "NOTE" +!!! note "Highly encouraged" Only `name` and `description` properties are required. - For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. + For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) -#### Properties for each record +### Properties for each `fields` record **`section`** _(string)_ The section, form, survey instrument, set of measures or other broad category used @@ -52,41 +60,28 @@ Examples: ``` -``` - Substance use - -``` - ``` Medical History ``` -``` - Sleep questions +**`name`** _(string,required)_ + The name of a variable (i.e., field) as it appears in the data. -``` +Examples: -``` - Physical activity ``` + gender_id -**`name`** _(string,required)_ - The name of a variable (i.e., field) as it appears in the data. - +``` **`title`** _(string)_ - The human-readable title or label of the variable. + The human-readable title or label of the variable. Examples: -``` - My Variable - -``` - ``` Gender identity @@ -112,131 +107,13 @@ Examples: **`type`** _(string)_ A classification or category of a particular data element or property expected or allowed in the dataset. 
-Definitions: - -- `number` (A numeric value with optional decimal places. (e.g., 3.14)) -- `integer` (A whole number without decimal places. (e.g., 42)) -- `string` (A sequence of characters. (e.g., \"test\")) -- `any` (Any type of data is allowed. (e.g., true)) -- `boolean` (A binary value representing true or false. (e.g., true)) -- `date` (A specific calendar date. (e.g., \"2023-05-25\")) -- `datetime` (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\")) -- `time` (A specific time of day. (e.g., \"10:30:00\")) -- `year` (A specific year. (e.g., 2023) -- `yearmonth` (A specific year and month. (e.g., \"2023-05\")) -- `duration` (A length of time. (e.g., \"PT1H\") -- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278])) - -Possible values: - -- ``` - - number - - ``` -- ``` - - integer - - ``` -- ``` - - string - - ``` -- ``` - - any - - ``` -- ``` - - boolean - - ``` -- ``` - - date - - ``` -- ``` - - datetime - - ``` -- ``` - - time - - ``` -- ``` - - year - - ``` -- ``` - - yearmonth - - ``` -- ``` - - duration - - ``` -- ``` - - geopoint - - ``` - +Must be one of: `number`, `integer`, `string`, `any`, `boolean`, `date`, `datetime`, `time`, `year`, `yearmonth`, `duration`, `geopoint` **`format`** _(string)_ Indicates the format of the type specified in the `type` property. Each format is dependent on the `type` specified. -For example: If `type` is "string", then see the [String formats](https://specs.frictionlessdata.io/table-schema/#string). 
-If `type` is "date", "datetime", or "time", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for [Date](https://specs.frictionlessdata.io/table-schema/#date), -[Datetime](https://specs.frictionlessdata.io/table-schema/#datetime), -or [Time](https://specs.frictionlessdata.io/table-schema/#time)) - If you want to specify a date-like variable using standard Python/C strptime syntax, see [here](#format-details-for-date-datetime-time-type-variables) for details. -See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. - -[Additional information] - -Date Formats (date, datetime, time `type` variable): - -A format for a date variable (`date`,`time`,`datetime`). -**default**: An ISO8601 format string. -**any**: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies. - -**{PATTERN}**: The value can be parsed according to `{PATTERN}`, -which `MUST` follow the date formatting syntax of -C / Python [strftime](http://strftime.org/) such as: - -- "`%Y-%m-%d` (for date, e.g., 2023-05-25)" -- "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" -- "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" -- "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" -- "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" -- "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" -- "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" -- "`%H:%M:%S` (for time, e.g., 10:30:45)" -- "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" -- "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" - -String formats: - -- "`email` if valid emails (e.g., test@gmail.com)" -- "`uri` if valid uri addresses (e.g., 
https://example.com/resource123)" -- "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" -- "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" - - -Geopoint formats: - -The two types of formats for `geopoint` (describing a geographic point). - -- `array` (if 'lat,long' (e.g., 36.63,-90.20)) -- `object` (if {'lat':36.63,'lon':-90.20}) +See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) +for more information about appropriate `format` values by variable `type`. **`constraints`** _(object)_ @@ -364,6 +241,20 @@ Examples: a physical string representation to be cast as false (increasing readability of the field) that is not a standard false value. It can include one or more values. +Examples: + + +```json + + ['Not required', 'NOT REQUIRED'] + +``` + +```json + + ['No'] + +``` **`standardsMappings`** _(array)_ @@ -415,7 +306,7 @@ __**Only Instrument ID of HEAL CDE Mapped**__ { "instrument": { "source": "heal-cde", - "id": + "id": \1020 } } ] @@ -466,3 +357,51 @@ Two separate records. If desired, multiple standard mappings can be entered, say __**[Under development]**__ Mappings to a published set of concepts related to the given field such as ontological information (eg., NCI thesaurus, bioportal etc) + +### Additional `fields` property information + +#### `type` enum definitions: + +- `number` (A numeric value with optional decimal places. (e.g., 3.14)) +- `integer` (A whole number without decimal places. (e.g., 42)) +- `string` (A sequence of characters. (e.g., \"test\")) +- `any` (Any type of data is allowed. (e.g., true)) +- `boolean` (A binary value representing true or false. (e.g., true)) +- `date` (A specific calendar date. (e.g., \"2023-05-25\")) +- `datetime` (A specific date and time, including timezone information. (e.g., \"2023-05-25T10:30:00Z\")) +- `time` (A specific time of day. (e.g., \"10:30:00\")) +- `year` (A specific year. 
(e.g., 2023) +- `yearmonth` (A specific year and month. (e.g., \"2023-05\")) +- `duration` (A length of time. (e.g., \"PT1H\") +- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278])) + +#### `format` examples/definitions of patterns and possible values: + +Examples of date time pattern formats + +- "`%Y-%m-%d` (for date, e.g., 2023-05-25)" +- "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" +- "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" +- "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" +- "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" +- "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" +- "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" +- "`%H:%M:%S` (for time, e.g., 10:30:45)" +- "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" +- "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" + +Examples of string formats + +- "`email` if valid emails (e.g., test@gmail.com)" +- "`uri` if valid uri addresses (e.g., https://example.com/resource123)" +- "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" +- "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" + + +Examples of geopoint formats + +The two types of formats for `geopoint` (describing a geographic point). 
+ +- `array` (if 'lat,long' (e.g., 36.63,-90.20)) +- `object` (if {'lat':36.63,'lon':-90.20}) + diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 7bab207..a438e71 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -1,6 +1,6 @@ { "version": "0.2.0", - "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"NOTE\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. 
date-like variables)\n", "title": "HEAL Variable Level Metadata Fields", "fields": [ { @@ -55,18 +55,18 @@ "type": "string", "constraints": { "enum": [ + "string", + "duration", "geopoint", - "year", - "integer", "any", - "string", - "datetime", + "integer", + "yearmonth", "number", + "datetime", + "time", "boolean", - "date", - "yearmonth", - "duration", - "time" + "year", + "date" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index fe627ba..93717bc 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -1,8 +1,6 @@ { - "$schema": "http://json-schema.org/draft-04/schema#", - "$id": "vlmd-fields", "title": "HEAL Variable Level Metadata Fields", - "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"NOTE\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. 
\n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", "required": [ "name", @@ -16,23 +14,22 @@ "examples": [ "Demographics", "PROMIS", - "Substance use", - "Medical History", - "Sleep questions", - "Physical activity" + "Medical History" ] }, "name": { "type": "string", "title": "Variable Name", - "description": "The name of a variable (i.e., field) as it appears in the data. \n" + "description": "The name of a variable (i.e., field) as it appears in the data. \n", + "examples": [ + "gender_id" + ] }, "title": { "type": "string", "title": "Variable Label (ie Title)", - "description": "The human-readable title or label of the variable. \n", + "description": "The human-readable title or label of the variable.\n", "examples": [ - "My Variable", "Gender identity" ] }, @@ -48,7 +45,8 @@ "type": { "title": "Variable Type", "type": "string", - "description": "A classification or category of a particular data element or property expected or allowed in the dataset.\n\nDefinitions:\n\n- `number` (A numeric value with optional decimal places. (e.g., 3.14))\n- `integer` (A whole number without decimal places. (e.g., 42))\n- `string` (A sequence of characters. (e.g., \\\"test\\\"))\n- `any` (Any type of data is allowed. (e.g., true))\n- `boolean` (A binary value representing true or false. (e.g., true))\n- `date` (A specific calendar date. (e.g., \\\"2023-05-25\\\"))\n- `datetime` (A specific date and time, including timezone information. (e.g., \\\"2023-05-25T10:30:00Z\\\"))\n- `time` (A specific time of day. (e.g., \\\"10:30:00\\\"))\n- `year` (A specific year. (e.g., 2023)\n- `yearmonth` (A specific year and month. (e.g., \\\"2023-05\\\"))\n- `duration` (A length of time. (e.g., \\\"PT1H\\\")\n- `geopoint` (A pair of latitude and longitude coordinates. 
(e.g., [51.5074, -0.1278]))\n", + "description": "A classification or category of a particular data element or property expected or allowed in the dataset.\n", + "additionalDescription": "enum definitions:\n\n- `number` (A numeric value with optional decimal places. (e.g., 3.14))\n- `integer` (A whole number without decimal places. (e.g., 42))\n- `string` (A sequence of characters. (e.g., \\\"test\\\"))\n- `any` (Any type of data is allowed. (e.g., true))\n- `boolean` (A binary value representing true or false. (e.g., true))\n- `date` (A specific calendar date. (e.g., \\\"2023-05-25\\\"))\n- `datetime` (A specific date and time, including timezone information. (e.g., \\\"2023-05-25T10:30:00Z\\\"))\n- `time` (A specific time of day. (e.g., \\\"10:30:00\\\"))\n- `year` (A specific year. (e.g., 2023)\n- `yearmonth` (A specific year and month. (e.g., \\\"2023-05\\\"))\n- `duration` (A length of time. (e.g., \\\"PT1H\\\")\n- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))\n", "enum": [ "number", "integer", @@ -67,7 +65,8 @@ "format": { "title": "Variable Format", "type": "string", - "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. \nFor example: If `type` is \"string\", then see the [String formats](https://specs.frictionlessdata.io/table-schema/#string). \nIf `type` is \"date\", \"datetime\", or \"time\", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for [Date](https://specs.frictionlessdata.io/table-schema/#date),\n[Datetime](https://specs.frictionlessdata.io/table-schema/#datetime), \nor [Time](https://specs.frictionlessdata.io/table-schema/#time)) - If you want to specify a date-like variable using standard Python/C strptime syntax, see [here](#format-details-for-date-datetime-time-type-variables) for details. 
\nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. \n\n[Additional information]\n\nDate Formats (date, datetime, time `type` variable):\n\nA format for a date variable (`date`,`time`,`datetime`). \n**default**: An ISO8601 format string.\n**any**: Any parsable representation of a date/time/datetime. The implementing library can attempt to parse the datetime via a range of strategies.\n\n**{PATTERN}**: The value can be parsed according to `{PATTERN}`,\nwhich `MUST` follow the date formatting syntax of \nC / Python [strftime](http://strftime.org/) such as:\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nString formats:\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nGeopoint formats:\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" + "description": "Indicates the format of the type 
specified in the `type` property. \nEach format is dependent on the `type` specified. \nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) \nfor more information about appropriate `format` values by variable `type`.\n", + "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nExamples of string formats\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" }, "constraints.maxLength": { "type": "integer", @@ -130,16 +129,18 @@ "description": "For boolean (true) variable (as defined in type field), this field allows\na physical string representation to be cast as true (increasing\nreadability of the field). 
It can include one or more values.\n", "examples": [ "Required|REQUIRED", - "required|Yes|Y|Checked", - "Checked", - "Required" + "Yes" ] }, "falseValues": { "title": "Boolean False Value Labels", "description": "For boolean (false) variable (as defined in type field), this field allows\na physical string representation to be cast as false (increasing\nreadability of the field) that is not a standard false value. It can include one or more values.\n", "type": "string", - "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" + "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$", + "examples": [ + "Not required| NOT REQUIRED", + "No" + ] }, "standardsMappings[0].instrument.url": { "type": "string", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 72a080e..e679998 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -16,8 +16,14 @@ "description": { "type": "string" }, + "schemaVersion": { + "type": "string", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n", + "pattern": "\\d+\\.\\d+\\.\\d+" + }, "version": { - "type": "string" + "type": "string", + "description": "The specified individual data dictionary instance version." 
}, "standardsMappings": { "type": "array", @@ -54,10 +60,8 @@ "fields": { "type": "array", "items": { - "$schema": "http://json-schema.org/draft-04/schema#", - "$id": "vlmd-fields", "title": "HEAL Variable Level Metadata Fields", - "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"NOTE\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `encodings` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", "required": [ "name", @@ -71,23 +75,22 @@ "examples": [ "Demographics", "PROMIS", - "Substance use", - "Medical History", - "Sleep questions", - "Physical activity" + "Medical History" ] }, "name": { "type": "string", "title": "Variable Name", - "description": "The name of a variable (i.e., field) as it appears in the data. \n" + "description": "The name of a variable (i.e., field) as it appears in the data. 
\n", + "examples": [ + "gender_id" + ] }, "title": { "type": "string", "title": "Variable Label (ie Title)", - "description": "The human-readable title or label of the variable. \n", + "description": "The human-readable title or label of the variable.\n", "examples": [ - "My Variable", "Gender identity" ] }, @@ -103,7 +106,8 @@ "type": { "title": "Variable Type", "type": "string", - "description": "A classification or category of a particular data element or property expected or allowed in the dataset.\n\nDefinitions:\n\n- `number` (A numeric value with optional decimal places. (e.g., 3.14))\n- `integer` (A whole number without decimal places. (e.g., 42))\n- `string` (A sequence of characters. (e.g., \\\"test\\\"))\n- `any` (Any type of data is allowed. (e.g., true))\n- `boolean` (A binary value representing true or false. (e.g., true))\n- `date` (A specific calendar date. (e.g., \\\"2023-05-25\\\"))\n- `datetime` (A specific date and time, including timezone information. (e.g., \\\"2023-05-25T10:30:00Z\\\"))\n- `time` (A specific time of day. (e.g., \\\"10:30:00\\\"))\n- `year` (A specific year. (e.g., 2023)\n- `yearmonth` (A specific year and month. (e.g., \\\"2023-05\\\"))\n- `duration` (A length of time. (e.g., \\\"PT1H\\\")\n- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))\n", + "description": "A classification or category of a particular data element or property expected or allowed in the dataset.\n", + "additionalDescription": "enum definitions:\n\n- `number` (A numeric value with optional decimal places. (e.g., 3.14))\n- `integer` (A whole number without decimal places. (e.g., 42))\n- `string` (A sequence of characters. (e.g., \\\"test\\\"))\n- `any` (Any type of data is allowed. (e.g., true))\n- `boolean` (A binary value representing true or false. (e.g., true))\n- `date` (A specific calendar date. (e.g., \\\"2023-05-25\\\"))\n- `datetime` (A specific date and time, including timezone information. 
(e.g., \\\"2023-05-25T10:30:00Z\\\"))\n- `time` (A specific time of day. (e.g., \\\"10:30:00\\\"))\n- `year` (A specific year. (e.g., 2023)\n- `yearmonth` (A specific year and month. (e.g., \\\"2023-05\\\"))\n- `duration` (A length of time. (e.g., \\\"PT1H\\\")\n- `geopoint` (A pair of latitude and longitude coordinates. (e.g., [51.5074, -0.1278]))\n", "enum": [ "number", "integer", @@ -122,7 +126,8 @@ "format": { "title": "Variable Format", "type": "string", - "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. \nFor example: If `type` is \"string\", then see the [String formats](https://specs.frictionlessdata.io/table-schema/#string). \nIf `type` is \"date\", \"datetime\", or \"time\", default format is ISO8601 formatting for those respective types (see details on ISO8601 format for [Date](https://specs.frictionlessdata.io/table-schema/#date),\n[Datetime](https://specs.frictionlessdata.io/table-schema/#datetime), \nor [Time](https://specs.frictionlessdata.io/table-schema/#time)) - If you want to specify a date-like variable using standard Python/C strptime syntax, see [here](#format-details-for-date-datetime-time-type-variables) for details. \nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. \n\n[Additional information]\n\nDate Formats (date, datetime, time `type` variable):\n\nA format for a date variable (`date`,`time`,`datetime`). \n**default**: An ISO8601 format string.\n**any**: Any parsable representation of a date/time/datetime. 
The implementing library can attempt to parse the datetime via a range of strategies.\n\n**{PATTERN}**: The value can be parsed according to `{PATTERN}`,\nwhich `MUST` follow the date formatting syntax of \nC / Python [strftime](http://strftime.org/) such as:\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nString formats:\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nGeopoint formats:\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" + "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. 
\nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) \nfor more information about appropriate `format` values by variable `type`.\n", + "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nExamples of string formats\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" }, "constraints": { "type": "object", @@ -228,11 +233,20 @@ "falseValues": { "title": "Boolean False Value Labels", "description": "For boolean (false) variable (as defined in type field), this field allows\na physical string representation to be cast as false (increasing\nreadability of the field) that is not a standard false value. 
It can include one or more values.\n", - "type": "array" + "type": "array", + "examples": [ + [ + "Not required", + "NOT REQUIRED" + ], + [ + "No" + ] + ] }, "standardsMappings": { "type": "array", - "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", + "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \\1020\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. 
Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "items": { "type": "object", "properties": { diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 59df3a8..2de1e47 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -2,6 +2,7 @@ { "title": null, "description": null, + "schemaVersion": null, "version": null, "standardsMappings": [ { From f8a27593c5174213395609483d12c2ca78e708d1 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 3 Jan 2024 17:25:52 -0600 Subject: [PATCH 22/72] make example in description valid (ie a string instead of int) --- .../schemas/dictionary/definitions.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 7854d9c..6a58eed 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -106,7 +106,7 @@ fieldStandardsMappingsItem: { "instrument": { "source": "heal-cde", - "id": \1020 + "id": "1020" } } ] From e840e62a066ea4cf620627d113854d18bdd22928 Mon Sep 17 00:00:00 2001 
From: Michael Kranz Date: Wed, 3 Jan 2024 17:26:08 -0600 Subject: [PATCH 23/72] build based on update --- .../jsonschema-csvtemplate-fields.html | 2 +- .../jsonschema-jsontemplate-data-dictionary.html | 4 ++-- .../jsonschema-jsontemplate-data-dictionary.md | 2 +- .../schemas/frictionless/csvtemplate/fields.json | 14 +++++++------- .../schemas/jsonschema/data-dictionary.json | 2 +- 5 files changed, 12 insertions(+), 12 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 66b0424..cc61b42 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -17,4 +17,4 @@
"No"
 

Type: string Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index b24cff2..9efc379 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -79,7 +79,7 @@ { "instrument": { "source": "heal-cde", - "id": \1020 + "id": "1020" } } ] @@ -111,4 +111,4 @@ ]

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 455be18..ef265d9 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -306,7 +306,7 @@ __**Only Instrument ID of HEAL CDE Mapped**__ { "instrument": { "source": "heal-cde", - "id": \1020 + "id": "1020" } } ] diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index a438e71..e95a661 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -55,18 +55,18 @@ "type": "string", "constraints": { "enum": [ + "integer", "string", + "date", + "time", "duration", - "geopoint", - "any", - "integer", + "datetime", "yearmonth", "number", - "datetime", - "time", - "boolean", + "geopoint", "year", - "date" + "boolean", + "any" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index e679998..8300c72 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -246,7 +246,7 @@ }, "standardsMappings": { "type": "array", - "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. 
One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \\1020\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", + "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \"1020\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). 
Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "items": { "type": "object", "properties": { From df7d96ab24180da5a8e8aceb985ac665148d4d8e Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 10:11:46 -0600 Subject: [PATCH 24/72] init test for gh admonitions --- variable-level-metadata-schema/README.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 405bc74..72f1aeb 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -1,8 +1,6 @@ # Variable level metadata -This metadata directory contains the specifications for variable level metadata submissions to the -HEAL platform in addition to variable level metadata templates in CSV format and the associated code -converting this template to its validated json format. +This metadata directory contains the specifications for variable level metadata documents. ## Workflow @@ -29,4 +27,10 @@ To contribute to the variable level metadata, please modify the `dictionary/*.ya ## Considerations -Please use github issues for any additional considerations. See additional comments above. 
\ No newline at end of file +Please use github issues for any additional considerations. See additional comments above. + +:::warning + + This is a warning admonition. + +::: \ No newline at end of file From 6259272fd75e0e4ab1ad54fea26785b7de4a02cf Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 13:33:18 -0600 Subject: [PATCH 25/72] Update vlmd README.md --- variable-level-metadata-schema/README.md | 61 +++++++++++++++++++----- 1 file changed, 50 insertions(+), 11 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 72f1aeb..6659030 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -1,36 +1,75 @@ # Variable level metadata -This metadata directory contains the specifications for variable level metadata documents. +This metadata directory contains the specifications for variable level metadata documents in the HEAL data ecosystem. +## Schemas +There are three categories of schemas important in this directory: + +### json data dictionary format specification +1. `schemas/jsonschema/data-dictionary.json`: The "json" json data dictionary schema (ie json template schema) + - Intended to specify the data dictionary representation of json objects available in the HEAL platform metadata-service. + +### csv field format specifications + +2. `schemas/frictionless/fields.json` Table schema (previously known as "frictionless") standard specification + - This json file is intended to represent csv data dictionary documents following the [Table Schema specification](https://specs.frictionlessdata.io/table-schema/). + - Csv version is intended to make data dictionary creation and discovery available in a more familiar/human readable format, + - The representation of data dictionary field values in a csv file. It's used to facilitate documentation of data dictionary csv + files in addition to input validation. +3. 
`schemas/jsontemplate/fields.json`The "csv" json schema (ie csv template schema) + - :warning: The "csv" json schema is intended to be an intermediate specification used for documentation and in translation workflows to the json schema template. As fully specifying a tabular file (for example missing value specification) is out of scope here (see the table schema representation in (2)) ## Workflow The `schemas/dictionary` directory contains a comprehensive json schema with fields for + + ## Directories - `docs`: See the rendered human readable schemas in a markdown format and an interactive html format. -- `schemas/jsonschema`: `data_dictionary.json` contains the final and full specification. -- `schemas/frictionless/csvtemplate`: contains schemas following the frictionless schema specifications. `fields.json` contains the frictionless Table Schema descriptor that validates a tabular heal templated csv data dictionary. See [here](https://specs.frictionlessdata.io/table-schema/) for the specification. **NOTE: the `csvtemplate` is an intermediate format meant to be converted into the final `jsontemplate` format. -- `schemas/dictionary`: the yaml files used to generate json schemas with build.py. Fields with `jsonSpec` and `csvSpec` keys to indicate which property to extract in the `build.py` script. +- `schemas/jsonschema`: contains the final and full specification. +- `schemas/frictionless`: contains schemas following the frictionless schema specifications. `fields.json` contains the frictionless Table Schema descriptor that validates a tabular heal templated csv data dictionary. See [here](https://specs.frictionlessdata.io/table-schema/) for the specification. **NOTE: the `csvtemplate` is an intermediate format meant to be converted into the final `jsontemplate` format. +- `schemas/dictionary`: the yaml files used to generate json schemas and documentation with build.py. - `templates`: empty templates in csv spreadsheet format and JSON format. 
-- `examples`: the ~~(filled out)~~ templates in csv spreadsheet format and JSON format. +- `examples`: examples of filled out templates in csv spreadsheet format and JSON format. TO BE ADDED: for now, see https://github.com/norc-heal/healdata-utils/tree/main/tests/data/valid/output - `build.py`: This script compiles the yaml files and generates associated jsonschemas and frictionless schemas in addition to the human rendered schemas ## Contributing -To contribute to the variable level metadata, please modify the `dictionary/*.yaml` files directly. For example, if you want to add/modify an example, description, etc for either the JSON or CSV spec, then do so here. +To contribute to the variable level metadata specification (and annotations/examples/documentation), please modify the `dictionary/*.yaml` files directly. +❗ Please read the below conventions and principles before contributing and review the existing `dictionary` directory. -## Considerations -Please use github issues for any additional considerations. See additional comments above. +## Conventions, principles, and rules + +### Annotation/documentation properties +1. `description`: SHOULD be created as markdown syntax without any headers as headers are applied in the templates. -:::warning +2. `additionalDescription`: SHOULD be added if there are additional documentation "footer" details. In rendering the documentation, these are appended to the end of the rendered markdown document. - This is a warning admonition. +### `type` conversion rules +Given csv field values can only be scalar values with records separated by a new line and each individual field value separated by a comma delimiter, the following rules and restrictions are applied to allow json to csv specification translation. + +1. 
type `object` + - converted to type `string` with pattern of `^(?:.*?=.*?(?:\||$))+$` to indicate a stringified object with a equal sign (`=`) connecting the key-value pair and a pipe (`|`) delimiter separating unique key-value pairs. +2. type `array` + - if type `object` in `items`: flattened to the children property or properties + - if type is a scalar (`string`,`integer`,`number`) in `items`, + translated to type `string` with pattern `^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$` to indicate a string containing a pipe delimiter (i.e., a stringified array with a pipe delimiter) + +### Complex `type` restrictions + +1. Currently, no complex types (`anyOf`,`oneOf`) and the `type` MUST be specified. +2. `enum` restrictions + - following from (1), an `enum` must only contain values of the same type + - (at least currently) MUST contain only scalar types (`string`,`integer`,`number`) + + +## Considerations -::: \ No newline at end of file +Please use github issues for any additional considerations. See additional comments above. \ No newline at end of file From 3d2c7f18c2089e7066865d240e68414a608ee086 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 13:33:46 -0600 Subject: [PATCH 26/72] remove Csv and json spec suffix given established translation pattern --- .../schemas/dictionary/fields.yaml | 74 ++----------------- 1 file changed, 5 insertions(+), 69 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index ffb304a..c9429e1 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -135,24 +135,10 @@ properties: object). For example, if 'Hello World' is the longest value of a categorical variable, this would be a maxLength of 11. - enumJsonSpec: + enum: title: Variable Possible Values description: | Constrains possible values to a set of values. 
- - type: array - examples: - - [1,2,3,4] - - ["White","Black or African American","American Indian or Alaska Native","Native Hawaiian or Other Pacific Islander","Asian","Some other race","Multiracial"] - enumCsvSpec: - title: Variable Possible Values - description: | - Constrains possible values to a set of values. - - $ref: "#/definitions/csvArray" - examples: - - 1|2|3|4|5|6|7|8 - - White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial pattern: type: string title: Regular Expression Pattern @@ -173,7 +159,7 @@ properties: description: | Specifies the minimum value of a field. - enumLabelsJsonSpec: + enumLabels: title: 'Variable Value Encodings (i.e., mappings; value labels)' description: | Variable value encodings provide a way to further annotate any value within a any variable type, @@ -192,25 +178,6 @@ properties: examples: - {"0":"No","1":"Yes"} - {"HW":"Hello world","GBW":"Good bye world","HM":"Hi, Mike"} - enumLabelsCsvSpec: - title: 'Variable Value Encodings (i.e., mappings; value labels)' - description: | - Variable value encodings provide a way to further annotate any value within a any variable type, - making values easier to understand. - - - Many analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms - only support numerical values. Encodings (and mappings) allow categorical values to be stored as - numerical values. - - Additionally, as another use case, this field provides a way to - store categoricals that are stored as "short" labels (such as - abbreviations). - - $ref: "#/definitions/csvObject" - examples: - - '0=No|1=Yes' - - 'HW=Hello world|GBW=Good bye world|HM=Hi,Mike' enumOrdered: title: An ordered variable description: | @@ -220,7 +187,7 @@ properties: < Neutral < Agree). type: boolean - missingValuesJsonSpec: + missingValues: title: Missing Values description: | A list of missing values specific to a variable. 
@@ -231,17 +198,7 @@ properties: - ["Missing","Skipped","No preference"] - ["Missing"] type: array - missingValuesCsvSpec: - title: Missing Values - description: | - A list of missing values specific to a variable. - - examples: - - - Missing|Skipped|No preference - - Missing - $ref: "#/definitions/csvArray" - trueValuesJsonSpec: + trueValues: title: Boolean True Value Labels description: | For boolean (true) variable (as defined in type field), this field allows @@ -254,18 +211,7 @@ properties: examples: - ["required","Yes","Checked"] - ["required"] - - trueValuesCsvSpec: - $ref: "#/definitions/csvArray" - description: | - For boolean (true) variable (as defined in type field), this field allows - a physical string representation to be cast as true (increasing - readability of the field). It can include one or more values. - - examples: - - Required|REQUIRED - - "Yes" - falseValuesJsonSpec: + falseValues: title: Boolean False Value Labels description: | For boolean (false) variable (as defined in type field), this field allows @@ -275,16 +221,6 @@ properties: examples: - ["Not required","NOT REQUIRED"] - ["No"] - falseValuesCsvSpec: - title: Boolean False Value Labels - description: | - For boolean (false) variable (as defined in type field), this field allows - a physical string representation to be cast as false (increasing - readability of the field) that is not a standard false value. It can include one or more values. 
- $ref: "#/definitions/csvArray" - examples: - - Not required| NOT REQUIRED - - "No" standardsMappings: $ref: "#/definitions/fieldStandardsMappingsItem" relatedConcepts: From e47531c897c18fd7d98afc294f96b1ccc92b6abc Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 17:57:45 -0600 Subject: [PATCH 27/72] WIP documentation on conventions and rules --- variable-level-metadata-schema/README.md | 89 +++++++++++++++++++++--- 1 file changed, 81 insertions(+), 8 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 6659030..c6ab0c4 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -3,7 +3,8 @@ This metadata directory contains the specifications for variable level metadata documents in the HEAL data ecosystem. ## Schemas -There are three categories of schemas important in this directory: + +❗ Look here for schema specifications. ### json data dictionary format specification 1. `schemas/jsonschema/data-dictionary.json`: The "json" json data dictionary schema (ie json template schema) @@ -19,29 +20,69 @@ There are three categories of schemas important in this directory: 3. `schemas/jsontemplate/fields.json`The "csv" json schema (ie csv template schema) - :warning: The "csv" json schema is intended to be an intermediate specification used for documentation and in translation workflows to the json schema template. 
As fully specifying a tabular file (for example missing value specification) is out of scope here (see the table schema representation in (2)) -## Workflow +## Document flow chart + +```mermaid + +%%{init: {"flowchart": {"defaultRenderer": "elk","htmlLabels": false}} }%% +flowchart TD + subgraph "/schemas" + subgraph dictionary[Dictionary YAML files] + + defs["/dictionary/definitions.yaml"] + fields["/dictionary/fields.yaml"] + dd["/dictionary/data-dictionary.yaml"] + end + + subgraph Schema specifications + + jsonspec["/jsontemplate/data-dictionary.json"] + csvspec["/jsontemplate/csvtemplate/fields.json"] + csvtblspec["/frictionless/csvtemplate/fields.json"] + end + end -The `schemas/dictionary` directory contains a comprehensive json schema with fields for + subgraph /docs + subgraph "Rendered schema documentation \n(html also available)" + csvmd["/docs/\nmd-rendered-schemas/\njsonschema-csvtemplate-fields.md"] + jsonmd["/docs/\nmd-rendered-schemas/\njsonschema-jsontemplate-data-dictionary.md"] + end + end + + defs --> fields --> dd + defs --> dd + + fields --> csvspec --> csvtblspec + dd --> jsonspec + + csvspec --> csvmd + jsonspec --> jsonmd + +``` ## Directories - `docs`: See the rendered human readable schemas in a markdown format and an interactive html format. -- `schemas/jsonschema`: contains the final and full specification. -- `schemas/frictionless`: contains schemas following the frictionless schema specifications. `fields.json` contains the frictionless Table Schema descriptor that validates a tabular heal templated csv data dictionary. See [here](https://specs.frictionlessdata.io/table-schema/) for the specification. **NOTE: the `csvtemplate` is an intermediate format meant to be converted into the final `jsontemplate` format. +- `schemas/jsonschema`: contains the final and full specification for schemas following json schema. +- `schemas/frictionless`: contains schemas following the frictionless table schema specifications. 
See [here](https://specs.frictionlessdata.io/table-schema/) for the specification. - `schemas/dictionary`: the yaml files used to generate json schemas and documentation with build.py. - `templates`: empty templates in csv spreadsheet format and JSON format. - `examples`: examples of filled out templates in csv spreadsheet format and JSON format. - TO BE ADDED: for now, see https://github.com/norc-heal/healdata-utils/tree/main/tests/data/valid/output - `build.py`: This script compiles the yaml files and generates associated jsonschemas and frictionless schemas in addition to the human rendered schemas ## Contributing To contribute to the variable level metadata specification (and annotations/examples/documentation), please modify the `dictionary/*.yaml` files directly. +1. Update the dictionary/*.yaml files +2. Run `build.py` script +3. Check output is correct (see above) +4. When satisfied, push to github and ensure it passes validation (ie commit has ✔️ and not ❌) + ❗ Please read the below conventions and principles before contributing and review the existing `dictionary` directory. @@ -61,13 +102,45 @@ Given csv field values can only be scalar values with records separated by a new - if type `object` in `items`: flattened to the children property or properties - if type is a scalar (`string`,`integer`,`number`) in `items`, translated to type `string` with pattern `^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$` to indicate a string containing a pipe delimiter (i.e., a stringified array with a pipe delimiter) +### `property` name conversion rules +To facilitate the mapping of json spec property names to csv property names, the resulting flattened `property` names from the flattened properties should correspond to the jsonpath representation where: + +1. type `object` + + ```json + + "constraints": { + "type": "object", + "properties": { + "maxLength": { + "type": "integer"} + } + } + ``` + + flattens to: + ```json + + "constraints.maxLength":{"type":"integer"} + + ``` +2. 
type `array` + + ```json + + ``` + + flattens to: + ```json + + ``` ### Complex `type` restrictions -1. Currently, no complex types (`anyOf`,`oneOf`) and the `type` MUST be specified. +1. Currently, no complex types (`anyOf`,`oneOf`) are supported and the `type` MUST be specified. This is to ensure coverage for all csv to json translation use cases. 2. `enum` restrictions - following from (1), an `enum` must only contain values of the same type - - (at least currently) MUST contain only scalar types (`string`,`integer`,`number`) + - (at least currently) MUST contain only types supported by csv fields which include scalar types (`string`,`integer`,`number`) in addition to type `object` as this has a stringified representation (see above). ## Considerations From f5358514ef20a6f48a7f633a35acfca743bcbf33 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 19:16:48 -0600 Subject: [PATCH 28/72] additional json to csv property name example --- variable-level-metadata-schema/README.md | 26 ++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index c6ab0c4..def4cbb 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -127,12 +127,34 @@ To facilitate the mapping of json spec property names to csv property names, th 2. 
type `array` ```json + { "..more props..":"...", + "standardsMappings": { + "type": "array", + "items": { + "type": "object", + "properties": { + "instrument": { + "type": "object", + "properties": { + "url": { + "type": "string", + "format": "uri" + }, + "..more props..":"..."} + }, + "..more props..":"..."} + }}} ``` - flattens to: + ```json - + { "..more props..":"...", + "standardsMappings[0].instrument.url": { + "type": "string", + "format": "uri" + } + } ``` ### Complex `type` restrictions From c397ba22030f758bba710b501d10275ddefe07b2 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 19:38:25 -0600 Subject: [PATCH 29/72] slight formatting updates to templates --- .../docs/assets/templates/csvtemplate.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md index 087bdeb..7d403f6 100644 --- a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md @@ -1,15 +1,16 @@ # {{ schema.title }} _version {{ schema.version }}_ + {{ schema.description }} -## Properties +## Properties (i.e., fields or variables) {% for itemname,item in schema.properties.items() %} {% include 'properties.md' %} {% endfor %} -# End of schema - Additional Property information +## End of schema - Additional Property information {% for itemname,item in schema['properties'].items() %} {% if 'additionalDescription' in item %} From 9fdac53592153313fe828abcbad5848f2f09cef3 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 4 Jan 2024 19:48:41 -0600 Subject: [PATCH 30/72] WIP init function for json to csv translation with new convention-based method --- variable-level-metadata-schema/build.py | 43 +++---------------------- 1 file changed, 5 insertions(+), 38 deletions(-) diff --git a/variable-level-metadata-schema/build.py 
b/variable-level-metadata-schema/build.py index ee0836b..37a67d3 100644 --- a/variable-level-metadata-schema/build.py +++ b/variable-level-metadata-schema/build.py @@ -31,41 +31,9 @@ def load_all_yamls(directory="schemas/dictionary"): filepaths = Path(directory).glob("*.yaml") return {filepath.stem: load_yaml(filepath) for filepath in filepaths} - -def select_specs(schema, specsuffix="CsvSpec"): - """ - select given specification type and remove other specification types. - These are denoted with the suffix (eg encodingsCsvSpec) in property name - - This function is useful when building multiple versions of schemas - conditional on the type of specificaiton (eg csv tabular data vs. json - for a workflow that may except csv that is translated into the json file.) - - """ - # loop through schema - schema_selected = {} - for key, item in schema.items(): - if re.search(f"{specsuffix}$", key): - newkey = key.replace(specsuffix, "") - schema_selected[newkey] = item - elif re.search("Spec$", key): - pass - elif isinstance(item, MutableMapping): - schema_selected[key] = select_specs(item, specsuffix) - else: - schema_selected[key] = item - return schema_selected - - -# resolve refs (and select type of schema spec) - -def get_ref(path,schema): - pass - -# loop through all iterables in a dictionary -# if key = $ref --> get_ref - - +def to_csv_spec(schema,*args,**kwargs): + # see the flatten_schema (and properties) function as it will use similar pattern + pass def resolve_refs(items, schema, parentkey=False): """ @@ -264,7 +232,7 @@ def generate_template(schema): # compile frictionless schema fields dictionary = load_all_yamls() csv_pipeline = [ - (select_specs, {"specsuffix": "CsvSpec"}), + (to_csv_spec, None), # recursive fxn so need to grab items from overall dictionary for json paths (resolve_refs, {"schema": dictionary}), # no longer need the definitons as they have been resolved @@ -281,7 +249,7 @@ def generate_template(schema): # compile json schema fields 
csv_pipeline = [ - (select_specs, {"specsuffix": "CsvSpec"}), + (to_csv_spec, None), # recursive fxn so need to grab items from overall dictionary for json paths (resolve_refs, {"schema": dictionary}), # no longer need the definitons as they have been resolved @@ -295,7 +263,6 @@ def generate_template(schema): json_pipeline = [ # recursive fxn so need to grab items from overall dictionary for json paths (resolve_refs, {"schema": dictionary}), - (select_specs, {"specsuffix": "JsonSpec"}), # no longer need the definitons as they have been resolved (lambda _schema: _schema["data-dictionary"], None), (lambda _schema: {"version":versions["vlmd"],**_schema},None) From 4c6b0126ea308a3575f31f691cdc7903d6758baf Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 09:29:27 -0600 Subject: [PATCH 31/72] fix: missing type for enum --- variable-level-metadata-schema/schemas/dictionary/fields.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index c9429e1..3aba3b3 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -136,6 +136,7 @@ properties: categorical variable, this would be a maxLength of 11. enum: + type: array title: Variable Possible Values description: | Constrains possible values to a set of values. 
From 045dfee89f82ea39e1d4cbc6af3b279ded0800f3 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 09:30:04 -0600 Subject: [PATCH 32/72] add enumLabel and enumOrder official pattern examples --- .../schemas/dictionary/fields.yaml | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 3aba3b3..1908d5d 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -140,6 +140,9 @@ properties: title: Variable Possible Values description: | Constrains possible values to a set of values. + examples: + - [1,2,3,4,5] + - ["Poor","Fair","Good","Very good","Excellent"] pattern: type: string title: Regular Expression Pattern @@ -175,9 +178,11 @@ properties: store categoricals that are stored as "short" labels (such as abbreviations). + This field is intended to follow [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering) + type: object examples: - - {"0":"No","1":"Yes"} + - {"1":"Poor","2":"Fair","3":"Good","4":"Very good","5":"Excellent"} - {"HW":"Hello world","GBW":"Good bye world","HM":"Hi, Mike"} enumOrdered: title: An ordered variable @@ -187,13 +192,14 @@ properties: necessarily a numerical relationship (e.g., Strongly disagree < Disagree < Neutral < Agree). + This field is intended to follow the ordering aspect of [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering) + type: boolean missingValues: title: Missing Values description: | A list of missing values specific to a variable. 
- examples: - ["Missing","Skipped","No preference"] From 4e05314a9c3aaca8016ee83dc261145bb14de2fc Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 09:30:49 -0600 Subject: [PATCH 33/72] add boolean to list of scalars in README --- variable-level-metadata-schema/README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index def4cbb..41e1490 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -160,9 +160,11 @@ To facilitate the mapping of json spec property names to csv property names, th ### Complex `type` restrictions 1. Currently, no complex types (`anyOf`,`oneOf`) are supported and the `type` MUST be specified. This is to ensure coverage for all csv to json translation use cases. + - Each json specification schema property type must be a scalar (e.g., `boolean`,`string`,`integer`,`number`), an `array`, or an `object` + - Each csv specification schema property type must be a scalar (e.g., `boolean`,`string`,`integer`,`number`) 2. `enum` restrictions - following from (1), an `enum` must only contain values of the same type - - (at least currently) MUST contain only types supported by csv fields which include scalar types (`string`,`integer`,`number`) in addition to type `object` as this has a stringified representation (see above). + - (at least currently) MUST contain only types supported by csv fields which include scalar types (e.g., `boolean`,`string`,`integer`,`number`) in addition to type `object` as this has a stringified representation (see above). 
## Considerations From 72999c2f3eb6e8f765e1febf310a09e141544a0c Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 09:52:51 -0600 Subject: [PATCH 34/72] finish implementation of csv translation rules --- variable-level-metadata-schema/build.py | 45 +++++++++++++++++++++---- 1 file changed, 39 insertions(+), 6 deletions(-) diff --git a/variable-level-metadata-schema/build.py b/variable-level-metadata-schema/build.py index 37a67d3..bebc1dd 100644 --- a/variable-level-metadata-schema/build.py +++ b/variable-level-metadata-schema/build.py @@ -31,10 +31,6 @@ def load_all_yamls(directory="schemas/dictionary"): filepaths = Path(directory).glob("*.yaml") return {filepath.stem: load_yaml(filepath) for filepath in filepaths} -def to_csv_spec(schema,*args,**kwargs): - # see the flatten_schema (and properties) function as it will use similar pattern - pass - def resolve_refs(items, schema, parentkey=False): """ resolve pseudo-json references @@ -77,6 +73,42 @@ def resolve_refs(items, schema, parentkey=False): return schema_resolved +def to_csv_properties(schema): + """ + translate complex types (eg arrays and objects) to stringified representations + """ + csv_schema = dict(schema) + csv_schema["properties"] = {} + properties = schema["properties"] + for key, item in properties.items(): + typename = item.get("type") + newitem = dict(item) + if typename == "array": + newitem["type"] = "string" + newitem["pattern"] = "^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$" + + if item.get("examples"): + newitem["examples"] = ["|".join(str(_e) for _e in e) for e in item["examples"]] + elif typename == "object": + newitem["type"] = "string" + newitem["pattern"] = "^(?:.*?=.*?(?:\||$))+$" + + if item.get("examples"): + newitem["examples"] = [ + "|".join([f"{key}={val}" for key,val in e.items()]) + for e in item["examples"] + ] + elif typename in ["string","integer","number","boolean"]: + newitem = dict(item) + else: + raise Exception("To convert to csv, the flattened property needs 
to be", + "of type array,object,boolean,string, integer, or number") + + csv_schema["properties"][key] = newitem + + + + return csv_schema def flatten_properties(properties, parentkey="", sep=".",itemsep="[0]"): """ @@ -232,12 +264,12 @@ def generate_template(schema): # compile frictionless schema fields dictionary = load_all_yamls() csv_pipeline = [ - (to_csv_spec, None), # recursive fxn so need to grab items from overall dictionary for json paths (resolve_refs, {"schema": dictionary}), # no longer need the definitons as they have been resolved (lambda _schema: _schema["fields"], None), (flatten_schema, None), + (to_csv_properties,None), (to_frictionless, None), (lambda _schema: {"version":versions["vlmd"],**_schema},None) ] @@ -249,12 +281,13 @@ def generate_template(schema): # compile json schema fields csv_pipeline = [ - (to_csv_spec, None), # recursive fxn so need to grab items from overall dictionary for json paths (resolve_refs, {"schema": dictionary}), # no longer need the definitons as they have been resolved (lambda _schema: _schema["fields"], None), (flatten_schema, None), + (to_csv_properties,None), + (lambda _schema: {"version":versions["vlmd"],**_schema},None) ] csvfields = reduce(run_pipeline_step, csv_pipeline, dictionary) Path("schemas/jsonschema/csvtemplate/fields.json").write_text(json.dumps(csvfields, indent=4)) From b281c32d96bc9d5cba21277910059a72e76d9e17 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:46:18 -0600 Subject: [PATCH 35/72] add schemaVersion/cascading properties to rules and build for csv spec --- variable-level-metadata-schema/README.md | 8 +++++ variable-level-metadata-schema/build.py | 44 +++++++++++++----------- 2 files changed, 31 insertions(+), 21 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 41e1490..304d113 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -166,6 +166,14 @@ To 
facilitate the mapping of json spec property names to csv property names, th - following from (1), an `enum` must only contain values of the same type - (at least currently) MUST contain only types supported by csv fields which include scalar types (e.g., `boolean`,`string`,`integer`,`number`) in addition to type `object` as this has a stringified representation (see above). +### csv to json cascading + +- If the same value/instance of a property exists at the field level for ALL records (only one unique value) AND this same property is specified in the root level of json specification (after translation of above csv to json property rules), then this unique value will be added to the json root level property BEFORE the translated json document is validated. + +This provides a way to specify root level properties within vlmd csv documents for a few use cases: + +1. specifying the schema version that represents the vlmd document (`schemaVersion`) +2. specifying other data dictionary level properties such as `standardsMappings[0].instrument` ## Considerations diff --git a/variable-level-metadata-schema/build.py b/variable-level-metadata-schema/build.py index bebc1dd..89447aa 100644 --- a/variable-level-metadata-schema/build.py +++ b/variable-level-metadata-schema/build.py @@ -73,7 +73,7 @@ def resolve_refs(items, schema, parentkey=False): return schema_resolved -def to_csv_properties(schema): +def to_csv_properties(schema,**additional_props): """ translate complex types (eg arrays and objects) to stringified representations """ @@ -106,8 +106,9 @@ def to_csv_properties(schema): csv_schema["properties"][key] = newitem + # add additional properties at the beginning of the schema properties object + csv_schema["properties"] = {**additional_props,**csv_schema["properties"]} - return csv_schema def flatten_properties(properties, parentkey="", sep=".",itemsep="[0]"): @@ -263,13 +264,26 @@ def generate_template(schema): if __name__ == "__main__": # compile frictionless 
schema fields dictionary = load_all_yamls() + + # compile json schema fields + json_pipeline = [ + # recursive fxn so need to grab items from overall dictionary for json paths + (resolve_refs, {"schema": dictionary}), + # no longer need the definitons as they have been resolved + (lambda _schema: _schema["data-dictionary"], None), + (lambda _schema: {"version":versions["vlmd"],**_schema},None) + ] + json_data_dictionary = reduce(run_pipeline_step, json_pipeline, dictionary) + Path("schemas/jsonschema/data-dictionary.json").write_text(json.dumps(json_data_dictionary, indent=4)) + + schema_version_prop = {"schemaVersion":json_data_dictionary["properties"]["schemaVersion"]} csv_pipeline = [ # recursive fxn so need to grab items from overall dictionary for json paths (resolve_refs, {"schema": dictionary}), # no longer need the definitons as they have been resolved (lambda _schema: _schema["fields"], None), (flatten_schema, None), - (to_csv_properties,None), + (to_csv_properties,schema_version_prop), (to_frictionless, None), (lambda _schema: {"version":versions["vlmd"],**_schema},None) ] @@ -286,24 +300,12 @@ def generate_template(schema): # no longer need the definitons as they have been resolved (lambda _schema: _schema["fields"], None), (flatten_schema, None), - (to_csv_properties,None), + (to_csv_properties,schema_version_prop), (lambda _schema: {"version":versions["vlmd"],**_schema},None) ] csvfields = reduce(run_pipeline_step, csv_pipeline, dictionary) Path("schemas/jsonschema/csvtemplate/fields.json").write_text(json.dumps(csvfields, indent=4)) - # compile json schema fields - json_pipeline = [ - # recursive fxn so need to grab items from overall dictionary for json paths - (resolve_refs, {"schema": dictionary}), - # no longer need the definitons as they have been resolved - (lambda _schema: _schema["data-dictionary"], None), - (lambda _schema: {"version":versions["vlmd"],**_schema},None) - ] - jsonfields = reduce(run_pipeline_step, json_pipeline, dictionary) - 
Path("schemas/jsonschema/data-dictionary.json").write_text(json.dumps(jsonfields, indent=4)) - - # generate json schema versions of field schemas for documentation # generate html using the json-schema for human library @@ -317,14 +319,14 @@ def generate_template(schema): item=csvfields, schema=csvfields, templatefile="csvtemplate.md") - jsonfields_md = render_markdown( - item=jsonfields, - schema=jsonfields, + json_dd_md = render_markdown( + item=json_data_dictionary, + schema=json_data_dictionary, templatefile="jsontemplate.md" ) Path("docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md").write_text(csvfields_md) - Path("docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md").write_text(jsonfields_md) + Path("docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md").write_text(json_dd_md) # generate templates - Path("templates/template_submission.json").write_text(json.dumps([generate_template(jsonfields)],indent=4)) + Path("templates/template_submission.json").write_text(json.dumps([generate_template(json_data_dictionary)],indent=4)) Path("templates/template_submission.csv").write_text(",".join((generate_template(csvfields)).keys())) \ No newline at end of file From 19441534856ebcd542da4ac983d09ed5ffd1ad01 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:48:08 -0600 Subject: [PATCH 36/72] minor README updates --- variable-level-metadata-schema/README.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 304d113..7ccceeb 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -9,8 +9,11 @@ This metadata directory contains the specifications for variable level metadata ### json data dictionary format specification 1. 
`schemas/jsonschema/data-dictionary.json`: The "json" json data dictionary schema (ie json template schema) - Intended to specify the data dictionary representation of json objects available in the HEAL platform metadata-service. + - See here for the markdown rendered version --> [`docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md`](docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md) ### csv field format specifications +- See here for the markdown rendered version --> [`docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md`](docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md) + 2. `schemas/frictionless/fields.json` Table schema (previously known as "frictionless") standard specification - This json file is intended to represent csv data dictionary documents following the [Table Schema specification](https://specs.frictionlessdata.io/table-schema/). @@ -103,10 +106,11 @@ Given csv field values can only be scalar values with records separated by a new - if type is a scalar (`string`,`integer`,`number`) in `items`, translated to type `string` with pattern `^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$` to indicate a string containing a pipe delimiter (i.e., a stringified array with a pipe delimiter) ### `property` name conversion rules -To facilitate the mapping of json spec property names to csv property names, the resulting flattened `property` names from the flattened properties should correspond to the jsonpath representation where: +To facilitate the mapping of json spec property names to csv property names, the resulting flattened `property` names from the flattened properties should correspond to the [jsonpath](https://datatracker.ietf.org/doc/id/draft-goessner-dispatch-jsonpath-00.html) representation where: 1. 
type `object` + The json spec type object property below: ```json "constraints": { @@ -118,7 +122,8 @@ To facilitate the mapping of json spec property names to csv property names, th } ``` - flattens to: + translates to the csv stringified type object: + ```json "constraints.maxLength":{"type":"integer"} @@ -126,6 +131,8 @@ To facilitate the mapping of json spec property names to csv property names, th ``` 2. type `array` + The json spec type array property below: + ```json { "..more props..":"...", "standardsMappings": { @@ -146,7 +153,7 @@ To facilitate the mapping of json spec property names to csv property names, th }}} ``` - flattens to: + translates to the csv stringified type array property: ```json { "..more props..":"...", From 59e66a615bd1227788600abb30f97b3339a20ee8 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:48:20 -0600 Subject: [PATCH 37/72] recommendation on vlmd document file naming --- variable-level-metadata-schema/README.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 7ccceeb..abe1a97 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -182,6 +182,11 @@ This provides a way to specify root level properties within vlmd csv documents f 1. specifying the schema version that represents the vlmd document (`schemaVersion`) 2. specifying other data dictionary level properties such as `standardsMappings[0].instrument` +### csv and json vlmd document file naming + +File names for json and csv translations of a vlmd document SHOULD +have the same stem name (eg `my-heal-dd.csv` and `my-heal-dd.json`) + ## Considerations Please use github issues for any additional considerations. See additional comments above. 
\ No newline at end of file From eaa91767f35e8d1329d836edea2b40106aca69f2 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:48:44 -0600 Subject: [PATCH 38/72] new line for version in vlmd schema markdown template --- .../docs/assets/templates/csvtemplate.md | 4 +++- .../docs/assets/templates/jsontemplate.md | 4 +++- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md index 7d403f6..aa8bc92 100644 --- a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md @@ -1,4 +1,6 @@ -# {{ schema.title }} _version {{ schema.version }}_ +# {{ schema.title }} + +_version {{ schema.version }}_ {{ schema.description }} diff --git a/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md b/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md index f9ba9a9..2afef89 100644 --- a/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/jsontemplate.md @@ -1,4 +1,6 @@ -# {{ schema.title }} _version {{ schema.version }}_ +# {{ schema.title }} + +_version {{ schema.version }}_ {{ schema.description }} From 9f1a2b40bbab13ad52394f65d64c9d439b6b88ed Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:49:24 -0600 Subject: [PATCH 39/72] Run build with updates --- .../jsonschema-csvtemplate-fields.html | 20 ++++---- ...onschema-jsontemplate-data-dictionary.html | 30 ++++++------ .../jsonschema-csvtemplate-fields.md | 36 ++++++++++---- ...jsonschema-jsontemplate-data-dictionary.md | 14 ++++-- .../frictionless/csvtemplate/fields.json | 41 +++++++++------- .../jsonschema/csvtemplate/fields.json | 48 +++++++++++-------- .../schemas/jsonschema/data-dictionary.json | 28 ++++++----- .../templates/template_submission.csv | 2 +- 8 files 
changed, 132 insertions(+), 87 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index cc61b42..af44380 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,20 +1,20 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

Must match regular expression: \d+\.\d+\.\d+
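
As a rough illustration only, the version check above could be sketched in Python as follows. The helper name `is_valid_schema_version` is hypothetical, and using `re.fullmatch` is an assumption: the schema pattern itself is unanchored, so a JSON Schema validator may accept strings that merely contain a version.

```python
import re

# Pattern copied from the schemaVersion property above.
SCHEMA_VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_valid_schema_version(value):
    """Return True if value is three dot-separated integers (hypothetical helper)."""
    # fullmatch requires the whole string to match; this is stricter than
    # the unanchored pattern in the schema and is an assumption of this sketch.
    return SCHEMA_VERSION_PATTERN.fullmatch(value) is not None
```

For example, `is_valid_schema_version("0.2.0")` is true, while a two-part value such as `"0.2"` is rejected.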

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
 

Type: string

The human-readable title or label of the variable.


Example:

"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
 
"What is the highest grade or level of school you have completed or the highest degree you have received?"
-

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
See here
for more information about appropriate format values by variable type.

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5|6|7|8"
-
"White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial"
-

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: string

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
Examples:

"0=No|1=Yes"
-
"HW=Hello world|GBW=Good bye world|HM=Hi,Mike"
-

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: string

A list of missing values specific to a variable.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Missing|Skipped|No preference"
+

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
See here
for more information about appropriate format values by variable type.

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5"
+
"Poor|Fair|Good|Very good|Excellent"
+
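
The pipe-delimited examples above can be read back into a list with a few lines of Python. This is a minimal sketch, not part of the build: `parse_stringified_array` and `to_stringified_array` are hypothetical names, trimming whitespace around items is an assumption, and the join mirrors the `"|".join(...)` used for array examples in `build.py`.

```python
def parse_stringified_array(value):
    """Split a pipe-delimited csv cell into a list of strings (hypothetical helper)."""
    # Items are returned as strings; casting to integer or number is left
    # to the caller. Stripping whitespace is an assumption of this sketch.
    return [item.strip() for item in value.split("|")]


def to_stringified_array(items):
    """Join items with a pipe, the same way build.py stringifies array examples."""
    return "|".join(str(item) for item in items)
```

Round-tripping `[1, 2, 3, 4, 5]` through `to_stringified_array` and `parse_stringified_array` yields the strings `"1"` through `"5"`, since the csv representation carries no type information.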

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: string

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

This field is intended to follow this pattern

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
Examples:

"1=Poor|2=Fair|3=Good|4=Very good|5=Excellent"
+
"HW=Hello world|GBW=Good bye world|HM=Hi, Mike"
+
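
To show how a stringified object such as the examples above maps back to key/value pairs, here is a minimal Python sketch. The function name `parse_stringified_object` is hypothetical, and the choices to skip empty segments and trim whitespace are assumptions rather than rules stated by the schema.

```python
def parse_stringified_object(value):
    """Parse 'key=value|key=value' into a dict of strings (hypothetical helper)."""
    result = {}
    for item in value.split("|"):
        if not item:
            # Tolerate a trailing pipe (assumption of this sketch).
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, val = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        result[key.strip()] = val.strip()
    return result
```

Keys and values stay strings, so `"1=Poor"` yields the key `"1"`, matching the string keys shown in the json examples of the same property.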

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

This field is intended to follow the ordering aspect of [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)

Type: string

A list of missing values specific to a variable.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Missing|Skipped|No preference"
 
"Missing"
-

Type: string

For boolean (true) variable (as defined in type field), this field allows
a physical string representation to be cast as true (increasing
readability of the field). It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Required|REQUIRED"
-
"Yes"
-

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required| NOT REQUIRED"
+

Type: string

For boolean (true) variable (as defined in type field), this field allows
a physical string representation to be cast as true (increasing
readability of the field). It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"required|Yes|Checked"
+
"required"
+

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required|NOT REQUIRED"
 
"No"
 

Type: string Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 9efc379..95f6013 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -9,27 +9,29 @@ 1, 2, 3, - 4 + 4, + 5 ] -
[
-    "White",
-    "Black or African American",
-    "American Indian or Alaska Native",
-    "Native Hawaiian or Other Pacific Islander",
-    "Asian",
-    "Some other race",
-    "Multiracial"
+
[
+    "Poor",
+    "Fair",
+    "Good",
+    "Very good",
+    "Excellent"
 ]
-

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: object

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).


Examples:

{
-    "0": "No",
-    "1": "Yes"
+

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer, etc.). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: object

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

This field is intended to follow this pattern


Examples:

{
+    "1": "Poor",
+    "2": "Fair",
+    "3": "Good",
+    "4": "Very good",
+    "5": "Excellent"
 }
 
{
     "HW": "Hello world",
     "GBW": "Good bye world",
     "HM": "Hi, Mike"
 }
-

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

Type: array

A list of missing values specific to a variable.


Examples:

[
+

Type: boolean

Indicates whether a categorical variable is ordered. This variable is
relevant for variables that have an ordered relationship but not
necessarily a numerical relationship (e.g., Strongly disagree < Disagree
< Neutral < Agree).

This field is intended to follow the ordering aspect of [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)

Type: array

A list of missing values specific to a variable.


Examples:

[
     "Missing",
     "Skipped",
     "No preference"
@@ -111,4 +113,4 @@
 ]
 

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 7cbd151..a3e88a4 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -1,4 +1,7 @@ -# HEAL Variable Level Metadata Fields _version _ +# HEAL Variable Level Metadata Fields + +_version 0.2.0_ + Variable level metadata individual fields integrated into the variable level metadata object within the HEAL platform metadata service. @@ -11,7 +14,16 @@ metadata object within the HEAL platform metadata service. `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) -## Properties +## Properties (i.e., fields or variables) + + +**`schemaVersion`** _(string)_ + The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) + +NOTE: This is NOT for versioning of each indiviual data dictionary instance. +Rather, it is the +version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance +version. **`section`** _(string)_ @@ -100,12 +112,12 @@ Examples: ``` - 1|2|3|4|5|6|7|8 + 1|2|3|4|5 ``` ``` - White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial + Poor|Fair|Good|Very good|Excellent ``` @@ -136,16 +148,18 @@ Additionally, as another use case, this field provides a way to store categoricals that are stored as "short" labels (such as abbreviations). 
+This field is intended to follow [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering) + Examples: ``` - 0=No|1=Yes + 1=Poor|2=Fair|3=Good|4=Very good|5=Excellent ``` ``` - HW=Hello world|GBW=Good bye world|HM=Hi,Mike + HW=Hello world|GBW=Good bye world|HM=Hi, Mike ``` @@ -155,6 +169,8 @@ relevant for variables that have an ordered relationship but not necessarily a numerical relationship (e.g., Strongly disagree < Disagree < Neutral < Agree). +This field is intended to follow the ordering aspect of this [this pattern][this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering) + **`missingValues`** _(string)_ A list of missing values specific to a variable. @@ -181,12 +197,12 @@ Examples: ``` - Required|REQUIRED + required|Yes|Checked ``` ``` - Yes + required ``` @@ -199,7 +215,7 @@ Examples: ``` - Not required| NOT REQUIRED + Not required|NOT REQUIRED ``` @@ -269,7 +285,7 @@ Examples: The id locating the individual mapping within the given source. 
-# End of schema - Additional Property information +## End of schema - Additional Property information ## `type` enum definitions: diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index ef265d9..3e861f0 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -1,4 +1,6 @@ -# Variable Level Metadata (Data Dictionaries) _version 0.2.0_ +# Variable Level Metadata (Data Dictionaries) + +_version 0.2.0_ This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries @@ -135,13 +137,13 @@ for more information about appropriate `format` values by variable `type`. ```json - [1, 2, 3, 4] + [1, 2, 3, 4, 5] ``` ```json - ['White', 'Black or African American', 'American Indian or Alaska Native', 'Native Hawaiian or Other Pacific Islander', 'Asian', 'Some other race', 'Multiracial'] + ['Poor', 'Fair', 'Good', 'Very good', 'Excellent'] ``` @@ -176,12 +178,14 @@ Additionally, as another use case, this field provides a way to store categoricals that are stored as "short" labels (such as abbreviations). +This field is intended to follow [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering) + Examples: ```json - {'0': 'No', '1': 'Yes'} + {'1': 'Poor', '2': 'Fair', '3': 'Good', '4': 'Very good', '5': 'Excellent'} ``` @@ -197,6 +201,8 @@ relevant for variables that have an ordered relationship but not necessarily a numerical relationship (e.g., Strongly disagree < Disagree < Neutral < Agree). 
+This field is intended to follow the ordering aspect of this [this pattern][this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering) + **`missingValues`** _(array)_ A list of missing values specific to a variable. diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index e95a661..83d2768 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -3,6 +3,14 @@ "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "title": "HEAL Variable Level Metadata Fields", "fields": [ + { + "name": "schemaVersion", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n", + "type": "string", + "constraints": { + "pattern": "\\d+\\.\\d+\\.\\d+" + } + }, { "name": "section", "description": "The section, form, survey instrument, set of measures or other broad category used \nto group variables. 
Previously called \"module.\"\n", @@ -56,17 +64,17 @@ "constraints": { "enum": [ "integer", + "number", "string", - "date", - "time", "duration", - "datetime", "yearmonth", - "number", - "geopoint", - "year", "boolean", - "any" + "time", + "year", + "geopoint", + "date", + "any", + "datetime" ] } }, @@ -87,8 +95,8 @@ "description": "Constrains possible values to a set of values.\n", "title": "Variable Possible Values", "examples": [ - "1|2|3|4|5|6|7|8", - "White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial" + "1|2|3|4|5", + "Poor|Fair|Good|Very good|Excellent" ], "type": "string", "constraints": { @@ -115,11 +123,11 @@ }, { "name": "enumLabels", - "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n", + "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. 
Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n\nThis field is intended to follow [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)\n", "title": "Variable Value Encodings (i.e., mappings; value labels)", "examples": [ - "0=No|1=Yes", - "HW=Hello world|GBW=Good bye world|HM=Hi,Mike" + "1=Poor|2=Fair|3=Good|4=Very good|5=Excellent", + "HW=Hello world|GBW=Good bye world|HM=Hi, Mike" ], "type": "string", "constraints": { @@ -128,7 +136,7 @@ }, { "name": "enumOrdered", - "description": "Indicates whether a categorical variable is ordered. This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n", + "description": "Indicates whether a categorical variable is ordered. This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n\nThis field is intended to follow the ordering aspect of this [this pattern][this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)\n", "title": "An ordered variable", "type": "boolean" }, @@ -148,9 +156,10 @@ { "name": "trueValues", "description": "For boolean (true) variable (as defined in type field), this field allows\na physical string representation to be cast as true (increasing\nreadability of the field). 
It can include one or more values.\n", + "title": "Boolean True Value Labels", "examples": [ - "Required|REQUIRED", - "Yes" + "required|Yes|Checked", + "required" ], "type": "string", "constraints": { @@ -162,7 +171,7 @@ "description": "For boolean (false) variable (as defined in type field), this field allows\na physical string representation to be cast as false (increasing\nreadability of the field) that is not a standard false value. It can include one or more values.\n", "title": "Boolean False Value Labels", "examples": [ - "Not required| NOT REQUIRED", + "Not required|NOT REQUIRED", "No" ], "type": "string", diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 93717bc..7ff31b3 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -1,4 +1,5 @@ { + "version": "0.2.0", "title": "HEAL Variable Level Metadata Fields", "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", @@ -7,6 +8,11 @@ "description" ], "properties": { + "schemaVersion": { + "type": "string", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. 
\nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n", + "pattern": "\\d+\\.\\d+\\.\\d+" + }, "section": { "type": "string", "title": "Section", @@ -74,14 +80,14 @@ "description": "Indicates the maximum length of an iterable (e.g., array, string, or\nobject). For example, if 'Hello World' is the longest value of a\ncategorical variable, this would be a maxLength of 11.\n" }, "constraints.enum": { + "type": "string", "title": "Variable Possible Values", "description": "Constrains possible values to a set of values.\n", - "type": "string", - "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$", "examples": [ - "1|2|3|4|5|6|7|8", - "White|Black or African American|American Indian or Alaska Native|Native Hawaiian or Other Pacific Islander|Asian|Some other race|Multiracial" - ] + "1|2|3|4|5", + "Poor|Fair|Good|Very good|Excellent" + ], + "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, "constraints.pattern": { "type": "string", @@ -100,17 +106,17 @@ }, "enumLabels": { "title": "Variable Value Encodings (i.e., mappings; value labels)", - "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n", + "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. 
Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n\nThis field is intended to follow [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)\n", "type": "string", - "pattern": "^(?:.*?=.*?(?:\\||$))+$", "examples": [ - "0=No|1=Yes", - "HW=Hello world|GBW=Good bye world|HM=Hi,Mike" - ] + "1=Poor|2=Fair|3=Good|4=Very good|5=Excellent", + "HW=Hello world|GBW=Good bye world|HM=Hi, Mike" + ], + "pattern": "^(?:.*?=.*?(?:\\||$))+$" }, "enumOrdered": { "title": "An ordered variable", - "description": "Indicates whether a categorical variable is ordered. This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n", + "description": "Indicates whether a categorical variable is ordered. This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n\nThis field is intended to follow the ordering aspect of this [this pattern][this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)\n", "type": "boolean" }, "missingValues": { @@ -124,23 +130,27 @@ "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, "trueValues": { - "type": "string", - "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$", + "title": "Boolean True Value Labels", "description": "For boolean (true) variable (as defined in type field), this field allows\na physical string representation to be cast as true (increasing\nreadability of the field). 
It can include one or more values.\n", + "type": "string", + "items": { + "type": "string" + }, "examples": [ - "Required|REQUIRED", - "Yes" - ] + "required|Yes|Checked", + "required" + ], + "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, "falseValues": { "title": "Boolean False Value Labels", "description": "For boolean (false) variable (as defined in type field), this field allows\na physical string representation to be cast as false (increasing\nreadability of the field) that is not a standard false value. It can include one or more values.\n", "type": "string", - "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$", "examples": [ - "Not required| NOT REQUIRED", + "Not required|NOT REQUIRED", "No" - ] + ], + "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, "standardsMappings[0].instrument.url": { "type": "string", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 8300c72..3114517 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -138,24 +138,23 @@ "description": "Indicates the maximum length of an iterable (e.g., array, string, or\nobject). 
For example, if 'Hello World' is the longest value of a\ncategorical variable, this would be a maxLength of 11.\n" }, "enum": { + "type": "array", "title": "Variable Possible Values", "description": "Constrains possible values to a set of values.\n", - "type": "array", "examples": [ [ 1, 2, 3, - 4 + 4, + 5 ], [ - "White", - "Black or African American", - "American Indian or Alaska Native", - "Native Hawaiian or Other Pacific Islander", - "Asian", - "Some other race", - "Multiracial" + "Poor", + "Fair", + "Good", + "Very good", + "Excellent" ] ] }, @@ -178,12 +177,15 @@ }, "enumLabels": { "title": "Variable Value Encodings (i.e., mappings; value labels)", - "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n", + "description": "Variable value encodings provide a way to further annotate any value within a any variable type,\nmaking values easier to understand. \n\n\nMany analytic software programs (e.g., SPSS,Stata, and SAS) use numerical encodings and some algorithms\nonly support numerical values. 
Encodings (and mappings) allow categorical values to be stored as\nnumerical values.\n\nAdditionally, as another use case, this field provides a way to\nstore categoricals that are stored as \"short\" labels (such as\nabbreviations).\n\nThis field is intended to follow [this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)\n", "type": "object", "examples": [ { - "0": "No", - "1": "Yes" + "1": "Poor", + "2": "Fair", + "3": "Good", + "4": "Very good", + "5": "Excellent" }, { "HW": "Hello world", @@ -194,7 +196,7 @@ }, "enumOrdered": { "title": "An ordered variable", - "description": "Indicates whether a categorical variable is ordered. This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n", + "description": "Indicates whether a categorical variable is ordered. This variable is\nrelevant for variables that have an ordered relationship but not\nnecessarily a numerical relationship (e.g., Strongly disagree < Disagree\n< Neutral < Agree).\n\nThis field is intended to follow the ordering aspect of this [this pattern][this pattern](https://specs.frictionlessdata.io/patterns/#table-schema-enum-labels-and-ordering)\n", "type": "boolean" }, "missingValues": { diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index d9f78d9..62d26d1 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ 
-section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].type,relatedConcepts[0].label,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file +schemaVersion,section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].type,relatedConcepts[0].label,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file From 0cd6c80fc8a7e246ea78984c2f5ed9883fb66e52 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:56:57 -0600 Subject: [PATCH 40/72] Note on schemaVersion cascading in csv files --- .../schemas/dictionary/data-dictionary.yaml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index f9b368c..c0046d6 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -21,6 +21,10 @@ properties: Rather, it is the version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance version. 
+ + If generating a vlmd document as a csv file, include this version in + every row/record to indicate this is a schema level property + (not applicable for the json version as this property is already at the schema/root level) pattern: \d+\.\d+\.\d+ version: # TODO: think about having a version text/message and id (akin to a git commit) type: string From 15c3bde3febd0f0d4846aaf065e8b8cdcfe3c493 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 11:57:32 -0600 Subject: [PATCH 41/72] Update build --- .../jsonschema-csvtemplate-fields.html | 4 ++-- .../jsonschema-jsontemplate-data-dictionary.html | 4 ++-- .../jsonschema-csvtemplate-fields.md | 4 ++++ .../jsonschema-jsontemplate-data-dictionary.md | 4 ++++ .../schemas/frictionless/csvtemplate/fields.json | 16 ++++++++-------- .../schemas/jsonschema/csvtemplate/fields.json | 2 +- .../schemas/jsonschema/data-dictionary.json | 2 +- 7 files changed, 22 insertions(+), 14 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index af44380..fd7b0b3 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,4 +1,4 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema, following the agreed-upon major.minor.patch convention (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the version of THIS schema document.
See the version property (below) to specify the individual data dictionary instance version.

Must match regular expression: \d+\.\d+\.\d+

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema, following the agreed-upon major.minor.patch convention (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the version of THIS schema document.
See the version property (below) to specify the individual data dictionary instance version.

If generating a VLMD document as a CSV file, include this version in
every row/record to indicate that it is a schema-level property
(not applicable to the JSON version, where this property is already at the schema/root level).

Must match regular expression: \d+\.\d+\.\d+
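
As a rough illustration of the CSV note above, the following sketch checks a version string against the documented pattern and repeats it on every row. The helper names (`is_valid_schema_version`, `write_vlmd_csv`) and the three-column layout are hypothetical and not part of the schema tooling; using a full match is also an assumption, since the schema's pattern is written without anchors.

```python
import csv
import io
import re

# Pattern copied from the schema. Requiring a full match is an assumption:
# the schema's pattern is unanchored, so a validator may accept partial matches.
SCHEMA_VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_valid_schema_version(value: str) -> bool:
    """Return True when value looks like major.minor.patch (e.g., 1.0.2)."""
    return SCHEMA_VERSION_PATTERN.fullmatch(value) is not None


def write_vlmd_csv(fields: list, schema_version: str) -> str:
    """Hypothetical helper: write fields as CSV, repeating schemaVersion per row."""
    if not is_valid_schema_version(schema_version):
        raise ValueError(f"not a major.minor.patch version: {schema_version!r}")
    # Only the two required properties are written here to keep the sketch short.
    columns = ["schemaVersion", "name", "description"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for field in fields:
        writer.writerow(
            {
                "schemaVersion": schema_version,  # same value on every record
                "name": field["name"],
                "description": field["description"],
            }
        )
    return buffer.getvalue()
```

In the JSON form none of this repetition is needed, because `schemaVersion` already sits at the root of the document.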

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
@@ -17,4 +17,4 @@
 
"No"
 

Type: string Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 95f6013..15a0303 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,4 +1,4 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

The version of the schema, following the agreed-upon major.minor.patch convention (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the version of THIS schema document.
See the version property (below) to specify the individual data dictionary instance version.

Must match regular expression: \d+\.\d+\.\d+

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only instrument can be mapped to this property, unlike the fields standardsMappings.
This property has the same specification as the fields standardsMappings so that the cascading logic
is easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note that a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

The version of the schema, following the agreed-upon major.minor.patch convention (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the version of THIS schema document.
See the version property (below) to specify the individual data dictionary instance version.

If generating a VLMD document as a CSV file, include this version in
every row/record to indicate that it is a schema-level property
(not applicable to the JSON version, where this property is already at the schema/root level).

Must match regular expression: \d+\.\d+\.\d+

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only instrument can be mapped to this property, unlike the fields standardsMappings.
This property has the same specification as the fields standardsMappings so that the cascading logic
is easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).
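
The precedence rule in the note above can be sketched as follows. This is only one possible reading: the function name `resolve_standards_mappings` is hypothetical, and treating a field-level `standardsMappings` as a whole-property replacement of the root value (rather than merging the two lists) is an assumption that the schema text does not spell out.

```python
def resolve_standards_mappings(data_dictionary: dict) -> dict:
    """Return the effective standardsMappings for each field, keyed by field name.

    Assumption: a field-level standardsMappings replaces the root-level value
    entirely; fields without one inherit the root-level value.
    """
    root_mappings = data_dictionary.get("standardsMappings", [])
    resolved = {}
    for field in data_dictionary.get("fields", []):
        # The field-level property takes precedence when it is present.
        resolved[field["name"]] = field.get("standardsMappings", root_mappings)
    return resolved


# Illustrative document; the ids and titles are made up for this example.
example = {
    "title": "Example dictionary",
    "standardsMappings": [{"instrument": {"source": "heal-cde", "id": "root-id"}}],
    "fields": [
        {"name": "gender_id", "description": "Gender"},
        {
            "name": "pain_score",
            "description": "Pain score",
            "standardsMappings": [{"item": {"source": "example", "id": "item-id"}}],
        },
    ],
}

effective = resolve_standards_mappings(example)
```

Here `gender_id` inherits the root-level instrument mapping, while `pain_score` keeps its own field-level mapping.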

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
@@ -113,4 +113,4 @@
 ]
 

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
 

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index a3e88a4..7e7611b 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -25,6 +25,10 @@ Rather, it is the version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance version. +If generating a vlmd document as a csv file, include this version in +every row/record to indicate this is a schema level property +(not applicable for the json version as this property is already at the schema/root level) + **`section`** _(string)_ The section, form, survey instrument, set of measures or other broad category used diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 3e861f0..ba120b5 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -16,6 +16,10 @@ Rather, it is the version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance version. +If generating a vlmd document as a csv file, include this version in +every row/record to indicate this is a schema level property +(not applicable for the json version as this property is already at the schema/root level) + ## `version` _(string)_ The specified individual data dictionary instance version. 
## `standardsMappings` _(array)_ diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 83d2768..5422f38 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -5,7 +5,7 @@ "fields": [ { "name": "schemaVersion", - "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. 
See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", "type": "string", "constraints": { "pattern": "\\d+\\.\\d+\\.\\d+" @@ -63,18 +63,18 @@ "type": "string", "constraints": { "enum": [ + "time", + "date", + "geopoint", + "year", + "datetime", "integer", "number", "string", "duration", - "yearmonth", - "boolean", - "time", - "year", - "geopoint", - "date", "any", - "datetime" + "boolean", + "yearmonth" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 7ff31b3..ea2e173 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -10,7 +10,7 @@ "properties": { "schemaVersion": { "type": "string", - "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. 
See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", "pattern": "\\d+\\.\\d+\\.\\d+" }, "section": { diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 3114517..6621be6 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -18,7 +18,7 @@ }, "schemaVersion": { "type": "string", - "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. 
See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", "pattern": "\\d+\\.\\d+\\.\\d+" }, "version": { From f26f55239ce44db64a8cfe660b2f871caa5938b3 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 12:45:44 -0600 Subject: [PATCH 42/72] Add examples to individual standardsMappings props --- .../schemas/dictionary/definitions.yaml | 29 ++++++++++++++----- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 6a58eed..e02d791 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -23,15 +23,24 @@ standardsMappingsInstrumentObject: properties: url: + title: Url + description: "" type: string format: uri + examples: + - "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" source: type: string enum: ["heal-cde"] title: type: string + examples: + - Adult demographics + - adult-demographics id: type: string + examples: + - "1020" rootStandardsMappingsItem: type: array @@ -73,7 +82,7 @@ fieldStandardsMappingsItem: "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx", "source": "heal-cde", "title": "adult-demographics", - "id": + "id": "1020" }, "item": { "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE", @@ -93,7 +102,7 @@ fieldStandardsMappingsItem: { "instrument": { "source": "heal-cde", - "title": "adult-demographics" + "title": "Adult demographics" } } ] @@ -136,7 +145,7 @@ fieldStandardsMappingsItem: { "instrument": { "source": "heal-cde", - "title": 
"adult-demographics" + "title": "Adult demographics" }, "item": { "source": "CDISC", @@ -167,19 +176,25 @@ fieldStandardsMappingsItem: url: title: Standards Mapping - Url description: | - The url that links out to the published, standardized mapping. + The url that links out to the published, standardized mapping of a variable (e.g., common data element) type: string format: uri examples: - - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI + - "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE" source: title: Standard Mapping - Source description: | - The source of the standardized variable. + The source of the standardized variable. Note, this property is required if + an id is specified. + examples: + - "CDISC" type: string id: title: Standard Mapping - Id type: string description: | - The id locating the individual mapping within the given source. + The id locating the individual mapping within the given source. Note, the `standardsMapping[\d+].source` property is required if + this property is specified. 
+ examples: + - "C74457" From dfee8cf775413f5fe05ff3a306938bcaff42fbb4 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 12:46:03 -0600 Subject: [PATCH 43/72] Update build --- .../jsonschema-csvtemplate-fields.html | 12 +++-- ...onschema-jsontemplate-data-dictionary.html | 24 ++++++--- .../jsonschema-csvtemplate-fields.md | 50 ++++++++++++++++-- ...jsonschema-jsontemplate-data-dictionary.md | 6 +-- .../frictionless/csvtemplate/fields.json | 39 ++++++++++---- .../jsonschema/csvtemplate/fields.json | 32 +++++++++--- .../schemas/jsonschema/data-dictionary.json | 52 +++++++++++++++---- 7 files changed, 169 insertions(+), 46 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index fd7b0b3..804b0c1 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -15,6 +15,12 @@
"required"
 

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required|NOT REQUIRED"
 
"No"
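The pattern and examples above suggest that several custom false values are packed into one cell, separated by `|`. A minimal sketch of how a consumer might unpack such a cell, assuming pipe-delimited values; `parse_false_values` is a hypothetical helper, not part of the schema tooling:

```python
import re

# Pattern copied verbatim from the rendered schema above.
FALSE_VALUES_PATTERN = re.compile(r"^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$")


def parse_false_values(raw: str) -> list[str]:
    """Hypothetical helper: split a pipe-delimited cell into its individual values."""
    if FALSE_VALUES_PATTERN.match(raw) is None:
        raise ValueError(f"value does not match the schema pattern: {raw!r}")
    # Assumption: "|" is purely a separator and never part of a value.
    return raw.split("|")


print(parse_false_values("Not required|NOT REQUIRED"))
print(parse_false_values("No"))
```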
-

Type: string, Format: uri

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string, Format: uri

Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
+
"adult-demographics"
+

Type: string

Example:

"1020"
+

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
+

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
+

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
+

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 15a0303..63cb6f6 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,4 +1,8 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
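One detail worth checking when validating `schemaVersion`: the JSON Schema `pattern` keyword is not implicitly anchored, so the expression above accepts any string that merely contains three dot-separated numbers. The sketch below roughly mimics that behaviour with Python's `re` module and contrasts it with a stricter whole-string check. Both function names are hypothetical, and the strict variant is an illustration rather than something the schema specifies:

```python
import re

SCHEMA_VERSION_PATTERN = r"\d+\.\d+\.\d+"  # copied from the schema


def is_valid_schema_version(value: str) -> bool:
    """Mimic JSON Schema `pattern`: an unanchored search, not a full match."""
    return re.search(SCHEMA_VERSION_PATTERN, value) is not None


def is_strict_schema_version(value: str) -> bool:
    """Hypothetical stricter check: the whole string must be three dot-separated numbers."""
    return re.fullmatch(SCHEMA_VERSION_PATTERN, value) is not None


for candidate in ["1.0.2", "v1.0.2-draft", "1.0"]:
    print(candidate, is_valid_schema_version(candidate), is_strict_schema_version(candidate))
```

If stray prefixes or suffixes should be rejected, anchoring the pattern (for example `^\d+\.\d+\.\d+$`) would be the usual approach.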

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)
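The cascading rule in the note above can be sketched as a small lookup. This assumes that "takes precedence" means the field-level property replaces the root-level one wholesale (no merging), and that "present" means the key exists on the field; `resolve_standards_mappings` is a hypothetical helper, not part of any HEAL tooling:

```python
from typing import Any


def resolve_standards_mappings(
    root: dict[str, Any], field: dict[str, Any]
) -> list[dict[str, Any]]:
    """Return the field's own standardsMappings if present, else the root-level ones."""
    if "standardsMappings" in field:
        # Assumption: field-level mappings fully override the root-level ones.
        return field["standardsMappings"]
    return root.get("standardsMappings", [])


root = {"standardsMappings": [{"instrument": {"source": "heal-cde", "id": "1020"}}]}
inherits = {"name": "gender_id", "description": "Self-reported gender"}
overrides = {
    "name": "race",
    "description": "Self-reported race",
    "standardsMappings": [{"item": {"source": "CDISC", "id": "C74457"}}],
}

print(resolve_standards_mappings(root, inherits))   # falls back to the root mapping
print(resolve_standards_mappings(root, overrides))  # uses the field's own mapping
```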

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string, Format: uri

Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
+
"adult-demographics"
+

Type: string

Example:

"1020"
+

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
@@ -60,7 +64,7 @@
             "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
             "source": "heal-cde",
             "title": "adult-demographics",
-            "id": <drupal id here>
+            "id": "1020"
         },
         "item": {
             "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE",
@@ -73,7 +77,7 @@
     {
         "instrument": {
             "source": "heal-cde",
-            "title": "adult-demographics"
+            "title": "Adult demographics"
         }
     }
 ]
@@ -97,7 +101,7 @@
     {
         "instrument": {
             "source": "heal-cde",
-            "title": "adult-demographics"
+            "title": "Adult demographics"
         },
         "item": {
             "source": "CDISC",
@@ -111,6 +115,12 @@
         }
     }
 ]
-

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping.


Example:

"https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI"
-

Type: string

The source of the standardized variable.

Type: string

The id locating the individual mapping within the given source.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string, Format: uri

Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: string

Examples:

"Adult demographics"
+
"adult-demographics"
+

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
+

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
+

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
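The descriptions above state that an item's `source` is required whenever its `id` is given. Whether the schema itself enforces this is not shown here, so a consumer may want to check it explicitly. A minimal sketch, assuming the rule applies to the `item` object of each mapping; `missing_item_sources` is a hypothetical helper:

```python
from typing import Any


def missing_item_sources(standards_mappings: list[dict[str, Any]]) -> list[int]:
    """Return the indexes of mappings whose `item` has an `id` but no `source`."""
    problems = []
    for index, mapping in enumerate(standards_mappings):
        item = mapping.get("item", {})
        if "id" in item and "source" not in item:
            problems.append(index)
    return problems


mappings = [
    {"item": {"source": "CDISC", "id": "C74457"}},  # complete
    {"item": {"id": "C74457"}},                     # id without source
    {"instrument": {"source": "heal-cde"}},         # no item at all
]
print(missing_item_sources(mappings))
```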
+

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
+

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 7e7611b..4b5f62f 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -230,6 +230,13 @@ Examples: **`standardsMappings[0].instrument.url`** _(string)_ +Examples: + + +``` + https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx + +``` **`standardsMappings[0].instrument.source`** _(string)_ @@ -237,29 +244,64 @@ Must be one of: `heal-cde` **`standardsMappings[0].instrument.title`** _(string)_ +Examples: + + +``` + Adult demographics + +``` + +``` + adult-demographics + +``` **`standardsMappings[0].instrument.id`** _(string)_ +Examples: + + +``` + 1020 + +``` **`standardsMappings[0].item.url`** _(string)_ - The url that links out to the published, standardized mapping. + The url that links out to the published, standardized mapping of a variable (e.g., common data element) Examples: ``` - https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI + https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE ``` **`standardsMappings[0].item.source`** _(string)_ - The source of the standardized variable. + The source of the standardized variable. Note, this property is required if +an id is specified. +Examples: + + +``` + CDISC + +``` **`standardsMappings[0].item.id`** _(string)_ - The id locating the individual mapping within the given source. + The id locating the individual mapping within the given source. Note, the `standardsMapping[\d+].source` property is required if +this property is specified. + +Examples: +``` + C74457 + +``` + **`relatedConcepts[0].url`** _(string)_ The url that links out to the published, standardized concept. 
diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index ba120b5..3ca9625 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -283,7 +283,7 @@ __**All Fields Mapped (Both Instrument and Item)**__ "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx", "source": "heal-cde", "title": "adult-demographics", - "id": + "id": "1020" }, "item": { "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE", @@ -303,7 +303,7 @@ In this scenario, especially as CDE variables do not have associated CDISC ids l { "instrument": { "source": "heal-cde", - "title": "adult-demographics" + "title": "Adult demographics" } } ] @@ -346,7 +346,7 @@ Two separate records. 
If desired, multiple standard mappings can be entered, say { "instrument": { "source": "heal-cde", - "title": "adult-demographics" + "title": "Adult demographics" }, "item": { "source": "CDISC", diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 5422f38..b51777b 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -63,18 +63,18 @@ "type": "string", "constraints": { "enum": [ - "time", - "date", - "geopoint", + "string", "year", + "yearmonth", + "date", "datetime", "integer", + "boolean", "number", - "string", "duration", - "any", - "boolean", - "yearmonth" + "geopoint", + "time", + "any" ] } }, @@ -181,6 +181,10 @@ }, { "name": "standardsMappings[0].instrument.url", + "title": "Url", + "examples": [ + "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" + ], "type": "string" }, { @@ -194,31 +198,44 @@ }, { "name": "standardsMappings[0].instrument.title", + "examples": [ + "Adult demographics", + "adult-demographics" + ], "type": "string" }, { "name": "standardsMappings[0].instrument.id", + "examples": [ + "1020" + ], "type": "string" }, { "name": "standardsMappings[0].item.url", - "description": "The url that links out to the published, standardized mapping.\n", + "description": "The url that links out to the published, standardized mapping of a variable (e.g., common data element)\n", "title": "Standards Mapping - Url", "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" + "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE" ], "type": "string" }, { "name": "standardsMappings[0].item.source", - "description": "The source of the standardized variable.\n", + "description": "The source of the standardized variable. 
Note, this property is required if \nan id is specified.\n", "title": "Standard Mapping - Source", + "examples": [ + "CDISC" + ], "type": "string" }, { "name": "standardsMappings[0].item.id", - "description": "The id locating the individual mapping within the given source.\n", + "description": "The id locating the individual mapping within the given source. Note, the `standardsMapping[\\d+].source` property is required if \nthis property is specified.\n", "title": "Standard Mapping - Id", + "examples": [ + "C74457" + ], "type": "string" }, { diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index ea2e173..0043666 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -153,8 +153,13 @@ "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, "standardsMappings[0].instrument.url": { + "title": "Url", + "description": "", "type": "string", - "format": "uri" + "format": "uri", + "examples": [ + "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" + ] }, "standardsMappings[0].instrument.source": { "type": "string", @@ -163,29 +168,42 @@ ] }, "standardsMappings[0].instrument.title": { - "type": "string" + "type": "string", + "examples": [ + "Adult demographics", + "adult-demographics" + ] }, "standardsMappings[0].instrument.id": { - "type": "string" + "type": "string", + "examples": [ + "1020" + ] }, "standardsMappings[0].item.url": { "title": "Standards Mapping - Url", - "description": "The url that links out to the published, standardized mapping.\n", + "description": "The url that links out to the published, standardized mapping of a variable (e.g., common data element)\n", "type": "string", "format": "uri", "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" + 
"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE" ] }, "standardsMappings[0].item.source": { "title": "Standard Mapping - Source", - "description": "The source of the standardized variable.\n", + "description": "The source of the standardized variable. Note, this property is required if \nan id is specified.\n", + "examples": [ + "CDISC" + ], "type": "string" }, "standardsMappings[0].item.id": { "title": "Standard Mapping - Id", "type": "string", - "description": "The id locating the individual mapping within the given source.\n" + "description": "The id locating the individual mapping within the given source. Note, the `standardsMapping[\\d+].source` property is required if \nthis property is specified.\n", + "examples": [ + "C74457" + ] }, "relatedConcepts[0].url": { "title": "Related Concepts - Url", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 6621be6..083b7c7 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -37,8 +37,13 @@ "description": "A standardized set of items which encompass \na variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n\n\n!!! 
note \"NOTE\"\n\n If information is present at both the root and the field level, \n then the information at the field level would take precedence (i.e., it would cascade).\n", "properties": { "url": { + "title": "Url", + "description": "", "type": "string", - "format": "uri" + "format": "uri", + "examples": [ + "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" + ] }, "source": { "type": "string", @@ -47,10 +52,17 @@ ] }, "title": { - "type": "string" + "type": "string", + "examples": [ + "Adult demographics", + "adult-demographics" + ] }, "id": { - "type": "string" + "type": "string", + "examples": [ + "1020" + ] } } } @@ -248,7 +260,7 @@ }, "standardsMappings": { "type": "array", - "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. 
One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \"1020\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", + "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \"1020\"\n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \"1020\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). 
Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "items": { "type": "object", "properties": { @@ -258,8 +270,13 @@ "description": "A standardized set of items which encompass \na variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n\n\n!!! 
note \"NOTE\"\n\n If information is present at both the root and the field level, \n then the information at the field level would take precedence (i.e., it would cascade).\n", "properties": { "url": { + "title": "Url", + "description": "", "type": "string", - "format": "uri" + "format": "uri", + "examples": [ + "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" + ] }, "source": { "type": "string", @@ -268,10 +285,17 @@ ] }, "title": { - "type": "string" + "type": "string", + "examples": [ + "Adult demographics", + "adult-demographics" + ] }, "id": { - "type": "string" + "type": "string", + "examples": [ + "1020" + ] } } }, @@ -282,22 +306,28 @@ "properties": { "url": { "title": "Standards Mapping - Url", - "description": "The url that links out to the published, standardized mapping.\n", + "description": "The url that links out to the published, standardized mapping of a variable (e.g., common data element)\n", "type": "string", "format": "uri", "examples": [ - "https://cde.nlm.nih.gov/deView?tinyId=XyuSGdTTI" + "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE" ] }, "source": { "title": "Standard Mapping - Source", - "description": "The source of the standardized variable.\n", + "description": "The source of the standardized variable. Note, this property is required if \nan id is specified.\n", + "examples": [ + "CDISC" + ], "type": "string" }, "id": { "title": "Standard Mapping - Id", "type": "string", - "description": "The id locating the individual mapping within the given source.\n" + "description": "The id locating the individual mapping within the given source. 
Note, the `standardsMapping[\\d+].source` property is required if \nthis property is specified.\n", + "examples": [ + "C74457" + ] } } } From 0ab4677b551c1f1ac6467f782d6f084d2384d76f Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 14:18:01 -0600 Subject: [PATCH 44/72] Definitions of standardsMappings instrument properties --- .../schemas/dictionary/definitions.yaml | 22 +++++++++++++++---- 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index e02d791..b005dea 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -24,23 +24,37 @@ standardsMappingsInstrumentObject: properties: url: title: Url - description: "" + description: | + A url (e.g., link, address) to a file or other resource containing the instrument, or + a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) + or the individual variable (if at the field level). type: string format: uri examples: - "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" source: type: string + title: Source + description: | + An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository) + containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) + or the individual variable (if at the field level). enum: ["heal-cde"] title: type: string + title: Title examples: - Adult demographics - adult-demographics id: type: string + title: Identifier + description: | + A code or other string that identifies the instrument within the source. 
+ This should always be from the source's formal, standardized identification system + examples: - - "1020" + - "5141" rootStandardsMappingsItem: type: array @@ -82,7 +96,7 @@ fieldStandardsMappingsItem: "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx", "source": "heal-cde", "title": "adult-demographics", - "id": "1020" + "id": "5141" }, "item": { "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE", @@ -115,7 +129,7 @@ fieldStandardsMappingsItem: { "instrument": { "source": "heal-cde", - "id": "1020" + "id": "5141" } } ] From bc05b82011dc592c0916627fe04d1347d7eba537 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 14:19:13 -0600 Subject: [PATCH 45/72] Update build --- .../jsonschema-csvtemplate-fields.html | 8 +++---- ...onschema-jsontemplate-data-dictionary.html | 18 +++++++------- .../jsonschema-csvtemplate-fields.md | 16 +++++++++---- ...jsonschema-jsontemplate-data-dictionary.md | 4 ++-- .../examples/valid/template_submission.csv | 16 ++++++------- .../valid/template_submission_minimal.csv | 16 ++++++------- .../frictionless/csvtemplate/fields.json | 24 ++++++++++++------- .../jsonschema/csvtemplate/fields.json | 9 +++++-- .../schemas/jsonschema/data-dictionary.json | 20 ++++++++++++---- 9 files changed, 80 insertions(+), 51 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 804b0c1..b3e84ee 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -15,12 +15,12 @@
"required"
 

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required|NOT REQUIRED"
 
"No"
-

Type: string Format: uri

Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
-

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
+

Type: string Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
-

Type: string

Example:

"1020"
+

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system


Example:

"5141"
 

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
 

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 63cb6f6..9c1884d 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,7 +1,7 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each indiviual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string Format: uri

Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
-

Type: enum (of string)

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each indiviual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
-

Type: string

Example:

"1020"
+

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system


Example:

"5141"
 

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
@@ -64,7 +64,7 @@
             "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
             "source": "heal-cde",
             "title": "adult-demographics",
-            "id": "1020"
+            "id": "5141"
         },
         "item": {
             "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE",
@@ -85,7 +85,7 @@
     {
         "instrument": {
             "source": "heal-cde",
-            "id": "1020"
+            "id": "5141"
         }
     }
 ]
@@ -115,12 +115,12 @@
         }
     }
 ]
-

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string Format: uri

Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
-

Type: string

Examples:

"Adult demographics"
+

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
-

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system


Example:

"5141"
 

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
 

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 4b5f62f..3d41c24 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -229,7 +229,10 @@ Examples: ``` **`standardsMappings[0].instrument.url`** _(string)_ - + A url (e.g., link, address) to a file or other resource containing the instrument, or +a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) +or the individual variable (if at the field level). + Examples: @@ -239,7 +242,10 @@ Examples: ``` **`standardsMappings[0].instrument.source`** _(string)_ - + An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository) +containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) +or the individual variable (if at the field level). + Must be one of: `heal-cde` **`standardsMappings[0].instrument.title`** _(string)_ @@ -258,12 +264,14 @@ Examples: ``` **`standardsMappings[0].instrument.id`** _(string)_ - + A code or other string that identifies the instrument within the source. 
+This should always be from the source's formal, standardized identification system + Examples: ``` - 1020 + 5141 ``` diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 3ca9625..f304967 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -283,7 +283,7 @@ __**All Fields Mapped (Both Instrument and Item)**__ "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx", "source": "heal-cde", "title": "adult-demographics", - "id": "1020" + "id": "5141" }, "item": { "url": "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE", @@ -316,7 +316,7 @@ __**Only Instrument ID of HEAL CDE Mapped**__ { "instrument": { "source": "heal-cde", - "id": "1020" + "id": "5141" } } ] diff --git a/variable-level-metadata-schema/examples/valid/template_submission.csv b/variable-level-metadata-schema/examples/valid/template_submission.csv index 6aa2ee5..f199e77 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission.csv +++ b/variable-level-metadata-schema/examples/valid/template_submission.csv @@ -1,8 +1,8 @@ 
-section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.type,standardsMappings.label,standardsMappings.url,standardsMappings.source,standardsMappings.id,relatedConcepts.type,relatedConcepts.label,relatedConcepts.url,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count -Enrollment,participant_id,Participant Id,Unique identifier for participant,string,,,,[A-Z][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9],,,,,,,,,,,,,,,,,,,,,,,,,,,,, -Demographics,race,Race,Self-reported race,integer,,,1|2|3|4|5|6|7|8,,,,1=White|2=Black or African American|3=American Indian or Alaska Native|4=Native| 5=Hawaiian or Other Pacific Islander|6=Asian|7=Some other race|8=Multiracial|99=Not reported,,99,,,,cde|cde,NLM race,,NLM|NLM,Fakc6Jy2x|m1_atF7L7U,,,,,,,,,,,,,,,, -Demographics,age,Age,What is your age? (age at enrollment),integer,,,,,90,0,,,,,,,,,,,,,,,,,,,,,,,,,,, -Demographics,hispanic,"Hispanic, Latino, or Spanish Origin","Are you of Hispanic, Latino, or Spanish origin?",boolean,,,,,,,,,Not reported,No,Yes,,,,,,,,,,,,,,,,,,,,,, -Demographics,sex_at_birth,Sex at Birth,The self-reported sex of the participant/subject at birth,string,,,Male|Female|Intersex|None of these describe me|Prefer not to answer|Unknown,,,,,,Prefer not to answer|Unknown,,,,,,,,,,,,,,,,,,,,,,,, -Substance Use,SU4,Heroin Days Used,During the past 30 days how many days did you use heroin (alone or mixed with other drugs)? 
] [Write 0 days if no use],integer,,,,,,,,,,,,,,,,,,ontology|ontology,,https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808|http://purl.bioontology.org/ontology/RXNORM/3304,CHEBI|RXNORM,27808|3304,,,,,,,,,,, -Biomeasures,pulse_rate,Pulse Rate,Heart rate measured at systemic artery,number,,,,,,,,,,,,,,,,,,ontology,SNOMEDCT bioontology,http://purl.bioontology.org/ontology/SNOMEDCT/78564009,SNOMEDCT,78564009,,,,,,,,,,, +section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues +Enrollment,participant_id,Participant Id,Unique identifier for participant,string,,,,[A-Z][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9],,,,,,, +Demographics,race,Race,Self-reported race,integer,,,1|2|3|4|5|6|7|8,,,,1=White|2=Black or African American|3=American Indian or Alaska Native|4=Native| 5=Hawaiian or Other Pacific Islander|6=Asian|7=Some other race|8=Multiracial|99=Not reported,,99,, +Demographics,age,Age,What is your age? (age at enrollment),integer,,,,,90,0,,,,, +Demographics,hispanic,"Hispanic, Latino, or Spanish Origin","Are you of Hispanic, Latino, or Spanish origin?",boolean,,,,,,,,,Not reported,No,Yes +Demographics,sex_at_birth,Sex at Birth,The self-reported sex of the participant/subject at birth,string,,,Male|Female|Intersex|None of these describe me|Prefer not to answer|Unknown,,,,,,Prefer not to answer|Unknown,, +Substance Use,SU4,Heroin Days Used,During the past 30 days how many days did you use heroin (alone or mixed with other drugs)? 
] [Write 0 days if no use],integer,,,,,,,,,,, +Biomeasures,pulse_rate,Pulse Rate,Heart rate measured at systemic artery,number,,,,,,,,,,, diff --git a/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv b/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv index e7fc476..2d5175f 100644 --- a/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv +++ b/variable-level-metadata-schema/examples/valid/template_submission_minimal.csv @@ -1,8 +1,8 @@ -section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,encodings,ordered,missingValues,trueValues,falseValues,repo_link,standardsMappings.type,standardsMappings.label,standardsMappings.url,standardsMappings.source,standardsMappings.id,relatedConcepts.type,relatedConcepts.label,relatedConcepts.url,relatedConcepts.source,relatedConcepts.id,univarStats.median,univarStats.mean,univarStats.std,univarStats.min,univarStats.max,univarStats.mode,univarStats.count,univarStats.twentyFifthPercentile,univarStats.seventyFifthPercentile,univarStats.categoricalMarginals.name,univarStats.categoricalMarginals.count -,participant_id,,Unique identifier for participant,string,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, -,race,,Self-reported race,integer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, -,age,,What is your age? (age at enrollment),integer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, -,hispanic,,"Are you of Hispanic, Latino, or Spanish origin?",boolean,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, -,sex_at_birth,,The self-reported sex of the participant/subject at birth,string,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, -,SU4,,During the past 30 days how many days did you use heroin (alone or mixed with other drugs)? 
] [Write 0 days if no use],integer,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, -,pulse_rate,,Heart rate measured at systemic artery,number,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, +section,name,title,description,type +Enrollment,participant_id,Participant id,Unique identifier for participant,string +Demographics,race,Race,Self-reported race,integer +Demographics,age,Age,What is your age? (age at enrollment),integer +Demographics,hispanic,Hispanic,"Are you of Hispanic, Latino, or Spanish origin?",boolean +Demographics,sex_at_birth,Sex at Birth,The self-reported sex of the participant/subject at birth,string +Substance Use,SU4,Heroin Days Used,During the past 30 days how many days did you use heroin (alone or mixed with other drugs)? ] [Write 0 days if no use],integer +Biomeasures,pulse_rate,Pulse rate,Heart rate measured at systemic artery,number diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index b51777b..2d721a4 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -63,18 +63,18 @@ "type": "string", "constraints": { "enum": [ - "string", - "year", - "yearmonth", - "date", - "datetime", - "integer", "boolean", + "any", "number", + "date", + "time", + "yearmonth", "duration", "geopoint", - "time", - "any" + "year", + "integer", + "datetime", + "string" ] } }, @@ -181,6 +181,7 @@ }, { "name": "standardsMappings[0].instrument.url", + "description": "A url (e.g., link, address) to a file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). 
\n", "title": "Url", "examples": [ "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx" @@ -189,6 +190,8 @@ }, { "name": "standardsMappings[0].instrument.source", + "description": "An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)\ncontaining the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", + "title": "Source", "type": "string", "constraints": { "enum": [ @@ -198,6 +201,7 @@ }, { "name": "standardsMappings[0].instrument.title", + "title": "Title", "examples": [ "Adult demographics", "adult-demographics" @@ -206,8 +210,10 @@ }, { "name": "standardsMappings[0].instrument.id", + "description": "A code or other string that identifies the instrument within the source.\nThis should always be from the source's formal, standardized identification system \n", + "title": "Identifier", "examples": [ - "1020" + "5141" ], "type": "string" }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 0043666..60af21f 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -154,7 +154,7 @@ }, "standardsMappings[0].instrument.url": { "title": "Url", - "description": "", + "description": "A url (e.g., link, address) to a file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). 
\n", "type": "string", "format": "uri", "examples": [ @@ -163,12 +163,15 @@ }, "standardsMappings[0].instrument.source": { "type": "string", + "title": "Source", + "description": "An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)\ncontaining the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", "enum": [ "heal-cde" ] }, "standardsMappings[0].instrument.title": { "type": "string", + "title": "Title", "examples": [ "Adult demographics", "adult-demographics" @@ -176,8 +179,10 @@ }, "standardsMappings[0].instrument.id": { "type": "string", + "title": "Identifier", + "description": "A code or other string that identifies the instrument within the source.\nThis should always be from the source's formal, standardized identification system \n", "examples": [ - "1020" + "5141" ] }, "standardsMappings[0].item.url": { diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 083b7c7..6f7ff42 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -38,7 +38,7 @@ "properties": { "url": { "title": "Url", - "description": "", + "description": "A url (e.g., link, address) to a file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). 
\n", "type": "string", "format": "uri", "examples": [ @@ -47,12 +47,15 @@ }, "source": { "type": "string", + "title": "Source", + "description": "An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)\ncontaining the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", "enum": [ "heal-cde" ] }, "title": { "type": "string", + "title": "Title", "examples": [ "Adult demographics", "adult-demographics" @@ -60,8 +63,10 @@ }, "id": { "type": "string", + "title": "Identifier", + "description": "A code or other string that identifies the instrument within the source.\nThis should always be from the source's formal, standardized identification system \n", "examples": [ - "1020" + "5141" ] } } @@ -260,7 +265,7 @@ }, "standardsMappings": { "type": "array", - "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. 
One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \"1020\"\n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \"1020\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", + "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \"5141\"\n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \"5141\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). 
Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", "items": { "type": "object", "properties": { @@ -271,7 +276,7 @@ "properties": { "url": { "title": "Url", - "description": "", + "description": "A url (e.g., link, address) to a file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", "type": "string", "format": "uri", "examples": [ @@ -280,12 +285,15 @@ }, "source": { "type": "string", + "title": "Source", + "description": "An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)\ncontaining the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). 
\n", "enum": [ "heal-cde" ] }, "title": { "type": "string", + "title": "Title", "examples": [ "Adult demographics", "adult-demographics" @@ -293,8 +301,10 @@ }, "id": { "type": "string", + "title": "Identifier", + "description": "A code or other string that identifies the instrument within the source.\nThis should always be from the source's formal, standardized identification system \n", "examples": [ - "1020" + "5141" ] } } From 0fbe8f1030a74515522f5eb9c9fdedc953455048 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Fri, 5 Jan 2024 18:04:01 -0600 Subject: [PATCH 46/72] Update README rules --- variable-level-metadata-schema/README.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index abe1a97..17dca71 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -173,19 +173,29 @@ To facilitate the mapping of json spec property names to csv property names, th - following from (1), an `enum` must only contain values of the same type - (at least currently) MUST contain only types supported by csv fields which include scalar types (e.g., `boolean`,`string`,`integer`,`number`) in addition to type `object` as this has a stringified representation (see above). -### csv to json cascading +### csv to json and json to csv translations -- If the same value/instance of a property exists at the field level for ALL records (only one unique value) AND this same property is specified in the root level of json specification (after translation of above csv to json property rules), then this unique value will be added to the json root level property BEFORE the translated json document is validated. +There are two rules for conversion from json to csv (or csv to json) specs: -This provides a way to specify root level properties within vlmd csv documents for a few use cases: +1. 
__csv spec field-level property and json spec root-level property match__: If -- in the json schema spec version -- a property is specified at the root-level AND this same property is specified in the field level of the json spec schema + - csv to json: If the same value/instance of a property exists at the field level for ALL records (only one unique value but no missing values) then this unique value -- when translated to the json spec version -- will be moved to the root level data dictionary + - json to csv: All root level properties will be moved to individual field properties BUT field level properties that exist take precedence. + +More concretely, this provides a way to specify root level properties within vlmd csv documents for a few use cases but can generalize to other future additional property matches: 1. specifying the schema version that represents the vlmd document (`schemaVersion`) 2. specifying other data dictionary level properties such as `standardsMappings[0].instrument` +### root and field property cascading pattern +Akin to the above json to csv, more generally: + +All root level properties will be applied to individual fields IF this same field level property is not specified (i.e., field-level takes precedence). 
This strategy can be seen in the [data package standard (but with missingValues)](https://specs.frictionlessdata.io/patterns/#missing-values-per-field) + + ### csv and json vlmd document file naming File names for json and csv translations of a vlmd document SHOULD -have the same stem name (eg `my-heal-dd.csv` and `my-heal-dd.json`) +have the same stem name with corresponding "csv" and "json" suffixes (eg `my-heal-dd.csv` and `my-heal-dd.json`) ## Considerations From 3d5155b1cd9f3a14f881fce369978d8092b1cb26 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Wed, 10 Jan 2024 20:06:19 -0600 Subject: [PATCH 47/72] Added schemaVersion to fields (and made schemaVersion a defnition) --- .../schemas/dictionary/data-dictionary.yaml | 13 +------------ .../schemas/dictionary/definitions.yaml | 18 +++++++++++++++++- .../schemas/dictionary/fields.yaml | 2 ++ 3 files changed, 20 insertions(+), 13 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index c0046d6..42a17e6 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -13,19 +13,8 @@ properties: description: type: string schemaVersion: - type: string - description: | - The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) - - NOTE: This is NOT for versioning of each indiviual data dictionary instance. - Rather, it is the - version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance - version. 
+ $ref: "#/definitions/schemaVersion" - If generating a vlmd document as a csv file, include this version in - every row/record to indicate this is a schema level property - (not applicable for the json version as this property is already at the schema/root level) - pattern: \d+\.\d+\.\d+ version: # TODO: think about having a version text/message and id (akin to a git commit) type: string description: The specified individual data dictionary instance version. diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index b005dea..4d1b535 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -6,7 +6,23 @@ csvObject: type: string pattern: ^(?:.*?=.*?(?:\||$))+$ - +schemaVersion: + type: string + description: | + The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) + + NOTE: This is NOT for versioning of each indiviual data dictionary instance. + Rather, it is the + version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance + version. 
+ + If generating a vlmd document as a csv file, include this version in + every row/record to indicate this is a schema level property + (not applicable for the json version as this property is already at the schema/root level) + pattern: \d+\.\d+\.\d+ + examples: + - "1.0.0" + - "0.2.0" standardsMappingsInstrumentObject: type: object title: Standard mapping - instrument diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 1908d5d..b287400 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -14,6 +14,8 @@ required: - name - description properties: + schemaVersion: + $ref: "#/definitions/schemaVersion" section: type: string title: Section From bf7543f82b66d14b555382d577dc405e4e75347c Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 11 Jan 2024 12:20:46 -0600 Subject: [PATCH 48/72] Added contraints.required --- .../schemas/dictionary/fields.yaml | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index b287400..4c52134 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -129,6 +129,13 @@ properties: constraints: type: object properties: + required: + type: boolean + title: Required variable + description: | + If this variable is marked as true, then this variable's value must be present + (ie not missing; see missingValues). If marked as false or not present, then the + variable CAN be missing. 
maxLength: type: integer title: Maximum Length From d863b1796833a38ecbef291affab6c5e10241b0f Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Thu, 11 Jan 2024 12:21:37 -0600 Subject: [PATCH 49/72] Update build --- .../jsonschema-csvtemplate-fields.html | 8 +++-- ...onschema-jsontemplate-data-dictionary.html | 12 ++++--- .../jsonschema-csvtemplate-fields.md | 18 +++++++++++ ...jsonschema-jsontemplate-data-dictionary.md | 32 +++++++++++++++++++ .../frictionless/csvtemplate/fields.json | 20 +++++++++--- .../jsonschema/csvtemplate/fields.json | 11 ++++++- .../schemas/jsonschema/data-dictionary.json | 20 +++++++++++- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 2 ++ 9 files changed, 110 insertions(+), 15 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index b3e84ee..3b55cb3 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,11 +1,13 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each indiviual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each indiviual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+
"0.2.0"
+

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
 

Type: string

The human-readable title or label of the variable.


Example:

"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
 
"What is the highest grade or level of school you have completed or the highest degree you have received?"
-

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
See here
for more information about appropriate format values by variable type.

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5"
+

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
See here
for more information about appropriate format values by variable type.

Type: boolean

If this variable is marked as true, then this variable's value must be present
(ie not missing; see missingValues). If marked as false or not present, then the
variable CAN be missing.

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: string

Constrains possible values to a set of values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"1|2|3|4|5"
 
"Poor|Fair|Good|Very good|Excellent"
 

Type: string

A regular expression pattern the data MUST conform to.

Type: integer

Specifies the maximum value of a field (e.g., maximum -- or most
recent -- date, maximum integer etc). Note, this is different than the
maxLength property.

Type: integer

Specifies the minimum value of a field.

Type: string

Variable value encodings provide a way to further annotate any value within any variable type,
making values easier to understand.

Many analytic software programs (e.g., SPSS, Stata, and SAS) use numerical encodings and some algorithms
only support numerical values. Encodings (and mappings) allow categorical values to be stored as
numerical values.

Additionally, as another use case, this field provides a way to
store categoricals that are stored as "short" labels (such as
abbreviations).

This field is intended to follow this pattern

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
Examples:

"1=Poor|2=Fair|3=Good|4=Very good|5=Excellent"
 
"HW=Hello world|GBW=Good bye world|HM=Hi, Mike"
@@ -23,4 +25,4 @@
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
 

Type: stringFormat: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 9c1884d..fe90201 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,15 +1,19 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each indiviual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note, only instrument can be mapped to this property as opposed to the fields standardsMappings
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification)

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: stringFormat: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2)

NOTE: This is NOT for versioning of each indiviual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+
"0.2.0"
+
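The schemaVersion pattern can be checked with a few lines of Python. This is a minimal sketch, not part of the schema tooling: the helper names are hypothetical. It relies on the fact that a JSON Schema `pattern` is not implicitly anchored, so a validator accepts any string that contains a match; the `fullmatch` variant is a stricter check assumed here for illustration.

```python
import re

# Pattern copied from the schema; the helper names below are illustrative only.
SCHEMA_VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def matches_schema_pattern(value: str) -> bool:
    """Mimic JSON Schema `pattern` semantics: a match anywhere in the string is enough."""
    return SCHEMA_VERSION_PATTERN.search(value) is not None


def is_strict_version(value: str) -> bool:
    """Stricter check (an assumption of this sketch): the whole string must be three dotted numbers."""
    return SCHEMA_VERSION_PATTERN.fullmatch(value) is not None
```

Under these assumptions a value such as `v1.0.0-beta` passes the schema-style check but fails the strict one, which may matter if downstream tools parse the version.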

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only an instrument can be mapped to this property, unlike the fields standardsMappings.
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).
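The cascading rule in the note can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that a field-level standardsMappings list, when present, replaces the root list entirely rather than being merged item by item; the schema text does not spell out merge behavior.

```python
from typing import Any


def effective_standards_mappings(
    root: dict[str, Any], field: dict[str, Any]
) -> list[dict[str, Any]]:
    """Return the standardsMappings that apply to one field.

    Assumption: a field-level list, when present, fully overrides the root list.
    """
    if "standardsMappings" in field:
        return field["standardsMappings"]
    return root.get("standardsMappings", [])


# Illustrative document; values are taken from the examples in this schema.
data_dictionary = {
    "standardsMappings": [{"instrument": {"source": "heal-cde", "id": "5141"}}],
    "fields": [
        {"name": "gender_id", "description": "Gender identity"},
        {
            "name": "race",
            "description": "Participant race",
            "standardsMappings": [{"item": {"source": "CDISC", "id": "C74457"}}],
        },
    ],
}

resolved = {
    f["name"]: effective_standards_mappings(data_dictionary, f)
    for f in data_dictionary["fields"]
}
```

Here `gender_id` inherits the root instrument mapping, while `race` uses its own field-level mapping.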

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
 

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
 

Type: string

A code or other string that identifies the instrument within the source.
This should always come from the source's formal, standardized identification system.


Example:

"5141"
-

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
+

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2).

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the version of THIS schema document. See the version property (below) if specifying
the individual data dictionary instance version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+
"0.2.0"
+

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
 
"Medical History"
 

Type: string

The name of a variable (i.e., field) as it appears in the data.


Example:

"gender_id"
 

Type: string

The human-readable title or label of the variable.


Example:

"Gender identity"
 

Type: string

An extended description of the variable. This could be the definition of a variable or the
question text (e.g., if a survey).


Examples:

"The participant's age at the time of study enrollment"
 
"What is the highest grade or level of school you have completed or the highest degree you have received?"
-

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
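See the Frictionless Table Schema types and formats section (https://specs.frictionlessdata.io/table-schema/#types-and-formats)
for more information about appropriate format values by variable type.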

Type: object

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.

Type: array

Constrains possible values to a set of values.


Examples:

[
+

Type: enum (of string)

A classification or category of a particular data element or property expected or allowed in the dataset.

Must be one of:

  • "number"
  • "integer"
  • "string"
  • "any"
  • "boolean"
  • "date"
  • "datetime"
  • "time"
  • "year"
  • "yearmonth"
  • "duration"
  • "geopoint"

Type: string

Indicates the format of the type specified in the type property.
Each format is dependent on the type specified.
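See the Frictionless Table Schema types and formats section (https://specs.frictionlessdata.io/table-schema/#types-and-formats)
for more information about appropriate format values by variable type.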

Type: object

Type: boolean

If this property is set to true, then the variable's value must be present
(i.e., not missing; see missingValues). If it is false or not present, then the
variable CAN be missing.
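A minimal Python sketch of how a checker might apply this constraint. The function name is hypothetical, and the default of treating the empty string as the only missing value is an assumption borrowed from the Frictionless Table Schema default for missingValues.

```python
def violates_required(value, constraints, missing_values=("",)):
    """Return True when a required variable has a missing value.

    `constraints` is the field's constraints object; `missing_values`
    stands in for the field's missingValues (assumed default: empty string).
    """
    is_missing = value is None or value in missing_values
    return bool(constraints.get("required", False)) and is_missing
```

For example, `violates_required("-99", {"required": True}, missing_values=("-99",))` flags a coded missing value, while the same value is accepted when `required` is absent.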

Type: integer

Indicates the maximum length of an iterable (e.g., array, string, or
object). For example, if 'Hello World' is the longest value of a
categorical variable, this would be a maxLength of 11.
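The 'Hello World' example can be reproduced with a short Python sketch. The helper name is hypothetical, and converting each value with `str` before measuring is an assumption about how values are stored.

```python
def observed_max_length(values):
    """Return the length of the longest value, or 0 for an empty column."""
    return max((len(str(v)) for v in values), default=0)


# Illustrative categorical values; 'Hello World' is the longest at 11 characters.
categories = ["Hello", "Hello World", "Hi"]
max_length = observed_max_length(categories)
```

The result could then be recorded as the field's `constraints.maxLength`.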

Type: array

Constrains possible values to a set of values.


Examples:

[
     1,
     2,
     3,
@@ -123,4 +127,4 @@
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
 

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 3d41c24..70afa2a 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -29,6 +29,18 @@ If generating a vlmd document as a csv file, include this version in every row/record to indicate this is a schema level property (not applicable for the json version as this property is already at the schema/root level) +Examples: + + +``` + 1.0.0 + +``` + +``` + 0.2.0 + +``` **`section`** _(string)_ The section, form, survey instrument, set of measures or other broad category used @@ -103,6 +115,12 @@ See [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) for more information about appropriate `format` values by variable `type`. +**`constraints.required`** _(boolean)_ + If this variable is marked as true, then this variable's value must be present +(ie not missing; see missingValues). If marked as false or not present, then the +variable CAN be missing. + + **`constraints.maxLength`** _(integer)_ Indicates the maximum length of an iterable (e.g., array, string, or object). For example, if 'Hello World' is the longest value of a diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index f304967..df751f3 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -49,6 +49,31 @@ metadata object within the HEAL platform metadata service. 
### Properties for each `fields` record +**`schemaVersion`** _(string)_ + The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) + +NOTE: This is NOT for versioning of each indiviual data dictionary instance. +Rather, it is the +version of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance +version. + +If generating a vlmd document as a csv file, include this version in +every row/record to indicate this is a schema level property +(not applicable for the json version as this property is already at the schema/root level) + +Examples: + + +``` + 1.0.0 + +``` + +``` + 0.2.0 + +``` + **`section`** _(string)_ The section, form, survey instrument, set of measures or other broad category used to group variables. Previously called "module." @@ -126,6 +151,13 @@ for more information about appropriate `format` values by variable `type`. +- **`required`** _(boolean)_ + If this variable is marked as true, then this variable's value must be present + (ie not missing; see missingValues). If marked as false or not present, then the + variable CAN be missing. + + + - **`maxLength`** _(integer)_ Indicates the maximum length of an iterable (e.g., array, string, or object). For example, if 'Hello World' is the longest value of a diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 2d721a4..901a715 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -6,6 +6,10 @@ { "name": "schemaVersion", "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. 
See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", + "examples": [ + "1.0.0", + "0.2.0" + ], "type": "string", "constraints": { "pattern": "\\d+\\.\\d+\\.\\d+" @@ -63,17 +67,17 @@ "type": "string", "constraints": { "enum": [ + "geopoint", + "year", "boolean", + "yearmonth", "any", - "number", "date", "time", - "yearmonth", "duration", - "geopoint", - "year", - "integer", "datetime", + "integer", + "number", "string" ] } @@ -84,6 +88,12 @@ "title": "Variable Format", "type": "string" }, + { + "name": "constraints.required", + "description": "If this variable is marked as true, then this variable's value must be present\n(ie not missing; see missingValues). If marked as false or not present, then the \nvariable CAN be missing.\n", + "title": "Required variable", + "type": "boolean" + }, { "name": "constraints.maxLength", "description": "Indicates the maximum length of an iterable (e.g., array, string, or\nobject). For example, if 'Hello World' is the longest value of a\ncategorical variable, this would be a maxLength of 11.\n", diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 60af21f..e929840 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -11,7 +11,11 @@ "schemaVersion": { "type": "string", "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. 
See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", - "pattern": "\\d+\\.\\d+\\.\\d+" + "pattern": "\\d+\\.\\d+\\.\\d+", + "examples": [ + "1.0.0", + "0.2.0" + ] }, "section": { "type": "string", @@ -74,6 +78,11 @@ "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. \nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) \nfor more information about appropriate `format` values by variable `type`.\n", "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nExamples of string formats\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., 
f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" }, + "constraints.required": { + "type": "boolean", + "title": "Required variable", + "description": "If this variable is marked as true, then this variable's value must be present\n(ie not missing; see missingValues). If marked as false or not present, then the \nvariable CAN be missing.\n" + }, "constraints.maxLength": { "type": "integer", "title": "Maximum Length", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 6f7ff42..6d28d50 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -19,7 +19,11 @@ "schemaVersion": { "type": "string", "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. 
See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", - "pattern": "\\d+\\.\\d+\\.\\d+" + "pattern": "\\d+\\.\\d+\\.\\d+", + "examples": [ + "1.0.0", + "0.2.0" + ] }, "version": { "type": "string", @@ -85,6 +89,15 @@ "description" ], "properties": { + "schemaVersion": { + "type": "string", + "description": "The version of the schema used in agreed upon convention of major.minor.path (e.g., 1.0.2) \n\nNOTE: This is NOT for versioning of each indiviual data dictionary instance. \nRather, it is the\nversion of THIS schema document. See `version` property (below) if specifying the individual data dictionary instance\nversion.\n\nIf generating a vlmd document as a csv file, include this version in \nevery row/record to indicate this is a schema level property \n(not applicable for the json version as this property is already at the schema/root level)\n", + "pattern": "\\d+\\.\\d+\\.\\d+", + "examples": [ + "1.0.0", + "0.2.0" + ] + }, "section": { "type": "string", "title": "Section", @@ -149,6 +162,11 @@ "constraints": { "type": "object", "properties": { + "required": { + "type": "boolean", + "title": "Required variable", + "description": "If this variable is marked as true, then this variable's value must be present\n(ie not missing; see missingValues). 
If marked as false or not present, then the \nvariable CAN be missing.\n" + }, "maxLength": { "type": "integer", "title": "Maximum Length", diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index 62d26d1..10d8f50 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -schemaVersion,section,name,title,description,type,format,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].type,relatedConcepts[0].label,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file +schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].type,relatedConcepts[0].label,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 2de1e47..bd0d36b 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ 
b/variable-level-metadata-schema/templates/template_submission.json @@ -16,6 +16,7 @@ ], "fields": [ { + "schemaVersion": null, "section": null, "name": null, "title": null, @@ -23,6 +24,7 @@ "type": null, "format": null, "constraints": { + "required": null, "maxLength": null, "enum": [], "pattern": null, From 9fd743dd89318757f523040a23529a1ce11a57b1 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 09:55:43 -0600 Subject: [PATCH 50/72] Update annotations --- .../schemas/dictionary/definitions.yaml | 11 ++++---- .../schemas/dictionary/fields.yaml | 28 +++++++++---------- 2 files changed, 20 insertions(+), 19 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml index 4d1b535..1115c45 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/definitions.yaml @@ -199,12 +199,12 @@ fieldStandardsMappingsItem: item: type: object - title: Standard mapping - item + title: Standards mappings - Item description: | A standardized item (ie field, variable etc) mapped to this individual variable. properties: url: - title: Standards Mapping - Url + title: Standards mappings - Url description: | The url that links out to the published, standardized mapping of a variable (e.g., common data element) type: string @@ -212,7 +212,7 @@ fieldStandardsMappingsItem: examples: - "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE" source: - title: Standard Mapping - Source + title: Standards mappings - Source description: | The source of the standardized variable. Note, this property is required if an id is specified. @@ -220,10 +220,11 @@ fieldStandardsMappingsItem: - "CDISC" type: string id: - title: Standard Mapping - Id + title: Standards Mappings - Id type: string description: | - The id locating the individual mapping within the given source. 
Note, the `standardsMapping[\d+].source` property is required if + The id locating the individual mapping within the given source. + Note, the `standardsMappings[0].source` property is required if this property is specified. examples: - "C74457" diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 4c52134..051372f 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -100,23 +100,23 @@ properties: Examples of date time pattern formats - - "`%Y-%m-%d` (for date, e.g., 2023-05-25)" - - "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" - - "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" - - "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" - - "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" - - "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" - - "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" - - "`%H:%M:%S` (for time, e.g., 10:30:45)" - - "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" - - "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" + - `%Y-%m-%d` (for date, e.g., 2023-05-25) + - `%Y%-%d` (for date, e.g., 20230525) for date without dashes + - `%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45) + - `%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z) + - `%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300) + - `%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30) + - `%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10) + - `%H:%M:%S` (for time, e.g., 10:30:45) + - `%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z) + - `%H:%M:%S%z` (for time with timezone offset, e.g., 
10:30:45+0300) Examples of string formats - - "`email` if valid emails (e.g., test@gmail.com)" - - "`uri` if valid uri addresses (e.g., https://example.com/resource123)" - - "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" - - "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" + - `email` if valid emails (e.g., test@gmail.com) + - `uri` if valid uri addresses (e.g., https://example.com/resource123) + - `binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=) + - `uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479) Examples of geopoint formats From 6edf2f6c78d825f88cc6b9bfc6214c75997d2234 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 09:56:20 -0600 Subject: [PATCH 51/72] Update build --- .../jsonschema-csvtemplate-fields.html | 4 +-- ...onschema-jsontemplate-data-dictionary.html | 4 +-- .../jsonschema-csvtemplate-fields.md | 31 ++++++++++--------- ...jsonschema-jsontemplate-data-dictionary.md | 28 ++++++++--------- .../frictionless/csvtemplate/fields.json | 24 +++++++------- .../jsonschema/csvtemplate/fields.json | 10 +++--- .../schemas/jsonschema/data-dictionary.json | 12 +++---- 7 files changed, 57 insertions(+), 56 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 3b55cb3..53a4056 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -23,6 +23,6 @@

Type: string

A code or other string that identifies the instrument within the source.
This should always come from the source's formal, standardized identification system.


Example:

"5141"
 

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
-

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
+

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.
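The dependency between id and source can be expressed as a small check. This is a sketch in Python with a hypothetical function name; it assumes the item object has already been parsed into a dictionary with optional `id` and `source` keys.

```python
def item_mapping_errors(item: dict) -> list[str]:
    """Return a list of problems for one standardsMappings item object."""
    errors = []
    # An id only identifies a mapping when the source it belongs to is known.
    if item.get("id") and not item.get("source"):
        errors.append("item.id is set but item.source is missing")
    return errors
```

An item such as `{"source": "CDISC", "id": "C74457"}` yields no errors, while `{"id": "C74457"}` alone is reported.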


Example:

"C74457"
 

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index fe90201..2d38aed 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -125,6 +125,6 @@

Type: string

A code or other string that identifies the instrument within the source.
This should always come from the source's formal, standardized identification system.


Example:

"5141"
 

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
-

Type: string

The id locating the individual mapping within the given source. Note, the standardsMapping[\d+].source property is required if
this property is specified.


Example:

"C74457"
+

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
 

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Type: string

A free-text label for the mapping to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 70afa2a..dd851e3 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -317,7 +317,8 @@ Examples: ``` **`standardsMappings[0].item.id`** _(string)_ - The id locating the individual mapping within the given source. Note, the `standardsMapping[\d+].source` property is required if + The id locating the individual mapping within the given source. +Note, the `standardsMappings[0].source` property is required if this property is specified. Examples: @@ -378,23 +379,23 @@ Examples: Examples of date time pattern formats -- "`%Y-%m-%d` (for date, e.g., 2023-05-25)" -- "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" -- "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" -- "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" -- "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" -- "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" -- "`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" -- "`%H:%M:%S` (for time, e.g., 10:30:45)" -- "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" -- "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" +- `%Y-%m-%d` (for date, e.g., 2023-05-25) +- `%Y%-%d` (for date, e.g., 20230525) for date without dashes +- `%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45) +- `%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z) +- `%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300) +- `%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 
2023-05-25T10:30) +- `%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10) +- `%H:%M:%S` (for time, e.g., 10:30:45) +- `%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z) +- `%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300) Examples of string formats -- "`email` if valid emails (e.g., test@gmail.com)" -- "`uri` if valid uri addresses (e.g., https://example.com/resource123)" -- "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" -- "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" +- `email` if valid emails (e.g., test@gmail.com) +- `uri` if valid uri addresses (e.g., https://example.com/resource123) +- `binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=) +- `uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479) Examples of geopoint formats diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index df751f3..3943ae8 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -421,23 +421,23 @@ ontological information (eg., NCI thesaurus, bioportal etc) Examples of date time pattern formats -- "`%Y-%m-%d` (for date, e.g., 2023-05-25)" -- "`%Y%-%d` (for date, e.g., 20230525) for date without dashes" -- "`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)" -- "`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)" -- "`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)" -- "`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)" -- 
"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)" -- "`%H:%M:%S` (for time, e.g., 10:30:45)" -- "`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)" -- "`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)" +- `%Y-%m-%d` (for date, e.g., 2023-05-25) +- `%Y%-%d` (for date, e.g., 20230525) for date without dashes +- `%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45) +- `%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z) +- `%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300) +- `%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30) +- `%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10) +- `%H:%M:%S` (for time, e.g., 10:30:45) +- `%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z) +- `%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300) Examples of string formats -- "`email` if valid emails (e.g., test@gmail.com)" -- "`uri` if valid uri addresses (e.g., https://example.com/resource123)" -- "`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)" -- "`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)" +- `email` if valid emails (e.g., test@gmail.com) +- `uri` if valid uri addresses (e.g., https://example.com/resource123) +- `binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=) +- `uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479) Examples of geopoint formats diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 901a715..d154868 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ 
b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "geopoint", - "year", - "boolean", - "yearmonth", + "string", "any", - "date", - "time", "duration", - "datetime", "integer", + "datetime", + "time", + "boolean", + "yearmonth", "number", - "string" + "year", + "date", + "geopoint" ] } }, @@ -230,7 +230,7 @@ { "name": "standardsMappings[0].item.url", "description": "The url that links out to the published, standardized mapping of a variable (e.g., common data element)\n", - "title": "Standards Mapping - Url", + "title": "Standards mappings - Url", "examples": [ "https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE" ], @@ -239,7 +239,7 @@ { "name": "standardsMappings[0].item.source", "description": "The source of the standardized variable. Note, this property is required if \nan id is specified.\n", - "title": "Standard Mapping - Source", + "title": "Standards mappings - Source", "examples": [ "CDISC" ], @@ -247,8 +247,8 @@ }, { "name": "standardsMappings[0].item.id", - "description": "The id locating the individual mapping within the given source. Note, the `standardsMapping[\\d+].source` property is required if \nthis property is specified.\n", - "title": "Standard Mapping - Id", + "description": "The id locating the individual mapping within the given source. 
\nNote, the `standardsMappings[0].source` property is required if \nthis property is specified.\n", + "title": "Standards Mappings - Id", "examples": [ "C74457" ], diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index e929840..a7e3369 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -76,7 +76,7 @@ "title": "Variable Format", "type": "string", "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. \nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) \nfor more information about appropriate `format` values by variable `type`.\n", - "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nExamples of string formats\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a 
universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" + "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- `%Y-%m-%d` (for date, e.g., 2023-05-25)\n- `%Y%-%d` (for date, e.g., 20230525) for date without dashes\n- `%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\n- `%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\n- `%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\n- `%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\n- `%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\n- `%H:%M:%S` (for time, e.g., 10:30:45)\n- `%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\n- `%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\n\nExamples of string formats\n\n- `email` if valid emails (e.g., test@gmail.com)\n- `uri` if valid uri addresses (e.g., https://example.com/resource123)\n- `binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\n- `uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" }, "constraints.required": { "type": "boolean", @@ -195,7 +195,7 @@ ] }, "standardsMappings[0].item.url": { - "title": "Standards Mapping - Url", + "title": "Standards mappings - Url", "description": "The url that links out to the published, standardized mapping of a variable (e.g., common data element)\n", "type": "string", 
"format": "uri", @@ -204,7 +204,7 @@ ] }, "standardsMappings[0].item.source": { - "title": "Standard Mapping - Source", + "title": "Standards mappings - Source", "description": "The source of the standardized variable. Note, this property is required if \nan id is specified.\n", "examples": [ "CDISC" @@ -212,9 +212,9 @@ "type": "string" }, "standardsMappings[0].item.id": { - "title": "Standard Mapping - Id", + "title": "Standards Mappings - Id", "type": "string", - "description": "The id locating the individual mapping within the given source. Note, the `standardsMapping[\\d+].source` property is required if \nthis property is specified.\n", + "description": "The id locating the individual mapping within the given source. \nNote, the `standardsMappings[0].source` property is required if \nthis property is specified.\n", "examples": [ "C74457" ] diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 6d28d50..71c217b 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -157,7 +157,7 @@ "title": "Variable Format", "type": "string", "description": "Indicates the format of the type specified in the `type` property. \nEach format is dependent on the `type` specified. 
\nSee [here](https://specs.frictionlessdata.io/table-schema/#types-and-formats) \nfor more information about appropriate `format` values by variable `type`.\n", - "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- \"`%Y-%m-%d` (for date, e.g., 2023-05-25)\"\n- \"`%Y%-%d` (for date, e.g., 20230525) for date without dashes\"\n- \"`%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\"\n- \"`%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\"\n- \"`%Y-%m-%dT%H:%M:%S%z` (for datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\"\n- \"`%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\"\n- \"`%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\"\n- \"`%H:%M:%S` (for time, e.g., 10:30:45)\"\n- \"`%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\"\n- \"`%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\"\n\nExamples of string formats\n\n- \"`email` if valid emails (e.g., test@gmail.com)\"\n- \"`uri` if valid uri addresses (e.g., https://example.com/resource123)\"\n- \"`binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\"\n- \"`uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\"\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" + "additionalDescription": "examples/definitions of patterns and possible values:\n\nExamples of date time pattern formats\n\n- `%Y-%m-%d` (for date, e.g., 2023-05-25)\n- `%Y%-%d` (for date, e.g., 20230525) for date without dashes\n- `%Y-%m-%dT%H:%M:%S` (for datetime, e.g., 2023-05-25T10:30:45)\n- `%Y-%m-%dT%H:%M:%SZ` (for datetime with UTC timezone, e.g., 2023-05-25T10:30:45Z)\n- `%Y-%m-%dT%H:%M:%S%z` (for 
datetime with timezone offset, e.g., 2023-05-25T10:30:45+0300)\n- `%Y-%m-%dT%H:%M` (for datetime without seconds, e.g., 2023-05-25T10:30)\n- `%Y-%m-%dT%H` (for datetime without minutes and seconds, e.g., 2023-05-25T10)\n- `%H:%M:%S` (for time, e.g., 10:30:45)\n- `%H:%M:%SZ` (for time with UTC timezone, e.g., 10:30:45Z)\n- `%H:%M:%S%z` (for time with timezone offset, e.g., 10:30:45+0300)\n\nExamples of string formats\n\n- `email` if valid emails (e.g., test@gmail.com)\n- `uri` if valid uri addresses (e.g., https://example.com/resource123)\n- `binary` if a base64 binary encoded string (e.g., authentication token like aGVsbG8gd29ybGQ=)\n- `uuid` if a universal unique identifier also known as a guid (eg., f47ac10b-58cc-4372-a567-0e02b2c3d479)\n\n\nExamples of geopoint formats\n\nThe two types of formats for `geopoint` (describing a geographic point).\n\n- `array` (if 'lat,long' (e.g., 36.63,-90.20))\n- `object` (if {'lat':36.63,'lon':-90.20})\n" }, "constraints": { "type": "object", @@ -329,11 +329,11 @@ }, "item": { "type": "object", - "title": "Standard mapping - item", + "title": "Standards mappings - Item", "description": "A standardized item (ie field, variable etc) mapped to this individual variable.\n", "properties": { "url": { - "title": "Standards Mapping - Url", + "title": "Standards mappings - Url", "description": "The url that links out to the published, standardized mapping of a variable (e.g., common data element)\n", "type": "string", "format": "uri", @@ -342,7 +342,7 @@ ] }, "source": { - "title": "Standard Mapping - Source", + "title": "Standards mappings - Source", "description": "The source of the standardized variable. Note, this property is required if \nan id is specified.\n", "examples": [ "CDISC" @@ -350,9 +350,9 @@ "type": "string" }, "id": { - "title": "Standard Mapping - Id", + "title": "Standards Mappings - Id", "type": "string", - "description": "The id locating the individual mapping within the given source. 
Note, the `standardsMapping[\\d+].source` property is required if \nthis property is specified.\n", + "description": "The id locating the individual mapping within the given source. \nNote, the `standardsMappings[0].source` property is required if \nthis property is specified.\n", "examples": [ "C74457" ] From 5d107b16bcd540a06ff647523d0b947414c38454 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 10:11:36 -0600 Subject: [PATCH 52/72] Update build with new annotations --- .../docs/assets/templates/csvtemplate.md | 7 +++++++ .../jsonschema-csvtemplate-fields.html | 4 ++-- ...sonschema-jsontemplate-data-dictionary.html | 6 +++--- .../jsonschema-csvtemplate-fields.md | 10 ++++++++-- .../jsonschema-jsontemplate-data-dictionary.md | 5 ++--- .../schemas/dictionary/data-dictionary.yaml | 2 +- .../schemas/dictionary/fields.yaml | 5 ++--- .../frictionless/csvtemplate/fields.json | 18 +++++++++--------- .../schemas/jsonschema/csvtemplate/fields.json | 2 +- .../schemas/jsonschema/data-dictionary.json | 4 ++-- 10 files changed, 37 insertions(+), 26 deletions(-) diff --git a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md index aa8bc92..7c2227a 100644 --- a/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md +++ b/variable-level-metadata-schema/docs/assets/templates/csvtemplate.md @@ -2,6 +2,13 @@ _version {{ schema.version }}_ + + +The aim of this HEAL metadata piece is to track and provide basic information about variables in a tabular data file (i.e. a data file with rows and columns) from your HEAL study. The objective is to list all variables and descriptive information about those variables. This will ensure that potential secondary data users know what data has been collected or calculated and how to use these data. Note that a given study can have multiple tabular data files; You should create a data dictionary for each tabular data file. 
Thus, a study may have multiple data dictionaries. {{ schema.description }} diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 53a4056..90274f7 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,4 +1,4 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
 
"0.2.0"
 

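The version rule above can be checked before a CSV is written. The following is a minimal sketch, not the project's own validator: the column name `schemaVersion` and both helper functions are hypothetical, and applying the pattern to the whole string is an assumption (JSON Schema patterns are unanchored by default).

```python
import re

# Pattern quoted from the schema text above. Matching the full string is an
# assumption; a JSON Schema validator would accept a match anywhere.
SCHEMA_VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_valid_schema_version(value: str) -> bool:
    """Return True if value has the major.minor.patch shape, e.g. "0.2.0"."""
    return SCHEMA_VERSION_PATTERN.fullmatch(value) is not None


def add_schema_version(rows: list, version: str) -> list:
    """Return copies of the CSV rows with the schema version on every row.

    "schemaVersion" is a hypothetical column name used for illustration.
    """
    if not is_valid_schema_version(version):
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    return [{**row, "schemaVersion": version} for row in rows]
```

Calling `add_schema_version([{"name": "age"}], "0.2.0")` would stamp the single row with the version, while a value such as `"1.0"` would be rejected.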
Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
@@ -25,4 +25,4 @@
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
 

Type: string; Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 2d38aed..a854600 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -1,10 +1,10 @@ - Variable Level Metadata (Data Dictionaries)

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note a given study can have multiple data dictionaries

Type: string

Type: string

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+ Variable Level Metadata (Data Dictionaries) 

Variable Level Metadata (Data Dictionaries)

Type: object

This schema defines the variable level metadata for one data dictionary for a given study. Note a given study can have multiple data dictionaries.

Type: string

Type: string

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
 
"0.2.0"
 

Type: string

The specified individual data dictionary instance version.

Type: array

A set of standardized instruments linked to all variables within the fields property (but see note).

!!! note "NOTE"

If standardsMappings is present at both the root (this property) and within fields,
then the fields standardsMappings property takes precedence.

Note that only an instrument can be mapped to this property, as opposed to the fields standardsMappings.
This property has the same specification as the fields standardsMappings to make the cascading logic
easier to understand, in the same way other standards implement cascading
(e.g., missingValues in the frictionless specification).
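
The precedence rule in this note can be sketched as a small lookup. This is an illustration only: the function name is hypothetical, and reading "present" as "the key exists on the field" is an assumption rather than something the schema text states.

```python
def effective_standards_mappings(data_dictionary: dict, field: dict) -> list:
    """Return the standardsMappings that apply to one field.

    Assumption: "present" means the key exists on the field, so a field-level
    value always wins over the root-level one, even when it is an empty list.
    """
    if "standardsMappings" in field:
        return field["standardsMappings"]
    # Fall back to the root-level (instrument) mappings, if any.
    return data_dictionary.get("standardsMappings", [])


# Example values are taken from the examples shown in this schema document.
dictionary = {
    "standardsMappings": [{"instrument": {"source": "heal-cde", "id": "5141"}}],
    "fields": [
        {"name": "race", "standardsMappings": [{"item": {"source": "CDISC", "id": "C74457"}}]},
        {"name": "age"},
    ],
}
for field in dictionary["fields"]:
    print(field["name"], effective_standards_mappings(dictionary, field))
```

Here the `race` field keeps its own item mapping, while `age` inherits the root-level instrument mapping.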

Each item of this array must be:

Type: object

Type: object

A standardized set of items which encompass
a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

!!! note "NOTE"

If information is present at both the root and the field level,
then the information at the field level would take precedence (i.e., it would cascade).

Type: string; Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
 

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
 

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system.


Example:

"5141"
-

Type: array of object

Each item of this array must be:

Type: object

Variable level metadata individual fields integrated into the variable level
metadata object within the HEAL platform metadata service.

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+

Type: array of object

Each item of this array must be:

Type: object

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used, in the agreed-upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
 
"0.2.0"
 

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
@@ -127,4 +127,4 @@
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
 

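The dependency between `id` and `source` described above (an `id` is only meaningful together with its `source`) could be checked along these lines. This is a hedged sketch, not the schema's own validation: the function name and the error wording are hypothetical, and the nested `item` layout follows the JSON form of the schema.

```python
def missing_source_errors(field: dict) -> list:
    """List the standardsMappings entries that give an item id without a source."""
    errors = []
    for index, mapping in enumerate(field.get("standardsMappings", [])):
        item = mapping.get("item", {})
        # An id only locates a mapping within a source, so source must be set.
        if "id" in item and not item.get("source"):
            errors.append(
                f"standardsMappings[{index}].item.source is required "
                "when item.id is specified"
            )
    return errors
```

A field whose mapping carries both `"source": "CDISC"` and `"id": "C74457"` would yield no errors; dropping the source would yield one message for that entry.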
Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Each item of this array must be:

Type: object

Type: string; Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file +

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index dd851e3..e309337 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -2,9 +2,15 @@ _version 0.2.0_ + + +The aim of this HEAL metadata piece is to track and provide basic information about variables in a tabular data file (i.e. a data file with rows and columns) from your HEAL study. The objective is to list all variables and descriptive information about those variables. This will ensure that potential secondary data users know what data has been collected or calculated and how to use these data. Note that a given study can have multiple tabular data files; You should create a data dictionary for each tabular data file. Thus, a study may have multiple data dictionaries. + -Variable level metadata individual fields integrated into the variable level -metadata object within the HEAL platform metadata service. !!! 
note "Highly encouraged" diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 3943ae8..bdb7f87 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -2,7 +2,7 @@ _version 0.2.0_ -This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries +This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries. ## `title` _(string,required)_ @@ -37,8 +37,7 @@ A set of standardized instruments linked to all variables within the `fields` pr ## `fields` _(array,required)_ -Variable level metadata individual fields integrated into the variable level -metadata object within the HEAL platform metadata service. + !!! note "Highly encouraged" diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index 42a17e6..225313f 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -2,7 +2,7 @@ "$id": vlmd title: Variable Level Metadata (Data Dictionaries) description: This schema defines the variable level metadata for one data dictionary - for a given study.Note a given study can have multiple data dictionaries + for a given study.Note a given study can have multiple data dictionaries. 
type: object required: - title diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 051372f..603654d 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -1,8 +1,7 @@ title: HEAL Variable Level Metadata Fields description: | - Variable level metadata individual fields integrated into the variable level - metadata object within the HEAL platform metadata service. - + + !!! note "Highly encouraged" Only `name` and `description` properties are required. diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index d154868..71d49fd 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -1,6 +1,6 @@ { "version": "0.2.0", - "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. 
\n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "title": "HEAL Variable Level Metadata Fields", "fields": [ { @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "string", - "any", - "duration", + "geopoint", + "date", "integer", - "datetime", - "time", "boolean", - "yearmonth", + "duration", + "datetime", "number", + "string", + "time", + "any", "year", - "date", - "geopoint" + "yearmonth" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index a7e3369..7aeacd8 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -1,7 +1,7 @@ { "version": "0.2.0", "title": "HEAL Variable Level Metadata Fields", - "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. 
\n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", "required": [ "name", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 71c217b..31e5e15 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -3,7 +3,7 @@ "$schema": "http://json-schema.org/draft-07/schema#", "$id": "vlmd", "title": "Variable Level Metadata (Data Dictionaries)", - "description": "This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries", + "description": "This schema defines the variable level metadata for one data dictionary for a given study.Note a given study can have multiple data dictionaries.", "type": "object", "required": [ "title", @@ -82,7 +82,7 @@ "type": "array", "items": { "title": "HEAL Variable Level Metadata Fields", - "description": "Variable level metadata individual fields integrated into the variable level\nmetadata object within the HEAL platform metadata service.\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. 
\n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", "required": [ "name", From 5d420fbd069841331f5cc53f832cee7b04866bef Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 10:14:26 -0600 Subject: [PATCH 53/72] del relatedConcepts (to avoid confusion with standardsMappings for now and the HEAL semantic search) --- .../schemas/dictionary/fields.yaml | 44 +------------------ 1 file changed, 2 insertions(+), 42 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 603654d..1d3b156 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -1,7 +1,7 @@ title: HEAL Variable Level Metadata Fields description: | - + !!! note "Highly encouraged" Only `name` and `description` properties are required. @@ -237,44 +237,4 @@ properties: - ["Not required","NOT REQUIRED"] - ["No"] standardsMappings: - $ref: "#/definitions/fieldStandardsMappingsItem" - relatedConcepts: - title: Related Concepts - description: | - __**[Under development]**__ Mappings to a published set of concepts related to the given field such as - ontological information (eg., NCI thesaurus, bioportal etc) - type: array - items: - type: object - properties: - url: - title: Related Concepts - Url - description: | - The url that links out to the published, standardized concept. 
- type: string - format: uri - type: - title: Related concepts - Type - description: | - The **type** of mapping to a published set of concepts related to the given field such as - ontological information (eg., NCI thesaurus, bioportal etc) - type: string - label: - type: string - title: Related Concepts - Label - description: | - A free text **label** of mapping to a published set of concepts related to the given field such as - ontological information (eg., NCI thesaurus, bioportal etc) - - source: - title: Related Concepts - Source - description: | - The source of the related concept. - type: string - examples: - - TBD (will have controlled vocabulary) - id: - title: Related Concepts - Id - type: string - description: | - The id locating the individual mapping within the given source. \ No newline at end of file + $ref: "#/definitions/fieldStandardsMappingsItem" \ No newline at end of file From a2581ce97cdafb9ed655fecf167240db57c7be81 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 10:16:10 -0600 Subject: [PATCH 54/72] update build --- .../jsonschema-csvtemplate-fields.html | 3 +- ...onschema-jsontemplate-data-dictionary.html | 3 +- .../jsonschema-csvtemplate-fields.md | 28 ------------ ...jsonschema-jsontemplate-data-dictionary.md | 5 --- .../frictionless/csvtemplate/fields.json | 45 +++---------------- .../jsonschema/csvtemplate/fields.json | 29 ------------ .../schemas/jsonschema/data-dictionary.json | 39 ---------------- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 9 ---- 9 files changed, 9 insertions(+), 154 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 90274f7..ab5f5e3 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ 
b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -24,5 +24,4 @@

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index a854600..8b77d36 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -126,5 +126,4 @@

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Each item of this array must be:

Type: object

Type: string Format: uri

The url that links out to the published, standardized concept.

Type: string

The type of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

A free text label of mapping to a published set of concepts related to the given field such as
ontological information (eg., NCI thesaurus, bioportal etc)

Type: string

The source of the related concept.


Example:

"TBD (will have controlled vocabulary)"
-

Type: string

The id locating the individual mapping within the given source.

\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index e309337..099d479 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -335,34 +335,6 @@ Examples: ``` -**`relatedConcepts[0].url`** _(string)_ - The url that links out to the published, standardized concept. - - -**`relatedConcepts[0].type`** _(string)_ - The **type** of mapping to a published set of concepts related to the given field such as -ontological information (eg., NCI thesaurus, bioportal etc) - - -**`relatedConcepts[0].label`** _(string)_ - A free text **label** of mapping to a published set of concepts related to the given field such as -ontological information (eg., NCI thesaurus, bioportal etc) - - -**`relatedConcepts[0].source`** _(string)_ - The source of the related concept. - -Examples: - - -``` - TBD (will have controlled vocabulary) - -``` - -**`relatedConcepts[0].id`** _(string)_ - The id locating the individual mapping within the given source. - ## End of schema - Additional Property information diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index bdb7f87..daff87d 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -394,11 +394,6 @@ Two separate records. 
If desired, multiple standard mappings can be entered, say ``` -**`relatedConcepts`** _(array)_ - __**[Under development]**__ Mappings to a published set of concepts related to the given field such as -ontological information (eg., NCI thesaurus, bioportal etc) - - ### Additional `fields` property information #### `type` enum definitions: diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 71d49fd..011eb58 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,17 +67,17 @@ "type": "string", "constraints": { "enum": [ + "time", "geopoint", + "any", "date", - "integer", - "boolean", - "duration", "datetime", + "year", "number", + "integer", + "duration", + "boolean", "string", - "time", - "any", - "year", "yearmonth" ] } @@ -253,39 +253,6 @@ "C74457" ], "type": "string" - }, - { - "name": "relatedConcepts[0].url", - "description": "The url that links out to the published, standardized concept.\n", - "title": "Related Concepts - Url", - "type": "string" - }, - { - "name": "relatedConcepts[0].type", - "description": "The **type** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", - "title": "Related concepts - Type", - "type": "string" - }, - { - "name": "relatedConcepts[0].label", - "description": "A free text **label** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", - "title": "Related Concepts - Label", - "type": "string" - }, - { - "name": "relatedConcepts[0].source", - "description": "The source of the related concept.\n", - "title": "Related Concepts - Source", - "examples": [ - "TBD (will have controlled vocabulary)" - ], - "type": 
"string" - }, - { - "name": "relatedConcepts[0].id", - "description": "The id locating the individual mapping within the given source.", - "title": "Related Concepts - Id", - "type": "string" } ], "missingValues": [ diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 7aeacd8..62bd1e9 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -218,35 +218,6 @@ "examples": [ "C74457" ] - }, - "relatedConcepts[0].url": { - "title": "Related Concepts - Url", - "description": "The url that links out to the published, standardized concept.\n", - "type": "string", - "format": "uri" - }, - "relatedConcepts[0].type": { - "title": "Related concepts - Type", - "description": "The **type** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", - "type": "string" - }, - "relatedConcepts[0].label": { - "type": "string", - "title": "Related Concepts - Label", - "description": "A free text **label** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n" - }, - "relatedConcepts[0].source": { - "title": "Related Concepts - Source", - "description": "The source of the related concept.\n", - "type": "string", - "examples": [ - "TBD (will have controlled vocabulary)" - ] - }, - "relatedConcepts[0].id": { - "title": "Related Concepts - Id", - "type": "string", - "description": "The id locating the individual mapping within the given source." 
} } } \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 31e5e15..5a490ab 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -361,45 +361,6 @@ } } } - }, - "relatedConcepts": { - "title": "Related Concepts", - "description": "__**[Under development]**__ Mappings to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", - "type": "array", - "items": { - "type": "object", - "properties": { - "url": { - "title": "Related Concepts - Url", - "description": "The url that links out to the published, standardized concept.\n", - "type": "string", - "format": "uri" - }, - "type": { - "title": "Related concepts - Type", - "description": "The **type** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", - "type": "string" - }, - "label": { - "type": "string", - "title": "Related Concepts - Label", - "description": "A free text **label** of mapping to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n" - }, - "source": { - "title": "Related Concepts - Source", - "description": "The source of the related concept.\n", - "type": "string", - "examples": [ - "TBD (will have controlled vocabulary)" - ] - }, - "id": { - "title": "Related Concepts - Id", - "type": "string", - "description": "The id locating the individual mapping within the given source." 
- } - } - } } } } diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index 10d8f50..fb44378 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].type,relatedConcepts[0].label,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file +schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index bd0d36b..86b8685 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -52,15 +52,6 @@ "id": null } } - ], - "relatedConcepts": [ - { - "url": null, - "type": null, - "label": null, - "source": null, - "id": null - } ] } ] From 
9a760106af53a5a0630dd28205c82d841d7e266d Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 21:49:29 -0600 Subject: [PATCH 55/72] minor formatting changes --- variable-level-metadata-schema/README.md | 32 +++++++++---------- .../schemas/dictionary/fields.yaml | 8 ++--- 2 files changed, 19 insertions(+), 21 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 17dca71..1360799 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -27,32 +27,30 @@ This metadata directory contains the specifications for variable level metadata ```mermaid -%%{init: {"flowchart": {"defaultRenderer": "elk","htmlLabels": false}} }%% -flowchart TD - subgraph "/schemas" - subgraph dictionary[Dictionary YAML files] + %%{init: {"flowchart": {"defaultRenderer": "elk","htmlLabels": false}} }%% - defs["/dictionary/definitions.yaml"] - fields["/dictionary/fields.yaml"] - dd["/dictionary/data-dictionary.yaml"] - end + flowchart TD - subgraph Schema specifications + subgraph dictionary[Dictionary YAML files] - jsonspec["/jsontemplate/data-dictionary.json"] - csvspec["/jsontemplate/csvtemplate/fields.json"] - csvtblspec["/frictionless/csvtemplate/fields.json"] - end - end + defs["schemas/dictionary/definitions.yaml"] + fields["schemas/dictionary/fields.yaml"] + dd["schemas/dictionary/data-dictionary.yaml"] + end + + subgraph Schema specifications + + jsonspec["schema/jsontemplate/data-dictionary.json"] + csvspec["schema/jsontemplate/csvtemplate/fields.json"] + csvtblspec["schema/frictionless/csvtemplate/fields.json"] + end - subgraph /docs subgraph "Rendered schema documentation \n(html also available)" csvmd["/docs/\nmd-rendered-schemas/\njsonschema-csvtemplate-fields.md"] jsonmd["/docs/\nmd-rendered-schemas/\njsonschema-jsontemplate-data-dictionary.md"] end - end defs --> fields --> dd defs --> dd @@ -186,7 +184,7 @@ More concretely, this provides a way to specify root 
level properties within vlm 1. specifying the schema version that represents the vlmd document (`schemaVersion`) 2. specifying other data dictionary level properties such as `standardsMappings[0].instrument` -### root and field property cascading pattern +### root ("data dictionary level") and field property cascading pattern Akin to the above json to csv, more generally: All root level properties will be applied to individual fields IF this same field level property is not specified (i.e., field-level takes precedence). This strategy can be seen in the [data package standard (but with missingValues)](https://specs.frictionlessdata.io/patterns/#missing-values-per-field) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 1d3b156..9fe8686 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -4,10 +4,10 @@ description: | !!! note "Highly encouraged" - Only `name` and `description` properties are required. - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) + - Only `name` and `description` properties are required. + - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. + - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. + - `type` and `format` properties may be particularly useful for some variable types (e.g. 
date-like variables) type: object required: - name From ca23120a38b121928534b0fb5d689f5e8f4c6014 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 16 Jan 2024 21:51:55 -0600 Subject: [PATCH 56/72] Update build --- .../jsonschema-csvtemplate-fields.html | 4 ++-- .../jsonschema-jsontemplate-data-dictionary.html | 4 ++-- .../jsonschema-csvtemplate-fields.md | 8 ++++---- .../jsonschema-jsontemplate-data-dictionary.md | 8 ++++---- .../schemas/frictionless/csvtemplate/fields.json | 16 ++++++++-------- .../schemas/jsonschema/csvtemplate/fields.json | 2 +- .../schemas/jsonschema/data-dictionary.json | 2 +- 7 files changed, 22 insertions(+), 22 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index ab5f5e3..9d25853 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -1,4 +1,4 @@ - HEAL Variable Level Metadata Fields

HEAL Variable Level Metadata Fields

Type: object

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+ HEAL Variable Level Metadata Fields 

HEAL Variable Level Metadata Fields

Type: object

!!! note "Highly encouraged"

  • Only name and description properties are required.
  • For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
  • For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
  • type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
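
The version pattern above can be checked mechanically. A minimal sketch, assuming Python's standard `re` module; the helper names are hypothetical and not part of the HEAL tooling. Note that a JSON Schema `pattern` is not implicitly anchored, so a schema validator would also accept strings that merely contain a match, whereas `fullmatch` below requires the whole string to be `major.minor.patch`.

```python
import re

# Pattern copied from the schema text; the helper names are hypothetical.
SCHEMA_VERSION_PATTERN = re.compile(r"\d+\.\d+\.\d+")


def is_valid_schema_version(value: str) -> bool:
    """Return True if the whole string looks like major.minor.patch."""
    return SCHEMA_VERSION_PATTERN.fullmatch(value) is not None


def contains_schema_version(value: str) -> bool:
    """Unanchored check, closer to how a JSON Schema `pattern` is applied."""
    return SCHEMA_VERSION_PATTERN.search(value) is not None
```

The strict helper is probably the safer choice when filling the `schemaVersion` column of a csv submission, since every row is expected to carry the same bare version string.
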
Examples:

"1.0.0"
 
"0.2.0"
 

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
@@ -24,4 +24,4 @@
 

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.
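
The dependency described above (a `source` must accompany any `id`) can be checked on a flattened csv record. A rough sketch, assuming each record is a plain dict keyed by the csv template's column names `standardsMappings[0].item.id` and `standardsMappings[0].item.source`; the function name is hypothetical and this is not the HEAL validation code.

```python
# Column names as they appear in the csv submission template header.
ID_KEY = "standardsMappings[0].item.id"
SOURCE_KEY = "standardsMappings[0].item.source"


def missing_source_for_id(record: dict) -> bool:
    """Return True when an id is given but its source is empty or absent."""
    has_id = bool((record.get(ID_KEY) or "").strip())
    has_source = bool((record.get(SOURCE_KEY) or "").strip())
    return has_id and not has_source
```

Blank or whitespace-only cells are treated as absent here, which is an assumption about how empty csv cells should be read.
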


Example:

"C74457"
-
\ No newline at end of file +
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 8b77d36..5a08ba0 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -4,7 +4,7 @@

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
 

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system


Example:

"5141"
-

Type: array of object

Each item of this array must be:

Type: object

!!! note "Highly encouraged"

Only name and description properties are required.
For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+

Type: array of object

Each item of this array must be:

Type: object

!!! note "Highly encouraged"

  • Only name and description properties are required.
  • For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
  • For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
  • type and format properties may be particularly useful for some variable types (e.g. date-like variables)
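
As a rough illustration of the note above, the sketch below builds one field record that carries the two required properties plus the encouraged ones for a categorical variable. The variable name, labels, and the key-to-label shape of `enumLabels` are assumptions made for this example, not values taken from the schema; `missing_required` is a hypothetical helper.

```python
import json

# Hypothetical field record; the name, description and labels are invented.
# Only `name` and `description` are required by the schema.
field = {
    "name": "pain_level",
    "description": "Self-reported pain level at baseline.",
    "type": "string",
    "constraints": {"enum": ["low", "medium", "high"]},
    # Assumed shape: one label per enum value.
    "enumLabels": {"low": "Low", "medium": "Medium", "high": "High"},
}

REQUIRED = ("name", "description")


def missing_required(record):
    """Return the required property names that are absent or empty."""
    return [key for key in REQUIRED if not record.get(key)]


print(missing_required(field))            # []
print(missing_required({"name": "age"}))  # ['description']
print(json.dumps(field, indent=2))
```

A record that passes this check is only minimally valid; the full JSON schema remains the authority on types and constraints.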

Type: string

The version of the schema used, in the agreed-upon major.minor.patch convention (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the version of THIS schema document. See the version property (below)
to specify the version of an individual data dictionary instance.

If generating a VLMD document as a CSV file, include this version in
every row/record to indicate that it is a schema-level property
(not applicable to the JSON version, where this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
 
"0.2.0"
 

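The `schemaVersion` pattern above can be checked with a few lines of Python. One point worth keeping in mind: JSON Schema `pattern` keywords are not implicitly anchored, so a validator would accept any string that merely contains a version-like run. The sketch below shows both behaviours; `is_schema_version` and its `strict` flag are hypothetical names introduced for this example.

```python
import re

# Same expression as in the schema; note that it has no ^ or $ anchors.
SCHEMA_VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_schema_version(value, strict=True):
    """Check a schemaVersion string.

    strict=True requires the whole string to be major.minor.patch.
    strict=False mimics an unanchored JSON Schema `pattern` (substring search).
    """
    if strict:
        return SCHEMA_VERSION.fullmatch(value) is not None
    return SCHEMA_VERSION.search(value) is not None


print(is_schema_version("0.2.0"))                      # True
print(is_schema_version("1.0"))                        # False
print(is_schema_version("v1.0.0-beta"))                # False
print(is_schema_version("v1.0.0-beta", strict=False))  # True
```

If stricter validation is wanted in the schema itself, anchoring the expression with `^` and `$` would be one option.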
Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
@@ -126,4 +126,4 @@
 

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

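The `source`/`id` dependency described for item mappings (a `source` must accompany any `id`) can be sketched as a small check. This is a minimal illustration, assuming each item mapping is a plain dictionary with optional `url`, `source`, and `id` keys; `mapping_errors` is a hypothetical helper, not part of the schema tooling.

```python
def mapping_errors(item):
    """Return a list of problems found in one standardsMappings item."""
    errors = []
    # An id only makes sense relative to the source that issued it.
    if item.get("id") and not item.get("source"):
        errors.append("`source` is required when `id` is specified")
    return errors


print(mapping_errors({"source": "CDISC", "id": "C74457"}))  # []
print(mapping_errors({"id": "C74457"}))
# ['`source` is required when `id` is specified']
```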
Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 099d479..a784c50 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -14,10 +14,10 @@ The aim of this HEAL metadata piece is to track and provide basic information ab !!! note "Highly encouraged" - Only `name` and `description` properties are required. - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) + - Only `name` and `description` properties are required. + - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. + - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. + - `type` and `format` properties may be particularly useful for some variable types (e.g. 
date-like variables) ## Properties (i.e., fields or variables) diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index daff87d..8ef83bd 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -41,10 +41,10 @@ A set of standardized instruments linked to all variables within the `fields` pr !!! note "Highly encouraged" - Only `name` and `description` properties are required. - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) + - Only `name` and `description` properties are required. + - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. + - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged. + - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables) ### Properties for each `fields` record diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 011eb58..6b900b0 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -1,6 +1,6 @@ { "version": "0.2.0", - "description": "\n\n!!! 
note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "\n\n!!! note \"Highly encouraged\"\n\n - Only `name` and `description` properties are required. \n - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "title": "HEAL Variable Level Metadata Fields", "fields": [ { @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "time", + "duration", + "yearmonth", "geopoint", + "datetime", "any", "date", - "datetime", - "year", + "string", "number", "integer", - "duration", - "boolean", - "string", - "yearmonth" + "year", + "time", + "boolean" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 62bd1e9..72b8bb1 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -1,7 +1,7 @@ { "version": "0.2.0", "title": "HEAL Variable Level Metadata Fields", - "description": "\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. 
\n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "\n\n!!! note \"Highly encouraged\"\n\n - Only `name` and `description` properties are required. \n - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", "required": [ "name", diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 5a490ab..979fbf1 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -82,7 +82,7 @@ "type": "array", "items": { "title": "HEAL Variable Level Metadata Fields", - "description": "\n\n!!! note \"Highly encouraged\"\n\n Only `name` and `description` properties are required. \n For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. \n For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", + "description": "\n\n!!! note \"Highly encouraged\"\n\n - Only `name` and `description` properties are required. \n - For categorical variables, `constraints.enum` and `enumLabels` (where applicable) properties are highly encouraged. 
\n - For studies using HEAL or other common data elements (CDEs), `standardsMappings` information is highly encouraged.\n - `type` and `format` properties may be particularly useful for some variable types (e.g. date-like variables)\n", "type": "object", "required": [ "name", From 15e597e56d7d51e819054773bc115f417b206eec Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 10:34:26 -0600 Subject: [PATCH 57/72] added tbl and field lvl 'extra' properties --- variable-level-metadata-schema/README.md | 18 +++++++++++++++++- .../schemas/dictionary/data-dictionary.yaml | 13 ++++++++++++- .../schemas/dictionary/fields.yaml | 10 +++++++++- 3 files changed, 38 insertions(+), 3 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 1360799..90200c2 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -197,4 +197,20 @@ have the same stem name with corresponding "csv" and "json" suffixes (eg `my-hea ## Considerations -Please use github issues for any additional considerations. See additional comments above. \ No newline at end of file +Please use github issues for any additional considerations. See additional comments above. + + +## Additional table-level (root) and field-level properties + +Some table-level or field-level properties in other standards (or custom properties in specific use cases) do not map onto +a core HEAL property. To allow these properties to be included, we list these property names under `propertyNames`. + + ❗ For study or use case specific names, it is recommended to put the property under a `custom` namespace (e.g., `"custom":{"myvarname"})`. Adding additional properties here are for well established standards and/or property names used in practice. + + ☝️ The use of [`propertyNames`](https://json-schema.org/draft-07/json-schema-validation#rfc.section.6.5.8) was used to: + + 1. 
allow inclusion and minimal validation of these extra properties (ie of only the existence of property names) without making any assumptions about corresponding property types. + 2. It also provides a clear distinction between "core" properties and "extra" properties. + + One consideration, however, is that `propertyNames` was introduced in json schema draft-6. + diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index 225313f..f413a81 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -23,4 +23,15 @@ properties: fields: type: array items: - $ref: "#/fields" \ No newline at end of file + $ref: "#/fields" + + +propertyNames: + description: | + Additional properties for compatibility with other standards at the "table" , or root, but not included in the core `properties` set: + + [Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys` + enum: + - missingValues + - primaryKey + - foreignKeys diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 9fe8686..fd550c1 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -237,4 +237,12 @@ properties: - ["Not required","NOT REQUIRED"] - ["No"] standardsMappings: - $ref: "#/definitions/fieldStandardsMappingsItem" \ No newline at end of file + $ref: "#/definitions/fieldStandardsMappingsItem" + + +propertyNames: + enum: + - custom + description: | + Additional properties for compatibility with other standards or common properties that are not included a core field property. 
+ Note, custom is included as this is the \ No newline at end of file From c28f05fff7f650e9e86c56380002ac02e01fa3de Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 10:39:41 -0600 Subject: [PATCH 58/72] update build --- .../jsonschema-csvtemplate-fields.html | 2 +- .../jsonschema-jsontemplate-data-dictionary.html | 2 +- .../schemas/frictionless/csvtemplate/fields.json | 14 +++++++------- .../schemas/jsonschema/csvtemplate/fields.json | 6 ++++++ .../schemas/jsonschema/data-dictionary.json | 14 ++++++++++++++ 5 files changed, 29 insertions(+), 9 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 9d25853..9972763 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -24,4 +24,4 @@

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 5a08ba0..126d147 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -126,4 +126,4 @@

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 6b900b0..719a8b5 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ + "time", + "datetime", "duration", - "yearmonth", "geopoint", - "datetime", - "any", - "date", - "string", "number", "integer", + "string", + "any", + "yearmonth", "year", - "time", - "boolean" + "boolean", + "date" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index 72b8bb1..a4bfcca 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -219,5 +219,11 @@ "C74457" ] } + }, + "propertyNames": { + "enum": [ + "custom" + ], + "description": "Additional properties for compatibility with other standards or common properties that are not included a core field property. \nNote, custom is included as this is the" } } \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 979fbf1..8f6073b 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -362,8 +362,22 @@ } } } + }, + "propertyNames": { + "enum": [ + "custom" + ], + "description": "Additional properties for compatibility with other standards or common properties that are not included a core field property. 
\nNote, custom is included as this is the" } } } + }, + "propertyNames": { + "description": "Additional properties for compatibility with other standards at the \"table\" , or root, but not included in the core `properties` set:\n\n[Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys`\n", + "enum": [ + "missingValues", + "primaryKey", + "foreignKeys" + ] } } \ No newline at end of file From 73b865f92f94177392a659ecbec838600980a1e3 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 13:08:23 -0600 Subject: [PATCH 59/72] fix: patternNames is AND not OR --- .../schemas/dictionary/data-dictionary.yaml | 11 ++++++++++- .../schemas/dictionary/fields.yaml | 12 ++++-------- 2 files changed, 14 insertions(+), 9 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index f413a81..3ca153b 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -28,10 +28,19 @@ properties: propertyNames: description: | - Additional properties for compatibility with other standards at the "table" , or root, but not included in the core `properties` set: + To allow additional properties for compatibility with other standards at the "table" , or root, but not included in the core `properties` set: [Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys` enum: + # core properties + - title + - description + - schemaVersion + - version + - standardsMappings + # custom properties + - custom + # custom properties but a part of standards - missingValues - primaryKey - foreignKeys diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml 
b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index fd550c1..3373a7c 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -238,11 +238,7 @@ properties: - ["No"] standardsMappings: $ref: "#/definitions/fieldStandardsMappingsItem" - - -propertyNames: - enum: - - custom - description: | - Additional properties for compatibility with other standards or common properties that are not included a core field property. - Note, custom is included as this is the \ No newline at end of file + custom: + type: object + description: | + Additional properties not included a core field property. \ No newline at end of file From d79b87c69d8c1a743da6e527a71c29066f84f5eb Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 13:20:16 -0600 Subject: [PATCH 60/72] remove trueValue type constraint --- variable-level-metadata-schema/schemas/dictionary/fields.yaml | 2 -- 1 file changed, 2 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 3373a7c..3d58853 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -221,8 +221,6 @@ properties: readability of the field). It can include one or more values. 
type: array - items: - type: string examples: - ["required","Yes","Checked"] - ["required"] From 943f8c1c6cbd6e9309ede2a3168d09b2441cb15b Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 13:20:24 -0600 Subject: [PATCH 61/72] Update build --- .../jsonschema-csvtemplate-fields.html | 2 +- ...onschema-jsontemplate-data-dictionary.html | 4 +-- .../jsonschema-csvtemplate-fields.md | 3 +++ ...jsonschema-jsontemplate-data-dictionary.md | 3 +++ .../frictionless/csvtemplate/fields.json | 26 ++++++++++++------- .../jsonschema/csvtemplate/fields.json | 14 ++++------ .../schemas/jsonschema/data-dictionary.json | 21 ++++++++------- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 7 +++-- 9 files changed, 46 insertions(+), 36 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 9972763..92faab5 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -24,4 +24,4 @@

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file +

Type: string

Additional properties not included as a core field property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
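
The pattern above suggests that, in the CSV template, `custom` is written as pipe-delimited `key=value` pairs. The sketch below parses such a cell into a dictionary under that assumption; `parse_custom` and the sample keys are hypothetical. Note that the expression is fairly permissive (for example, it accepts `a=1|b`, where the last segment has no `=`), so the sketch adds its own per-segment check.

```python
import re

# Same expression as the `custom` constraint in the CSV template schema.
CUSTOM_PATTERN = re.compile(r"^(?:.*?=.*?(?:\||$))+$")


def parse_custom(cell):
    """Split a `custom` CSV cell such as 'unit=mg|origin=lab' into a dict."""
    if not CUSTOM_PATTERN.match(cell):
        raise ValueError(f"not a key=value list: {cell!r}")
    parsed = {}
    for pair in cell.split("|"):
        if not pair:
            continue  # tolerate a trailing "|"
        if "=" not in pair:
            # The schema pattern accepts this (e.g. 'a=1|b'), but it cannot be
            # split into a key and a value, so this sketch rejects it.
            raise ValueError(f"segment without '=': {pair!r}")
        key, _, value = pair.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed


print(parse_custom("unit=mg|origin=lab"))  # {'unit': 'mg', 'origin': 'lab'}
```

The resulting dictionary corresponds to the object form that `custom` takes in the JSON template.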
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 126d147..e86e3b2 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -47,7 +47,7 @@
[
     "Missing"
 ]
-

Type: array of string

For a boolean variable (as defined in the type field), this field lists the
physical string representations that are cast as true (increasing
readability of the field). It can include one or more values.

Each item of this array must be:


Examples:

[
+

Type: array

For a boolean variable (as defined in the type field), this field lists the
physical string representations that are cast as true (increasing
readability of the field). It can include one or more values.


Examples:

[
     "required",
     "Yes",
     "Checked"
@@ -126,4 +126,4 @@
 

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file +

Type: object

Additional properties not included as a core field property.
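
Because `custom` is the namespace for study-specific properties, a quick way to see the effect of the root-level `propertyNames` enum in this patch series is to list any top-level keys that fall outside it. The sketch below hand-rolls that check in plain Python rather than calling a JSON Schema validator; the allowed names are copied from the `data-dictionary.yaml` enum (including `fields`, added in the later "fix" commit), while `unexpected_root_names` and the sample documents are hypothetical.

```python
# Names permitted at the root by the `propertyNames` enum in this patch series.
ALLOWED_ROOT_NAMES = {
    # core properties
    "title", "description", "schemaVersion", "version",
    "standardsMappings", "fields",
    # custom namespace
    "custom",
    # extra names borrowed from the Frictionless table schema
    "missingValues", "primaryKey", "foreignKeys",
}


def unexpected_root_names(document):
    """Return root-level keys that the propertyNames enum would reject."""
    return sorted(set(document) - ALLOWED_ROOT_NAMES)


ok = {"title": "My dictionary", "fields": [], "custom": {"myvarname": 1}}
bad = {"title": "My dictionary", "fields": [], "myvarname": 1}

print(unexpected_root_names(ok))   # []
print(unexpected_root_names(bad))  # ['myvarname']
```

Only the names are checked here, which mirrors the stated intent of `propertyNames`: the existence of a property name is validated without assuming anything about its type.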

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index a784c50..0693519 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -335,6 +335,9 @@ Examples: ``` +**`custom`** _(string)_ + Additional properties not included a core field property. + ## End of schema - Additional Property information diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 8ef83bd..f4cbd9c 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -394,6 +394,9 @@ Two separate records. If desired, multiple standard mappings can be entered, say ``` +**`custom`** _(object)_ + Additional properties not included a core field property. 
+ ### Additional `fields` property information #### `type` enum definitions: diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 719a8b5..d5cde2c 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "time", - "datetime", - "duration", - "geopoint", + "yearmonth", "number", - "integer", - "string", + "geopoint", "any", - "yearmonth", + "date", "year", - "boolean", - "date" + "duration", + "time", + "datetime", + "string", + "integer", + "boolean" ] } }, @@ -253,6 +253,14 @@ "C74457" ], "type": "string" + }, + { + "name": "custom", + "description": "Additional properties not included a core field property. ", + "type": "string", + "constraints": { + "pattern": "^(?:.*?=.*?(?:\\||$))+$" + } } ], "missingValues": [ diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index a4bfcca..e68b71e 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -142,9 +142,6 @@ "title": "Boolean True Value Labels", "description": "For boolean (true) variable (as defined in type field), this field allows\na physical string representation to be cast as true (increasing\nreadability of the field). It can include one or more values.\n", "type": "string", - "items": { - "type": "string" - }, "examples": [ "required|Yes|Checked", "required" @@ -218,12 +215,11 @@ "examples": [ "C74457" ] + }, + "custom": { + "type": "string", + "description": "Additional properties not included a core field property. 
", + "pattern": "^(?:.*?=.*?(?:\\||$))+$" } - }, - "propertyNames": { - "enum": [ - "custom" - ], - "description": "Additional properties for compatibility with other standards or common properties that are not included a core field property. \nNote, custom is included as this is the" } } \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 8f6073b..4b247d2 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -253,9 +253,6 @@ "title": "Boolean True Value Labels", "description": "For boolean (true) variable (as defined in type field), this field allows\na physical string representation to be cast as true (increasing\nreadability of the field). It can include one or more values.\n", "type": "array", - "items": { - "type": "string" - }, "examples": [ [ "required", @@ -361,20 +358,24 @@ } } } + }, + "custom": { + "type": "object", + "description": "Additional properties not included a core field property. " } - }, - "propertyNames": { - "enum": [ - "custom" - ], - "description": "Additional properties for compatibility with other standards or common properties that are not included a core field property. 
\nNote, custom is included as this is the" } } } }, "propertyNames": { - "description": "Additional properties for compatibility with other standards at the \"table\" , or root, but not included in the core `properties` set:\n\n[Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys`\n", + "description": "To allow additional properties for compatibility with other standards at the \"table\" , or root, but not included in the core `properties` set:\n\n[Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys`\n", "enum": [ + "title", + "description", + "schemaVersion", + "version", + "standardsMappings", + "custom", "missingValues", "primaryKey", "foreignKeys" diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index fb44378..3d0b94d 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id \ No newline at end of file 
+schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,custom \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 86b8685..be01251 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -34,9 +34,7 @@ "enumLabels": {}, "enumOrdered": null, "missingValues": [], - "trueValues": [ - {} - ], + "trueValues": [], "falseValues": [], "standardsMappings": [ { @@ -52,7 +50,8 @@ "id": null } } - ] + ], + "custom": {} } ] } From 6345a72fd8875c0a04365eea3fb527e537d044e1 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 13:22:40 -0600 Subject: [PATCH 62/72] fix --- .../schemas/dictionary/data-dictionary.yaml | 1 + 1 file changed, 1 insertion(+) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index 3ca153b..3306bd7 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -38,6 +38,7 @@ propertyNames: - schemaVersion - version - standardsMappings + - fields # custom properties - custom # custom properties but a part of standards From cad5bec7fd7e066da996aae6a0b28bc7cc8ed0d6 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Mon, 22 Jan 2024 13:22:58 -0600 Subject: [PATCH 63/72] update build --- 
.../jsonschema-csvtemplate-fields.html | 2 +- .../jsonschema-jsontemplate-data-dictionary.html | 2 +- .../schemas/frictionless/csvtemplate/fields.json | 16 ++++++++-------- .../schemas/jsonschema/data-dictionary.json | 1 + 4 files changed, 11 insertions(+), 10 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 92faab5..4c64c0b 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -24,4 +24,4 @@

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: string

Additional properties not included a core field property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
\ No newline at end of file +

Type: string

Additional properties not included a core field property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
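As a rough illustration of what the `custom` pattern above accepts in the CSV template, the sketch below checks a value against the regular expression and splits it into key/value pairs. The `parse_custom` helper and the sample keys (`unit`, `site`) are hypothetical and not part of the schema repository; only the pattern itself is taken from the rendered schema.

```python
import re

# Pattern copied from the rendered csvtemplate schema for the `custom` column.
CUSTOM_PATTERN = re.compile(r"^(?:.*?=.*?(?:\||$))+$")


def parse_custom(value: str) -> dict:
    """Hypothetical helper: turn 'key=value|key=value' into a dict."""
    if CUSTOM_PATTERN.match(value) is None:
        raise ValueError(f"{value!r} does not match the custom pattern")
    result = {}
    for segment in value.split("|"):
        if not segment:
            continue  # tolerate a trailing pipe such as 'a=1|'
        key, sep, val = segment.partition("=")
        if not sep:
            # The pattern alone lets a segment like 'b' slip through.
            raise ValueError(f"segment {segment!r} has no '='")
        result[key.strip()] = val.strip()
    return result


print(parse_custom("unit=mg|site=A"))
```

Because `.*?` can also consume `|`, the pattern by itself does not guarantee that every pipe-delimited segment contains `=`, which is why the helper re-checks each segment.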
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index e86e3b2..98fc0c5 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -126,4 +126,4 @@

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: object

Additional properties not included a core field property.

\ No newline at end of file +

Type: object

Additional properties not included a core field property.

\ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index d5cde2c..b9afb9f 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ + "string", + "integer", + "time", + "any", "yearmonth", + "datetime", + "year", + "boolean", "number", "geopoint", - "any", "date", - "year", - "duration", - "time", - "datetime", - "string", - "integer", - "boolean" + "duration" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 4b247d2..4a5f3aa 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -375,6 +375,7 @@ "schemaVersion", "version", "standardsMappings", + "fields", "custom", "missingValues", "primaryKey", From 46418da1fb8e01804daba3dee7aa7fa5e7c09666 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 09:31:19 -0600 Subject: [PATCH 64/72] add custom property to dd level --- .../schemas/dictionary/data-dictionary.yaml | 6 ++++-- .../schemas/dictionary/fields.yaml | 2 +- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index 3306bd7..bc9fd46 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -20,12 +20,14 @@ properties: description: The specified individual data dictionary instance version. 
standardsMappings: $ref: "#/definitions/rootStandardsMappingsItem" + custom: + type: object + description: | + Additional properties not included as a core property. fields: type: array items: $ref: "#/fields" - - propertyNames: description: | To allow additional properties for compatibility with other standards at the "table" , or root, but not included in the core `properties` set: diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 3d58853..058e829 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -239,4 +239,4 @@ properties: custom: type: object description: | - Additional properties not included a core field property. \ No newline at end of file + Additional properties not included a core property. \ No newline at end of file From d14cd73d41cd683fb482d06bc79a323abd4ed3fd Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 09:31:37 -0600 Subject: [PATCH 65/72] Update build --- .../jsonschema-csvtemplate-fields.html | 2 +- ...sonschema-jsontemplate-data-dictionary.html | 4 ++-- .../jsonschema-csvtemplate-fields.md | 2 +- .../jsonschema-jsontemplate-data-dictionary.md | 5 ++++- .../frictionless/csvtemplate/fields.json | 18 +++++++++--------- .../schemas/jsonschema/csvtemplate/fields.json | 2 +- .../schemas/jsonschema/data-dictionary.json | 6 +++++- .../templates/template_submission.json | 1 + 8 files changed, 24 insertions(+), 16 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 4c64c0b..7e75a7c 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ 
b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -24,4 +24,4 @@

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: string

Additional properties not included a core field property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
\ No newline at end of file +

Type: string

Additional properties not included a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 98fc0c5..98abe97 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -4,7 +4,7 @@

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
 

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system


Example:

"5141"
-

Type: array of object

Each item of this array must be:

Type: object

!!! note "Highly encouraged"

  • Only name and description properties are required.
  • For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
  • For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
  • type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
+

Type: object

Additional properties not included as a core property.

Type: array of object

Each item of this array must be:

Type: object

!!! note "Highly encouraged"

  • Only name and description properties are required.
  • For categorical variables, constraints.enum and enumLabels (where applicable) properties are highly encouraged.
  • For studies using HEAL or other common data elements (CDEs), standardsMappings information is highly encouraged.
  • type and format properties may be particularly useful for some variable types (e.g. date-like variables)

Type: string

The version of the schema used in agreed upon convention of major.minor.patch (e.g., 1.0.2)

NOTE: This is NOT for versioning of each individual data dictionary instance.
Rather, it is the
version of THIS schema document. See version property (below) if specifying the individual data dictionary instance
version.

If generating a vlmd document as a csv file, include this version in
every row/record to indicate this is a schema level property
(not applicable for the json version as this property is already at the schema/root level)

Must match regular expression: \d+\.\d+\.\d+
Examples:

"1.0.0"
 
"0.2.0"
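A minimal sketch of how the `schemaVersion` pattern above behaves, assuming a validator that follows JSON Schema's unanchored `pattern` semantics. The `is_schema_version` helper and its `strict` flag are hypothetical names used only for illustration.

```python
import re

# Pattern copied from the rendered schema for `schemaVersion`.
SCHEMA_VERSION = re.compile(r"\d+\.\d+\.\d+")


def is_schema_version(value: str, strict: bool = False) -> bool:
    """Hypothetical check of a schemaVersion string.

    strict=False mimics JSON Schema `pattern`, which only requires the
    expression to match somewhere in the string.
    strict=True requires the whole string to be major.minor.patch.
    """
    if strict:
        return SCHEMA_VERSION.fullmatch(value) is not None
    return SCHEMA_VERSION.search(value) is not None


for candidate in ["1.0.0", "0.2.0", "1.0", "v1.0.2-rc1"]:
    print(candidate, is_schema_version(candidate), is_schema_version(candidate, strict=True))
```

Since the pattern carries no `^` or `$` anchors, a value such as `v1.0.2-rc1` would pass the unanchored check; whether that is intended is not stated in the schema.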
 

Type: string

The section, form, survey instrument, set of measures or other broad category used
to group variables. Previously called "module."


Examples:

"Demographics"
 
"PROMIS"
@@ -126,4 +126,4 @@
 

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: object

Additional properties not included a core field property.

\ No newline at end of file +

Type: object

Additional properties not included a core property.

\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index 0693519..ef18056 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -336,7 +336,7 @@ Examples: ``` **`custom`** _(string)_ - Additional properties not included a core field property. + Additional properties not included a core property. ## End of schema - Additional Property information diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index f4cbd9c..1df660a 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -35,6 +35,9 @@ A set of standardized instruments linked to all variables within the `fields` pr easier to understand in the same way other standards implement cascading (e.g., `missingValues` in the [frictionless specification](https://specs.frictionlessdata.io/patterns/#missing-values-per-field)) +## `custom` _(object)_ +Additional properties not included as a core property. + ## `fields` _(array,required)_ @@ -395,7 +398,7 @@ Two separate records. If desired, multiple standard mappings can be entered, say **`custom`** _(object)_ - Additional properties not included a core field property. + Additional properties not included a core property. 
### Additional `fields` property information diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index b9afb9f..1d6a76c 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "string", - "integer", - "time", - "any", "yearmonth", - "datetime", - "year", "boolean", - "number", "geopoint", "date", - "duration" + "number", + "year", + "any", + "datetime", + "duration", + "integer", + "string", + "time" ] } }, @@ -256,7 +256,7 @@ }, { "name": "custom", - "description": "Additional properties not included a core field property. ", + "description": "Additional properties not included a core property. ", "type": "string", "constraints": { "pattern": "^(?:.*?=.*?(?:\\||$))+$" diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index e68b71e..ab6ee2a 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -218,7 +218,7 @@ }, "custom": { "type": "string", - "description": "Additional properties not included a core field property. ", + "description": "Additional properties not included a core property. 
", "pattern": "^(?:.*?=.*?(?:\\||$))+$" } } diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 4a5f3aa..a44e02c 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -78,6 +78,10 @@ } } }, + "custom": { + "type": "object", + "description": "Additional properties not included as a core property. \n" + }, "fields": { "type": "array", "items": { @@ -361,7 +365,7 @@ }, "custom": { "type": "object", - "description": "Additional properties not included a core field property. " + "description": "Additional properties not included a core property. " } } } diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index be01251..5a7c0c6 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -14,6 +14,7 @@ } } ], + "custom": {}, "fields": [ { "schemaVersion": null, From 877dfbc5d3b5c074655ebb71dd9a0d5057ca1d44 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 09:37:36 -0600 Subject: [PATCH 66/72] Update README --- variable-level-metadata-schema/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index 90200c2..acd7474 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -207,7 +207,7 @@ a core HEAL property. To allow these properties to be included, we list these pr ❗ For study or use case specific names, it is recommended to put the property under a `custom` namespace (e.g., `"custom":{"myvarname"})`. 
Adding additional properties here are for well established standards and/or property names used in practice. - ☝️ The use of [`propertyNames`](https://json-schema.org/draft-07/json-schema-validation#rfc.section.6.5.8) was used to: + ☝️ At the root level, [`propertyNames`](https://json-schema.org/draft-07/json-schema-validation#rfc.section.6.5.8) was used to: 1. allow inclusion and minimal validation of these extra properties (ie of only the existence of property names) without making any assumptions about corresponding property types. 2. It also provides a clear distinction between "core" properties and "extra" properties. From 2481437811406230e53779b7d57eb5f8bf339dc2 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 09:47:47 -0600 Subject: [PATCH 67/72] add underscore to designate definitions are more for internal process --- variable-level-metadata-schema/build.py | 2 -- .../dictionary/{definitions.yaml => _definitions.yaml} | 4 ++-- .../schemas/dictionary/data-dictionary.yaml | 6 +++--- .../schemas/dictionary/fields.yaml | 4 ++-- 4 files changed, 7 insertions(+), 9 deletions(-) rename variable-level-metadata-schema/schemas/dictionary/{definitions.yaml => _definitions.yaml} (98%) diff --git a/variable-level-metadata-schema/build.py b/variable-level-metadata-schema/build.py index 89447aa..8b8eb74 100644 --- a/variable-level-metadata-schema/build.py +++ b/variable-level-metadata-schema/build.py @@ -24,8 +24,6 @@ def load_yaml(filepath): yamlfile = yaml.safe_load(f) return yamlfile - -test = load_yaml("schemas/dictionary/definitions.yaml") # load all yamls def load_all_yamls(directory="schemas/dictionary"): filepaths = Path(directory).glob("*.yaml") diff --git a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/_definitions.yaml similarity index 98% rename from variable-level-metadata-schema/schemas/dictionary/definitions.yaml rename to 
variable-level-metadata-schema/schemas/dictionary/_definitions.yaml index 1115c45..0914c33 100644 --- a/variable-level-metadata-schema/schemas/dictionary/definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/_definitions.yaml @@ -90,7 +90,7 @@ rootStandardsMappingsItem: properties: type: object instrument: - $ref: "#/definitions/standardsMappingsInstrumentObject" + $ref: "#/_definitions/standardsMappingsInstrumentObject" fieldStandardsMappingsItem: @@ -194,7 +194,7 @@ fieldStandardsMappingsItem: type: object properties: instrument: - $ref: "#/definitions/standardsMappingsInstrumentObject" + $ref: "#/_definitions/standardsMappingsInstrumentObject" item: diff --git a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml index bc9fd46..b99ab2f 100644 --- a/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/data-dictionary.yaml @@ -13,13 +13,13 @@ properties: description: type: string schemaVersion: - $ref: "#/definitions/schemaVersion" + $ref: "#/_definitions/schemaVersion" version: # TODO: think about having a version text/message and id (akin to a git commit) type: string description: The specified individual data dictionary instance version. 
standardsMappings: - $ref: "#/definitions/rootStandardsMappingsItem" + $ref: "#/_definitions/rootStandardsMappingsItem" custom: type: object description: | @@ -32,7 +32,7 @@ propertyNames: description: | To allow additional properties for compatibility with other standards at the "table" , or root, but not included in the core `properties` set: - [Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys` + [Frictionless Data package table schema standard](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys` enum: # core properties - title diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 058e829..3d4ca5f 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -14,7 +14,7 @@ required: - description properties: schemaVersion: - $ref: "#/definitions/schemaVersion" + $ref: "#/_definitions/schemaVersion" section: type: string title: Section @@ -235,7 +235,7 @@ properties: - ["Not required","NOT REQUIRED"] - ["No"] standardsMappings: - $ref: "#/definitions/fieldStandardsMappingsItem" + $ref: "#/_definitions/fieldStandardsMappingsItem" custom: type: object description: | From 7d5fa212a3bdd735bd5ae109990cae8b82504e94 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 09:49:17 -0600 Subject: [PATCH 68/72] Update build --- .../jsonschema-csvtemplate-fields.html | 2 +- .../jsonschema-jsontemplate-data-dictionary.html | 2 +- .../schemas/frictionless/csvtemplate/fields.json | 14 +++++++------- .../schemas/jsonschema/data-dictionary.json | 2 +- 4 files changed, 10 insertions(+), 10 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html 
b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index 7e75a7c..cbc1eb4 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -24,4 +24,4 @@

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: string

Additional properties not included a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
\ No newline at end of file +

Type: string

Additional properties not included a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index 98abe97..a21bafa 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -126,4 +126,4 @@

Type: object

A standardized item (ie field, variable etc) mapped to this individual variable.

Type: string Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: object

Additional properties not included a core property.

\ No newline at end of file +

Type: object

Additional properties not included a core property.

\ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 1d6a76c..32acae9 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "yearmonth", - "boolean", "geopoint", + "string", + "boolean", + "integer", + "time", "date", - "number", "year", "any", - "datetime", + "yearmonth", "duration", - "integer", - "string", - "time" + "datetime", + "number" ] } }, diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index a44e02c..89a8748 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -372,7 +372,7 @@ } }, "propertyNames": { - "description": "To allow additional properties for compatibility with other standards at the \"table\" , or root, but not included in the core `properties` set:\n\n[Frictionless Data package table schema standard)](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys`\n", + "description": "To allow additional properties for compatibility with other standards at the \"table\" , or root, but not included in the core `properties` set:\n\n[Frictionless Data package table schema standard](https://specs.frictionlessdata.io/table-schema): `missingValues`|`primaryKey`|`foreignKeys`\n", "enum": [ "title", "description", From 3686b3fbd2de7691fe10f6a6337633807935fcdc Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 14:04:06 -0600 Subject: [PATCH 69/72] Added back an updated relatedConcepts property --- .../schemas/dictionary/_definitions.yaml | 55 +++++++++++++++++++ 
.../schemas/dictionary/fields.yaml | 8 ++- 2 files changed, 60 insertions(+), 3 deletions(-) diff --git a/variable-level-metadata-schema/schemas/dictionary/_definitions.yaml b/variable-level-metadata-schema/schemas/dictionary/_definitions.yaml index 0914c33..e34bae3 100644 --- a/variable-level-metadata-schema/schemas/dictionary/_definitions.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/_definitions.yaml @@ -229,3 +229,58 @@ fieldStandardsMappingsItem: examples: - "C74457" +relatedConcepts: + title: Related Concepts + description: | + __**[Under development]**__ Mappings to a published set of concepts related to the given field such as + ontological information (eg., NCI thesaurus, bioportal etc) + type: array + items: + type: object + properties: + url: + title: Related Concepts - Url + description: | + The url that links out to the published, related concept. + The listed examples could both be attached to any variable related to, for example, heroin use. + + > :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + type: string + format: uri + examples: + - https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808 + - http://purl.bioontology.org/ontology/RXNORM/3304 + + title: + title: Related concepts - Type + description: | + A human-readable title (ie label) to a concept related to the given field. + The listed examples could both be attached to any variable related to, for example, heroin use. + + > :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + type: string + examples: + - Heroin Molecular Structure + - Heroin Ontology + source: + title: Related Concepts - Source + description: | + The source (e.g., a dictionary or vocabulary set) to a concept related to the given field. + The listed examples could both be attached to any variable related to, for example, heroin use. 
+ + > :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + type: string + examples: + - CHEBI + - RXNORM + id: + title: Related Concepts - Id + type: string + description: | + The id locating the individual concept within the source of the given field. + The listed examples could both be attached to any variable related to, for example, heroin use. + + > :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + examples: + - "27808" + - "3304" diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index 3d4ca5f..b4f8e67 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -234,9 +234,11 @@ properties: examples: - ["Not required","NOT REQUIRED"] - ["No"] - standardsMappings: - $ref: "#/_definitions/fieldStandardsMappingsItem" custom: type: object description: | - Additional properties not included a core property. \ No newline at end of file + Additional properties not included a core property. 
+ relatedConcepts: + $ref: "#/_definitions/relatedConcepts" + standardsMappings: + $ref: "#/_definitions/fieldStandardsMappingsItem" \ No newline at end of file From f9dc40107875eda8457a07b8f6b21dd06c9f9f27 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 14:04:42 -0600 Subject: [PATCH 70/72] Update build --- .../jsonschema-csvtemplate-fields.html | 10 ++- ...onschema-jsontemplate-data-dictionary.html | 12 ++- .../jsonschema-csvtemplate-fields.md | 83 ++++++++++++++++++- ...jsonschema-jsontemplate-data-dictionary.md | 12 ++- .../frictionless/csvtemplate/fields.json | 70 ++++++++++++---- .../jsonschema/csvtemplate/fields.json | 47 +++++++++-- .../schemas/jsonschema/data-dictionary.json | 55 +++++++++++- .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 12 ++- 9 files changed, 267 insertions(+), 36 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index cbc1eb4..a86efcf 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -17,6 +17,14 @@
"required"
 

Type: string

For boolean (false) variable (as defined in type field), this field allows
a physical string representation to be cast as false (increasing
readability of the field) that is not a standard false value. It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required|NOT REQUIRED"
 
"No"
+

Type: string

Additional properties not included as a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$

Type: string, Format: uri

The url that links out to the published, related concept.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808"
+
"http://purl.bioontology.org/ontology/RXNORM/3304"
+

Type: string

A human-readable title (i.e., label) for a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"Heroin Molecular Structure"
+
"Heroin Ontology"
+

Type: string

The source (e.g., a dictionary or vocabulary set) of a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"CHEBI"
+
"RXNORM"
+

Type: string

The id locating the individual concept within the source of the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"27808"
+
"3304"
 

Type: string, Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
 

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
 
"adult-demographics"
@@ -24,4 +32,4 @@
 

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: string

Additional properties not included as a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$
\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html index a21bafa..2eab9c4 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html @@ -62,7 +62,15 @@
[
     "No"
 ]
-

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or an instrument that the field is
part of.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
+

Type: object

Additional properties not included as a core property.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.)

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, related concept.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808"
+
"http://purl.bioontology.org/ontology/RXNORM/3304"
+

Type: string

A human-readable title (i.e., label) for a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"Heroin Molecular Structure"
+
"Heroin Ontology"
+

Type: string

The source (e.g., a dictionary or vocabulary set) of a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"CHEBI"
+
"RXNORM"
+

Type: string

The id locating the individual concept within the source of the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"27808"
+
"3304"
+

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or an instrument that the field is
part of.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
     {
         "instrument": {
             "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
@@ -126,4 +134,4 @@
 

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-

Type: object

Additional properties not included as a core property.

\ No newline at end of file + \ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index ef18056..d5de8ff 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -252,6 +252,86 @@ Examples: ``` +**`custom`** _(string)_ + Additional properties not included a core property. + + +**`relatedConcepts[0].url`** _(string)_ + The url that links out to the published, related concept. +The listed examples could both be attached to any variable related to, for example, heroin use. + +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + +Examples: + + +``` + https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808 + +``` + +``` + http://purl.bioontology.org/ontology/RXNORM/3304 + +``` + +**`relatedConcepts[0].title`** _(string)_ + A human-readable title (ie label) to a concept related to the given field. +The listed examples could both be attached to any variable related to, for example, heroin use. + +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + +Examples: + + +``` + Heroin Molecular Structure + +``` + +``` + Heroin Ontology + +``` + +**`relatedConcepts[0].source`** _(string)_ + The source (e.g., a dictionary or vocabulary set) to a concept related to the given field. +The listed examples could both be attached to any variable related to, for example, heroin use. 
+ +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + +Examples: + + +``` + CHEBI + +``` + +``` + RXNORM + +``` + +**`relatedConcepts[0].id`** _(string)_ + The id locating the individual concept within the source of the given field. +The listed examples could both be attached to any variable related to, for example, heroin use. + +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ + +Examples: + + +``` + 27808 + +``` + +``` + 3304 + +``` + **`standardsMappings[0].instrument.url`** _(string)_ A url (e.g., link, address) to a file or other resource containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) @@ -335,9 +415,6 @@ Examples: ``` -**`custom`** _(string)_ - Additional properties not included a core property. - ## End of schema - Additional Property information diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 1df660a..2107609 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -300,6 +300,15 @@ Examples: ``` +**`custom`** _(object)_ + Additional properties not included a core property. + + +**`relatedConcepts`** _(array)_ + __**[Under development]**__ Mappings to a published set of concepts related to the given field such as +ontological information (eg., NCI thesaurus, bioportal etc) + + **`standardsMappings`** _(array)_ A set of instrument and item references to standardized data elements designed to document @@ -397,9 +406,6 @@ Two separate records. 
If desired, multiple standard mappings can be entered, say ``` -**`custom`** _(object)_ - Additional properties not included a core property. - ### Additional `fields` property information #### `type` enum definitions: diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 32acae9..205ddd9 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,17 +67,17 @@ "type": "string", "constraints": { "enum": [ + "duration", "geopoint", - "string", - "boolean", + "yearmonth", + "datetime", "integer", - "time", - "date", + "string", "year", "any", - "yearmonth", - "duration", - "datetime", + "time", + "date", + "boolean", "number" ] } @@ -189,6 +189,54 @@ "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" } }, + { + "name": "custom", + "description": "Additional properties not included a core property. \n", + "type": "string", + "constraints": { + "pattern": "^(?:.*?=.*?(?:\\||$))+$" + } + }, + { + "name": "relatedConcepts[0].url", + "description": "The url that links out to the published, related concept. 
\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related Concepts - Url", + "examples": [ + "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", + "http://purl.bioontology.org/ontology/RXNORM/3304" + ], + "type": "string" + }, + { + "name": "relatedConcepts[0].title", + "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related concepts - Type", + "examples": [ + "Heroin Molecular Structure", + "Heroin Ontology" + ], + "type": "string" + }, + { + "name": "relatedConcepts[0].source", + "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related Concepts - Source", + "examples": [ + "CHEBI", + "RXNORM" + ], + "type": "string" + }, + { + "name": "relatedConcepts[0].id", + "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related Concepts - Id", + "examples": [ + "27808", + "3304" + ], + "type": "string" + }, { "name": "standardsMappings[0].instrument.url", "description": "A url (e.g., link, address) to a 
file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", @@ -253,14 +301,6 @@ "C74457" ], "type": "string" - }, - { - "name": "custom", - "description": "Additional properties not included a core property. ", - "type": "string", - "constraints": { - "pattern": "^(?:.*?=.*?(?:\\||$))+$" - } } ], "missingValues": [ diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index ab6ee2a..b974042 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -158,6 +158,48 @@ ], "pattern": "^(?:[^|]+\\||[^|]*)(?:[^|]*\\|)*[^|]*$" }, + "custom": { + "type": "string", + "description": "Additional properties not included a core property. \n", + "pattern": "^(?:.*?=.*?(?:\\||$))+$" + }, + "relatedConcepts[0].url": { + "title": "Related Concepts - Url", + "description": "The url that links out to the published, related concept. 
\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "format": "uri", + "examples": [ + "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", + "http://purl.bioontology.org/ontology/RXNORM/3304" + ] + }, + "relatedConcepts[0].title": { + "title": "Related concepts - Type", + "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "Heroin Molecular Structure", + "Heroin Ontology" + ] + }, + "relatedConcepts[0].source": { + "title": "Related Concepts - Source", + "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "CHEBI", + "RXNORM" + ] + }, + "relatedConcepts[0].id": { + "title": "Related Concepts - Id", + "type": "string", + "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "examples": [ + "27808", + "3304" + ] + }, "standardsMappings[0].instrument.url": { "title": "Url", "description": "A url (e.g., link, address) to a file or other resource containing the 
instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", @@ -215,11 +257,6 @@ "examples": [ "C74457" ] - }, - "custom": { - "type": "string", - "description": "Additional properties not included a core property. ", - "pattern": "^(?:.*?=.*?(?:\\||$))+$" } } } \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 89a8748..044c5c2 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -282,6 +282,57 @@ ] ] }, + "custom": { + "type": "object", + "description": "Additional properties not included a core property. \n" + }, + "relatedConcepts": { + "title": "Related Concepts", + "description": "__**[Under development]**__ Mappings to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", + "type": "array", + "items": { + "type": "object", + "properties": { + "url": { + "title": "Related Concepts - Url", + "description": "The url that links out to the published, related concept. 
\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "format": "uri", + "examples": [ + "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", + "http://purl.bioontology.org/ontology/RXNORM/3304" + ] + }, + "title": { + "title": "Related concepts - Type", + "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "Heroin Molecular Structure", + "Heroin Ontology" + ] + }, + "source": { + "title": "Related Concepts - Source", + "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "CHEBI", + "RXNORM" + ] + }, + "id": { + "title": "Related Concepts - Id", + "type": "string", + "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "examples": [ + "27808", + "3304" + ] + } + } + } + }, "standardsMappings": { "type": "array", "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements 
program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \"5141\"\n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"id\": \"5141\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. 
If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", @@ -362,10 +413,6 @@ } } } - }, - "custom": { - "type": "object", - "description": "Additional properties not included a core property. " } } } diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index 3d0b94d..72ed190 100644 --- a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,custom \ No newline at end of file 
+schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,custom,relatedConcepts[0].url,relatedConcepts[0].title,relatedConcepts[0].source,relatedConcepts[0].id,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 5a7c0c6..3dff95a 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -37,6 +37,15 @@ "missingValues": [], "trueValues": [], "falseValues": [], + "custom": {}, + "relatedConcepts": [ + { + "url": null, + "title": null, + "source": null, + "id": null + } + ], "standardsMappings": [ { "instrument": { @@ -51,8 +60,7 @@ "id": null } } - ], - "custom": {} + ] } ] } From 7630f342d1c9db21dba7c64391ab307c29a27f12 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 14:14:41 -0600 Subject: [PATCH 71/72] minor updates to README --- variable-level-metadata-schema/README.md | 18 +++++++----------- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/variable-level-metadata-schema/README.md b/variable-level-metadata-schema/README.md index acd7474..1dc264c 100644 --- a/variable-level-metadata-schema/README.md +++ b/variable-level-metadata-schema/README.md @@ -73,7 +73,8 @@ in a markdown format and an interactive html format. - `schemas/dictionary`: the yaml files used to generate json schemas and documentation with build.py. - `templates`: empty templates in csv spreadsheet format and JSON format. 
- `examples`: exapmles of filled out templates in csv spreadsheet format and JSON format. -- `build.py`: This script compiles the yaml files and generates associated jsonschemas and frictionless schemas in addition to the human rendered schemas +- `build.py`: This script compiles the yaml files and generates associated schemas in addition to the human rendered schema + documentation. ## Contributing @@ -87,7 +88,7 @@ To contribute to the variable level metadata specification (and annotations/exam ❗ Please read the below conventions and principles before contributing and review the existing `dictionary` directory. -## Conventions, principles, and rules +## Conventions, principles, and rules for annotations and csv <> json translation ### Annotation/documentation properties 1. `description`: SHOULD be created as markdown syntax without any headers as headers are applied in the templates. @@ -167,9 +168,6 @@ To facilitate the mapping of json spec property names to csv property names, th 1. Currently, no complex types (`anyOf`,`oneOf`) are supported and the `type` MUST be specified. This is to ensure coverage for all csv to json translation use cases. - Each json specification schema property type must be a scalar (e.g., `boolean`,`string`,`integer`,`number`), an `array`, or an `object` - Each csv specification schema property type must be a scalar (e.g., `boolean`,`string`,`integer`,`number`) -2. `enum` restrictions - - following from (1), an `enum` must only contain values of the same type - - (at least currently) MUST contain only types supported by csv fields which include scalar types (e.g., `boolean`,`string`,`integer`,`number`) in addition to type `object` as this has a stringified representation (see above). 
### csv to json and json to csv translations @@ -192,14 +190,9 @@ All root level properties will be applied to individual fields IF this same fiel ### csv and json vlmd document file naming -File names for json and csv translations of a vlmd document SHOULD +File names for json and csv translations of a vlmd document are suggested to have the same stem name with corresponding "csv" and "json" suffixes (eg `my-heal-dd.csv` and `my-heal-dd.json`) -## Considerations - -Please use github issues for any additional considerations. See additional comments above. - - ## Additional table-level (root) and field-level properties Some table-level or field-level properties in other standards (or custom properties in specific use cases) do not map onto @@ -214,3 +207,6 @@ a core HEAL property. To allow these properties to be included, we list these pr One consideration, however, is that `propertyNames` was introduced in json schema draft-6. +## Considerations + +Please use github issues for any additional considerations. See additional comments above. 
\ No newline at end of file From 908517ff19a8c622fcaba0fcafb7051054c39611 Mon Sep 17 00:00:00 2001 From: Michael Kranz Date: Tue, 23 Jan 2024 14:20:37 -0600 Subject: [PATCH 72/72] Update build --- .../jsonschema-csvtemplate-fields.html | 18 +-- ...onschema-jsontemplate-data-dictionary.html | 20 +-- .../jsonschema-csvtemplate-fields.md | 138 +++++++++--------- ...jsonschema-jsontemplate-data-dictionary.md | 10 +- .../schemas/dictionary/fields.yaml | 6 +- .../frictionless/csvtemplate/fields.json | 94 ++++++------ .../jsonschema/csvtemplate/fields.json | 74 +++++----- .../schemas/jsonschema/data-dictionary.json | 94 ++++++------ .../templates/template_submission.csv | 2 +- .../templates/template_submission.json | 16 +- 10 files changed, 236 insertions(+), 236 deletions(-) diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html index a86efcf..6b62ce8 100644 --- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html +++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-csvtemplate-fields.html @@ -17,7 +17,14 @@
"required"
 

Type: string

For a boolean variable (as defined in the type field), this field lists
physical string representations that should be cast as false but are not
standard false values (increasing readability of the field). It can include one or more values.

Must match regular expression: ^(?:[^|]+\||[^|]*)(?:[^|]*\|)*[^|]*$
Examples:

"Not required|NOT REQUIRED"
 
"No"
-

Type: string

Additional properties not included as a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$

Type: string, Format: uri

The url that links out to the published, related concept.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you want to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808"
+

Type: string

Additional properties not included as a core property.

Must match regular expression: ^(?:.*?=.*?(?:\||$))+$

Type: string, Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
+

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
+
"adult-demographics"
+

Type: string

A code or other string that identifies the instrument within the source.
This should always come from the source's formal, standardized identification system.


Example:

"5141"
+

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
+

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
+

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.
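The dependency described above (an id requires a source) can be checked per CSV row. This is a minimal sketch under two assumptions: the row is a plain dictionary keyed by the flattened CSV column names, and the source column meant by the description is `standardsMappings[0].item.source`. The function name `item_source_missing` is hypothetical:

```python
# Flattened column names as they appear in the CSV template.
ITEM_ID = "standardsMappings[0].item.id"
ITEM_SOURCE = "standardsMappings[0].item.source"  # assumed to be the intended source column


def item_source_missing(row: dict) -> bool:
    """Return True when a row gives an item id without the required item source."""
    has_id = bool((row.get(ITEM_ID) or "").strip())
    has_source = bool((row.get(ITEM_SOURCE) or "").strip())
    return has_id and not has_source


print(item_source_missing({ITEM_ID: "C74457"}))
print(item_source_missing({ITEM_ID: "C74457", ITEM_SOURCE: "CDISC"}))
```

Blank or whitespace-only cells are treated as absent, which matches how empty CSV cells are usually read.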


Example:

"C74457"
+

Type: string, Format: uri

The url that links out to the published, related concept.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808"
 
"http://purl.bioontology.org/ontology/RXNORM/3304"
 

Type: string

A human-readable title (i.e., label) for a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"Heroin Molecular Structure"
 
"Heroin Ontology"
@@ -25,11 +32,4 @@
 
"RXNORM"
 

Type: string

The id locating the individual concept within the source of the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"27808"
 
"3304"
-

Type: string, Format: uri

A url (e.g., link, address) to a file or other resource containing the instrument, or
a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).


Example:

"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx"
-

Type: enum (of string)

An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository)
containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level)
or the individual variable (if at the field level).

Must be one of:

  • "heal-cde"

Type: string

Examples:

"Adult demographics"
-
"adult-demographics"
-

Type: string

A code or other string that identifies the instrument within the source.
This should always be from the source's formal, standardized identification system.


Example:

"5141"
-

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
-

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
-

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file
+
\ No newline at end of file
diff --git a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
index 2eab9c4..e159954 100644
--- a/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
+++ b/variable-level-metadata-schema/docs/html-rendered-schemas/jsonschema-jsontemplate-data-dictionary.html
@@ -62,15 +62,7 @@
[
     "No"
 ]
-

Type: object

Additional properties not included as a core property.

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, related concept.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808"
-
"http://purl.bioontology.org/ontology/RXNORM/3304"
-

Type: string

A human-readable title (i.e., label) for a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"Heroin Molecular Structure"
-
"Heroin Ontology"
-

Type: string

The source (e.g., a dictionary or vocabulary set) of a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"CHEBI"
-
"RXNORM"
-

Type: string

The id locating the individual concept within the source of the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"27808"
-
"3304"
-

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or an instrument of which the field is
a part.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
+

Type: object

Additional properties not included as a core property.

Type: array of object

A set of instrument and item references to standardized data elements designed to document
the HEAL common data elements program
and other standardized/common element sources to facilitate cross-study comparison and interoperability
of data. One can map either an individual data element or an instrument of which the field is
a part.

*All Fields Mapped (Both Instrument and Item)*

"standardsMappings": [
     {
         "instrument": {
             "url": "https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx",
@@ -134,4 +126,12 @@
 

Type: object

A standardized item (i.e., field, variable, etc.) mapped to this individual variable.

Type: string, Format: uri

The url that links out to the published, standardized mapping of a variable (e.g., common data element)


Example:

"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE"
 

Type: string

The source of the standardized variable. Note, this property is required if
an id is specified.


Example:

"CDISC"
 

Type: string

The id locating the individual mapping within the given source.
Note, the standardsMappings[0].source property is required if
this property is specified.


Example:

"C74457"
-
\ No newline at end of file
+

Type: array of object

*[Under development]* Mappings to a published set of concepts related to the given field, such as
ontological information (e.g., NCI Thesaurus, BioPortal, etc.).
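As an illustration of the array described above, the sketch below assembles two `relatedConcepts` entries from the example values given for the `url`, `title`, `source`, and `id` properties, then runs a loose check that each `url` looks like a URI. Pairing the example values by position is an assumption, and `looks_like_uri` and `invalid_urls` are hypothetical helpers rather than part of any validator:

```python
from urllib.parse import urlparse

# Entries built from the schema's example values; the pairing is assumed.
related_concepts = [
    {
        "url": "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808",
        "title": "Heroin Molecular Structure",
        "source": "CHEBI",
        "id": "27808",
    },
    {
        "url": "http://purl.bioontology.org/ontology/RXNORM/3304",
        "title": "Heroin Ontology",
        "source": "RXNORM",
        "id": "3304",
    },
]


def looks_like_uri(value: str) -> bool:
    """Loose check: the value has both a scheme and a network location."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


def invalid_urls(concepts: list[dict]) -> list[int]:
    """Return the indexes of entries whose url fails the loose URI check."""
    return [
        index
        for index, concept in enumerate(concepts)
        if "url" in concept and not looks_like_uri(concept["url"])
    ]


print(invalid_urls(related_concepts))
```

This check is deliberately weaker than a full `format: uri` validation; it only catches values with no scheme or host.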

Each item of this array must be:

Type: object

Type: string, Format: uri

The url that links out to the published, related concept.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808"
+
"http://purl.bioontology.org/ontology/RXNORM/3304"
+

Type: string

A human-readable title (i.e., label) for a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"Heroin Molecular Structure"
+
"Heroin Ontology"
+

Type: string

The source (e.g., a dictionary or vocabulary set) of a concept related to the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"CHEBI"
+
"RXNORM"
+

Type: string

The id locating the individual concept within the source of the given field.
The listed examples could both be attached to any variable related to, for example, heroin use.

:point_up: If you are looking to map field values to common data elements or a set of standards, see standardsMappings.


Examples:

"27808"
+
"3304"
+
\ No newline at end of file diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md index d5de8ff..b1e871e 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-csvtemplate-fields.md @@ -256,162 +256,162 @@ Examples: Additional properties not included a core property. -**`relatedConcepts[0].url`** _(string)_ - The url that links out to the published, related concept. -The listed examples could both be attached to any variable related to, for example, heroin use. - -> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ +**`standardsMappings[0].instrument.url`** _(string)_ + A url (e.g., link, address) to a file or other resource containing the instrument, or +a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) +or the individual variable (if at the field level). Examples: ``` - https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808 - -``` - -``` - http://purl.bioontology.org/ontology/RXNORM/3304 + https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx ``` -**`relatedConcepts[0].title`** _(string)_ - A human-readable title (ie label) to a concept related to the given field. -The listed examples could both be attached to any variable related to, for example, heroin use. 
+**`standardsMappings[0].instrument.source`** _(string)_ + An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository) +containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) +or the individual variable (if at the field level). -> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ +Must be one of: `heal-cde` +**`standardsMappings[0].instrument.title`** _(string)_ + Examples: ``` - Heroin Molecular Structure + Adult demographics ``` ``` - Heroin Ontology + adult-demographics ``` -**`relatedConcepts[0].source`** _(string)_ - The source (e.g., a dictionary or vocabulary set) to a concept related to the given field. -The listed examples could both be attached to any variable related to, for example, heroin use. - -> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ +**`standardsMappings[0].instrument.id`** _(string)_ + A code or other string that identifies the instrument within the source. +This should always be from the source's formal, standardized identification system Examples: ``` - CHEBI - -``` - -``` - RXNORM + 5141 ``` -**`relatedConcepts[0].id`** _(string)_ - The id locating the individual concept within the source of the given field. -The listed examples could both be attached to any variable related to, for example, heroin use. 
- -> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ +**`standardsMappings[0].item.url`** _(string)_ + The url that links out to the published, standardized mapping of a variable (e.g., common data element) Examples: ``` - 27808 + https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE ``` +**`standardsMappings[0].item.source`** _(string)_ + The source of the standardized variable. Note, this property is required if +an id is specified. + +Examples: + + ``` - 3304 + CDISC ``` -**`standardsMappings[0].instrument.url`** _(string)_ - A url (e.g., link, address) to a file or other resource containing the instrument, or -a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) -or the individual variable (if at the field level). +**`standardsMappings[0].item.id`** _(string)_ + The id locating the individual mapping within the given source. +Note, the `standardsMappings[0].source` property is required if +this property is specified. Examples: ``` - https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx + C74457 ``` -**`standardsMappings[0].instrument.source`** _(string)_ - An abbreviated name/acronym from a controlled vocabulary referencing the resource (e.g., program or repository) -containing the instrument, or a set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) -or the individual variable (if at the field level). +**`relatedConcepts[0].url`** _(string)_ + The url that links out to the published, related concept. +The listed examples could both be attached to any variable related to, for example, heroin use. 
-Must be one of: `heal-cde` +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ -**`standardsMappings[0].instrument.title`** _(string)_ - Examples: ``` - Adult demographics + https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808 ``` ``` - adult-demographics + http://purl.bioontology.org/ontology/RXNORM/3304 ``` -**`standardsMappings[0].instrument.id`** _(string)_ - A code or other string that identifies the instrument within the source. -This should always be from the source's formal, standardized identification system +**`relatedConcepts[0].title`** _(string)_ + A human-readable title (ie label) to a concept related to the given field. +The listed examples could both be attached to any variable related to, for example, heroin use. + +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ Examples: ``` - 5141 + Heroin Molecular Structure ``` -**`standardsMappings[0].item.url`** _(string)_ - The url that links out to the published, standardized mapping of a variable (e.g., common data element) - -Examples: - - ``` - https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE + Heroin Ontology ``` -**`standardsMappings[0].item.source`** _(string)_ - The source of the standardized variable. Note, this property is required if -an id is specified. +**`relatedConcepts[0].source`** _(string)_ + The source (e.g., a dictionary or vocabulary set) to a concept related to the given field. +The listed examples could both be attached to any variable related to, for example, heroin use. + +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ Examples: ``` - CDISC + CHEBI ``` -**`standardsMappings[0].item.id`** _(string)_ - The id locating the individual mapping within the given source. 
-Note, the `standardsMappings[0].source` property is required if -this property is specified. +``` + RXNORM + +``` + +**`relatedConcepts[0].id`** _(string)_ + The id locating the individual concept within the source of the given field. +The listed examples could both be attached to any variable related to, for example, heroin use. + +> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_ Examples: ``` - C74457 + 27808 + +``` + +``` + 3304 ``` diff --git a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md index 2107609..c569d2c 100644 --- a/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md +++ b/variable-level-metadata-schema/docs/md-rendered-schemas/jsonschema-jsontemplate-data-dictionary.md @@ -304,11 +304,6 @@ Examples: Additional properties not included a core property. -**`relatedConcepts`** _(array)_ - __**[Under development]**__ Mappings to a published set of concepts related to the given field such as -ontological information (eg., NCI thesaurus, bioportal etc) - - **`standardsMappings`** _(array)_ A set of instrument and item references to standardized data elements designed to document @@ -406,6 +401,11 @@ Two separate records. 
If desired, multiple standard mappings can be entered, say ``` +**`relatedConcepts`** _(array)_ + __**[Under development]**__ Mappings to a published set of concepts related to the given field such as +ontological information (eg., NCI thesaurus, bioportal etc) + + ### Additional `fields` property information #### `type` enum definitions: diff --git a/variable-level-metadata-schema/schemas/dictionary/fields.yaml b/variable-level-metadata-schema/schemas/dictionary/fields.yaml index b4f8e67..d0c07e9 100644 --- a/variable-level-metadata-schema/schemas/dictionary/fields.yaml +++ b/variable-level-metadata-schema/schemas/dictionary/fields.yaml @@ -238,7 +238,7 @@ properties: type: object description: | Additional properties not included a core property. - relatedConcepts: - $ref: "#/_definitions/relatedConcepts" standardsMappings: - $ref: "#/_definitions/fieldStandardsMappingsItem" \ No newline at end of file + $ref: "#/_definitions/fieldStandardsMappingsItem" + relatedConcepts: + $ref: "#/_definitions/relatedConcepts" \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json index 205ddd9..50a83e3 100644 --- a/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/frictionless/csvtemplate/fields.json @@ -67,18 +67,18 @@ "type": "string", "constraints": { "enum": [ - "duration", + "time", + "number", "geopoint", + "any", "yearmonth", - "datetime", - "integer", - "string", "year", - "any", - "time", + "datetime", "date", + "integer", + "duration", "boolean", - "number" + "string" ] } }, @@ -197,46 +197,6 @@ "pattern": "^(?:.*?=.*?(?:\\||$))+$" } }, - { - "name": "relatedConcepts[0].url", - "description": "The url that links out to the published, related concept. 
\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "title": "Related Concepts - Url", - "examples": [ - "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", - "http://purl.bioontology.org/ontology/RXNORM/3304" - ], - "type": "string" - }, - { - "name": "relatedConcepts[0].title", - "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "title": "Related concepts - Type", - "examples": [ - "Heroin Molecular Structure", - "Heroin Ontology" - ], - "type": "string" - }, - { - "name": "relatedConcepts[0].source", - "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "title": "Related Concepts - Source", - "examples": [ - "CHEBI", - "RXNORM" - ], - "type": "string" - }, - { - "name": "relatedConcepts[0].id", - "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "title": "Related Concepts - Id", - "examples": [ - "27808", - "3304" - ], - "type": "string" - }, { "name": "standardsMappings[0].instrument.url", "description": "A url (e.g., link, address) to a 
file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", @@ -301,6 +261,46 @@ "C74457" ], "type": "string" + }, + { + "name": "relatedConcepts[0].url", + "description": "The url that links out to the published, related concept. \nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related Concepts - Url", + "examples": [ + "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", + "http://purl.bioontology.org/ontology/RXNORM/3304" + ], + "type": "string" + }, + { + "name": "relatedConcepts[0].title", + "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related concepts - Type", + "examples": [ + "Heroin Molecular Structure", + "Heroin Ontology" + ], + "type": "string" + }, + { + "name": "relatedConcepts[0].source", + "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related Concepts - Source", + "examples": [ + "CHEBI", + "RXNORM" + ], + "type": "string" + }, + { + "name": "relatedConcepts[0].id", + "description": "The id locating the individual concept within the source of the given field.\nThe listed 
examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "title": "Related Concepts - Id", + "examples": [ + "27808", + "3304" + ], + "type": "string" } ], "missingValues": [ diff --git a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json index b974042..325a9c3 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json +++ b/variable-level-metadata-schema/schemas/jsonschema/csvtemplate/fields.json @@ -163,43 +163,6 @@ "description": "Additional properties not included a core property. \n", "pattern": "^(?:.*?=.*?(?:\\||$))+$" }, - "relatedConcepts[0].url": { - "title": "Related Concepts - Url", - "description": "The url that links out to the published, related concept. \nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "type": "string", - "format": "uri", - "examples": [ - "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", - "http://purl.bioontology.org/ontology/RXNORM/3304" - ] - }, - "relatedConcepts[0].title": { - "title": "Related concepts - Type", - "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "type": "string", - "examples": [ - "Heroin Molecular Structure", - "Heroin Ontology" - ] - }, - "relatedConcepts[0].source": { - "title": "Related Concepts - Source", - "description": "The source (e.g., a 
dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "type": "string", - "examples": [ - "CHEBI", - "RXNORM" - ] - }, - "relatedConcepts[0].id": { - "title": "Related Concepts - Id", - "type": "string", - "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "examples": [ - "27808", - "3304" - ] - }, "standardsMappings[0].instrument.url": { "title": "Url", "description": "A url (e.g., link, address) to a file or other resource containing the instrument, or\na set of items which encompass a variable in this variable level metadata document (if at the root level or the document level) \nor the individual variable (if at the field level). \n", @@ -257,6 +220,43 @@ "examples": [ "C74457" ] + }, + "relatedConcepts[0].url": { + "title": "Related Concepts - Url", + "description": "The url that links out to the published, related concept. 
\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "format": "uri", + "examples": [ + "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", + "http://purl.bioontology.org/ontology/RXNORM/3304" + ] + }, + "relatedConcepts[0].title": { + "title": "Related concepts - Type", + "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "Heroin Molecular Structure", + "Heroin Ontology" + ] + }, + "relatedConcepts[0].source": { + "title": "Related Concepts - Source", + "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "CHEBI", + "RXNORM" + ] + }, + "relatedConcepts[0].id": { + "title": "Related Concepts - Id", + "type": "string", + "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "examples": [ + "27808", + "3304" + ] } } } \ No newline at end of file diff --git a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json 
b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json index 044c5c2..05c88d5 100644 --- a/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json +++ b/variable-level-metadata-schema/schemas/jsonschema/data-dictionary.json @@ -286,53 +286,6 @@ "type": "object", "description": "Additional properties not included a core property. \n" }, - "relatedConcepts": { - "title": "Related Concepts", - "description": "__**[Under development]**__ Mappings to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", - "type": "array", - "items": { - "type": "object", - "properties": { - "url": { - "title": "Related Concepts - Url", - "description": "The url that links out to the published, related concept. \nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "type": "string", - "format": "uri", - "examples": [ - "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", - "http://purl.bioontology.org/ontology/RXNORM/3304" - ] - }, - "title": { - "title": "Related concepts - Type", - "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "type": "string", - "examples": [ - "Heroin Molecular Structure", - "Heroin Ontology" - ] - }, - "source": { - "title": "Related Concepts - Source", - "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for 
mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "type": "string", - "examples": [ - "CHEBI", - "RXNORM" - ] - }, - "id": { - "title": "Related Concepts - Id", - "type": "string", - "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", - "examples": [ - "27808", - "3304" - ] - } - } - } - }, "standardsMappings": { "type": "array", "description": "\nA set of instrument and item references to standardized data elements designed to document\nthe [HEAL common data elements program](https://heal.nih.gov/data/common-data-elements)\nand other standardized/common element sources to facilitate cross-study comparison and interoperability\nof data. One can either map an individual data element or an instrument in which the field is \na part of.\n\n__**All Fields Mapped (Both Instrument and Item)**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"url\": \"https://www.heal.nih.gov/files/CDEs/2023-05/adult-demographics-cdes.xlsx\",\n \"source\": \"heal-cde\",\n \"title\": \"adult-demographics\",\n \"id\": \"5141\"\n },\n \"item\": {\n \"url\": \"https://evs.nci.nih.gov/ftp1/CDISC/SDTM/SDTM%20Terminology.html#CL.C74457.RACE\",\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n }\n }\n]\n```\n\n__**Only Instrument Title of Form CDE File Mapped**__\n\nIn this scenario, especially as CDE variables do not have associated CDISC ids listed, only instrument information is given.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n }\n }\n]\n```\n\n__**Only Instrument ID of HEAL CDE Mapped**__\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n 
\"id\": \"5141\"\n }\n }\n]\n```\n\n__**Other Non-HEAL CDE Use Cases**__\n\nOnly item matched (for example if found in the NIH (not HEAL) CDE repository). Folks would enter the information in the \"Identifier\" section. Similar to the above, they could also just enter the \"url\".\n\n```json\n\"standardsMappings\": [\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n\n__**Multiple CDE Mappings**__\n\nTwo separate records. If desired, multiple standard mappings can be entered, say from the NIH HEAL CDE repo and the NIH CDE lookup (NLM) by way of two separate records in the list.\n\n```json\n\"standardsMappings\": [\n {\n \"instrument\": {\n \"source\": \"heal-cde\",\n \"title\": \"Adult demographics\"\n },\n \"item\": {\n \"source\": \"CDISC\",\n \"id\": \"C74457\"\n },\n },\n {\n \"item\": {\n \"source\": \"NLM\",\n \"id\": \"Fakc6Jy2x\"\n }\n }\n]\n```\n", @@ -413,6 +366,53 @@ } } } + }, + "relatedConcepts": { + "title": "Related Concepts", + "description": "__**[Under development]**__ Mappings to a published set of concepts related to the given field such as \nontological information (eg., NCI thesaurus, bioportal etc)\n", + "type": "array", + "items": { + "type": "object", + "properties": { + "url": { + "title": "Related Concepts - Url", + "description": "The url that links out to the published, related concept. 
\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "format": "uri", + "examples": [ + "https://www.ebi.ac.uk/chebi/chebiOntology.do?chebiId=CHEBI:27808", + "http://purl.bioontology.org/ontology/RXNORM/3304" + ] + }, + "title": { + "title": "Related concepts - Type", + "description": "A human-readable title (ie label) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "Heroin Molecular Structure", + "Heroin Ontology" + ] + }, + "source": { + "title": "Related Concepts - Source", + "description": "The source (e.g., a dictionary or vocabulary set) to a concept related to the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "type": "string", + "examples": [ + "CHEBI", + "RXNORM" + ] + }, + "id": { + "title": "Related Concepts - Id", + "type": "string", + "description": "The id locating the individual concept within the source of the given field.\nThe listed examples could both be attached to any variable related to, for example, heroin use.\n\n> :point_up: if you are looking for mapping field values to common data elements or a set of standards, see `standardsMappings`_\n", + "examples": [ + "27808", + "3304" + ] + } + } + } } } } diff --git a/variable-level-metadata-schema/templates/template_submission.csv b/variable-level-metadata-schema/templates/template_submission.csv index 72ed190..1e629e3 100644 --- 
a/variable-level-metadata-schema/templates/template_submission.csv +++ b/variable-level-metadata-schema/templates/template_submission.csv @@ -1 +1 @@ -schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,custom,relatedConcepts[0].url,relatedConcepts[0].title,relatedConcepts[0].source,relatedConcepts[0].id,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id \ No newline at end of file +schemaVersion,section,name,title,description,type,format,constraints.required,constraints.maxLength,constraints.enum,constraints.pattern,constraints.maximum,constraints.minimum,enumLabels,enumOrdered,missingValues,trueValues,falseValues,custom,standardsMappings[0].instrument.url,standardsMappings[0].instrument.source,standardsMappings[0].instrument.title,standardsMappings[0].instrument.id,standardsMappings[0].item.url,standardsMappings[0].item.source,standardsMappings[0].item.id,relatedConcepts[0].url,relatedConcepts[0].title,relatedConcepts[0].source,relatedConcepts[0].id \ No newline at end of file diff --git a/variable-level-metadata-schema/templates/template_submission.json b/variable-level-metadata-schema/templates/template_submission.json index 3dff95a..c8d2524 100644 --- a/variable-level-metadata-schema/templates/template_submission.json +++ b/variable-level-metadata-schema/templates/template_submission.json @@ -38,14 +38,6 @@ "trueValues": [], "falseValues": [], "custom": {}, - "relatedConcepts": [ - { - "url": null, - "title": null, - "source": null, - "id": null - } - ], "standardsMappings": [ { "instrument": { @@ -60,6 +52,14 @@ "id": null } } + ], + "relatedConcepts": [ + { + "url": null, + "title": null, + 
"source": null, + "id": null + } ] } ]