The write_datasetjson.sas macro appears not to be detecting an inconsistent dataType "integer" definition vs "decimal" or "float" in actual data when creating a json file referencing a Define-XML.
Note: it may not affect the round trip validation against the source data depending on the software used to read the json file. However, depending on the software used, it may cause data truncation.
I know that appropriate validation of inconsistencies between Define-XML and the data is expected to take place before considering data files valid; however, it would be convenient to detect and report the inconsistency in the write_datasetjson.sas macro.
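As a rough illustration of the kind of check being requested, here is a minimal Python sketch (not the SAS macro itself) that scans a Dataset-JSON file for columns declared as "integer" that contain non-integral numbers. It assumes the v1.1 layout with a top-level `columns` array (each entry having `name` and `dataType`) and a `rows` array of arrays. The function name `find_integer_mismatches` and the sample values are hypothetical and made up for the example.

```python
import json


def find_integer_mismatches(dataset):
    """Return {column name: first offending value} for every column that is
    declared "integer" but holds a number with a fractional part."""
    mismatches = {}
    for idx, col in enumerate(dataset["columns"]):
        if col.get("dataType") != "integer":
            continue
        for row in dataset["rows"]:
            value = row[idx]
            # Skip missing values; bool is a subclass of int in Python.
            if value is None or isinstance(value, bool):
                continue
            if isinstance(value, float) and not value.is_integer():
                mismatches[col["name"]] = value
                break
    return mismatches


# Illustrative data only, not taken from the released examples.
sample = json.loads("""
{"columns": [{"name": "USUBJID", "dataType": "string"},
             {"name": "AVAL", "dataType": "integer"},
             {"name": "AVISITN", "dataType": "integer"}],
 "rows": [["SUBJ-001", 4, 0],
          ["SUBJ-002", 4.5, 8]]}
""")

print(find_integer_mismatches(sample))  # {'AVAL': 4.5}
```

A writer could run such a check just before serialization and emit a warning per column rather than failing outright.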
Here is a visual display for the adadas.json using the Dicore Group's Dataset-JSON v1.1 Viewer, showing the AVAL definition and some decimal values:
List of all columns with the indicated inconsistent definitions in the examples released with Dataset-JSON v1.1:
Note: I'm finding the list of exceptions by using my apps to create the json files. I'm assigning "decimal" as json_datatype based on the data.
mhungria changed the title (Nov 21, 2024)
from: Macro write_datasetjson.sas inconsistent dataType "integer" vs "decimal" in actual data
to: Macro write_datasetjson.sas inconsistent dataType "integer" definition vs "decimal" in actual data
mhungria changed the title (Nov 21, 2024)
from: Macro write_datasetjson.sas inconsistent dataType "integer" definition vs "decimal" in actual data
to: Macro write_datasetjson.sas inconsistent dataType "integer" definition vs "decimal" or "float" in actual data
Hi Marcelina,
I was aware of some of these issues in the ADaM Define-XML file.
As you said, it really is a metadata problem. JSON itself has no concept of "decimal", "float", or even "integer". There is just "number", although JSON Schema does know "integer".
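To make the "just number" point concrete, here is a small Python sketch. Python's standard `json` module is only one possible reader; it decides between `int` and `float` purely from how the number is written, without any knowledge of the declared dataType. The last two lines show how a reader that forces a declared "integer" onto such a value would lose the fractional part.

```python
import json

# The same JSON "number" type, written four different ways.
values = json.loads("[1, 1.0, 4.5, 1e2]")
types = [type(v).__name__ for v in values]
print(types)  # ['int', 'float', 'float', 'float']

# Coercing to the declared "integer" type silently truncates 4.5.
truncated = int(values[2])
print(truncated)  # 4
```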
I do some limited checks on the consistency between the datatypes in the metadata and the variable types taken from the datasets.
I see the value of the check, but I feel it should be a separate step from writing or reading the JSON, since you also cannot assume consistency between data and metadata when reading.
And when reading Dataset-JSON, the check should probably also verify that there are no incomplete values for the datetime, date, and time data types.
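A completeness check of that kind could look roughly like the following Python sketch. It assumes that "complete" means a full `YYYY-MM-DD` date, a full `hh:mm:ss` time (optionally with fractional seconds and a time zone offset), and both parts joined by `T` for a datetime. That definition, the regular expressions, and the helper name `is_complete` are assumptions for illustration, not something taken from the specification or the macro.

```python
import re

# Assumed definition of "complete" ISO 8601 values (see lead-in).
DATE = r"\d{4}-\d{2}-\d{2}"
TIME = r"\d{2}:\d{2}:\d{2}(\.\d+)?"
ZONE = r"(Z|[+-]\d{2}:\d{2})?"

PATTERNS = {
    "date": re.compile(DATE),
    "time": re.compile(TIME + ZONE),
    "datetime": re.compile(DATE + "T" + TIME + ZONE),
}


def is_complete(value, data_type):
    """True if the string is a complete value for the given dataType."""
    return PATTERNS[data_type].fullmatch(value) is not None


print(is_complete("2013-07-19", "date"))              # True
print(is_complete("2013-07", "date"))                 # False (partial date)
print(is_complete("2013-07-19T08:30:00", "datetime")) # True
print(is_complete("2013-07-19T08:30", "datetime"))    # False (no seconds)
```

This only checks the shape of the string; it does not reject impossible values such as month 13.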
Btw, based on the data, how do you make a distinction between "decimal", "float", and "double"? Some of the variables above should actually be "float" or "double".
To your question: for now, I provide an overall execution parameter that makes the distinction based on the actual data and the Define-XML SignificantDigits for float definitions. However, as we know, the Define-XML may be wrong :). The corresponding SMEs would need to provide the input for it.
Note: the revised examples available via the DIcore Dataset-JSON Viewer were created with an overall execution parameter of >8 decimal digits to differentiate between the float numbers and string/decimal representation. One may provide a different distinction mechanism depending on the use case.
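One way such a threshold could work is sketched below in Python. This is not the actual implementation behind the revised examples; it simply counts the digits after the decimal point in the shortest round-trip representation of a value and labels anything with more than 8 as "float". The function names and the `max_decimal_digits` parameter are hypothetical.

```python
from decimal import Decimal


def decimal_digits(value):
    """Count digits after the decimal point in the shortest string
    that round-trips the float (what repr() produces)."""
    exponent = Decimal(repr(value)).as_tuple().exponent
    return max(0, -exponent)


def classify(value, max_decimal_digits=8):
    """Label a numeric value as "integer", "decimal", or "float"."""
    if isinstance(value, int) or float(value).is_integer():
        return "integer"
    if decimal_digits(value) > max_decimal_digits:
        return "float"
    return "decimal"


print(classify(4))            # integer
print(classify(4.5))          # decimal
print(classify(0.12345678))   # decimal (exactly 8 decimal digits)
print(classify(0.123456789))  # float (9 decimal digits)
print(classify(1 / 3))        # float
```

A column-level label could then be derived by taking the "widest" label found across all values in the column.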
Refer to examples:
https://github.com/cdisc-org/DataExchange-DatasetJson/blob/master/examples/adam/adadas.json
https://github.com/cdisc-org/DataExchange-DatasetJson/blob/master/examples/adam/adnpix.json