v0.2.0 revisions #50

Merged
merged 72 commits into from
Jan 24, 2024

mbkranz
Collaborator

@mbkranz mbkranz commented Dec 4, 2023

This deployment contains the revisions marked for v0.2.0 in the HEAL issues here

@mbkranz
Collaborator Author

mbkranz commented Dec 8, 2023

See below for open questions and further details on some of the decisions in this version (and future TODOs):

  1. For discussion of flattened names in the csv, see Add instrument (e.g., CRF form) to VLMD schema #39. Flattening provides the flexibility to add more items (e.g., standardsMappings[1].item.url).

  2. It also allows a simplified csv-to-json conversion heuristic: after flattening as above, each cell is either a single value (i.e., a string), a mapped object written with a "=" separator, or a list (array) written with a "|" separator. Conversely, the csv variable name contains all the instructions needed to convert to a json file.

  3. Add examples with standardsMappings for csv -- but see issue Add instrument (e.g., CRF form) to VLMD schema  #39

  4. Add documentation on the implementation of the csv-to-json conversion (i.e., the above). While I migrated to custom jinja templating, this may be best housed in the implementations (i.e., best-practices and healdata-utils). Here, we could use a more conventional (but perhaps a little less readable) lib.

  5. Better json/csv markdown documentation rendering -- may want to make the fields properties a separate file and reference in a root data dictionary file.

  6. Given that the variable name follows a pattern (i.e., fieldvarname[\d+].varname) for flattened fields (which json schema supports), I may need to add this pattern to the schema. Additionally, I would suggest we remove the frictionless schema and rely on the json schemas. This will reduce confusion and remove a few other implementation pain points not mentioned here.

    • For example, if an investigator wanted to add 2 standardsMappings items, they should add fieldvarname[0].varname in one column and fieldvarname[1].varname in another column.
    • NOTE: the current method is to make a csv template with [0] in the csv schema, which provides the necessary functionality when there is only one standardsMapping (i.e., the HEAL CDE repo).
  7. Additional note: some of the variables may depend on other variables. For example, if a standardsMapping[0].item.id is present, it really needs a standardsMapping[0].item.source to be understandable. I believe a more recent draft of json schema allows dependencies to be specified.

  8. Given the simplified/predictable patterns for flattened variable names and values, do we want to exclude the csv schema from the standards repo, describe the 2 csv transform patterns mentioned above in the json schema, and remove the csv altogether?

  9. In the previous version, we added the version of the schema by specifying {"version":"0.0.1"}. However, this may be confused with the instance version if the document and schema get bundled together. It also doesn't allow one to specify the schema version within the instance. I propose we add an enum property called schemaVersion with one value (the version number). More concretely:

{"schemaVersion":{"enum":["v0.2.0"]}}

Alternatively, we could rely on the $schema url resolving to a versioned vlmd schema specification within an instance.
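
To make the csv-to-json heuristic from items 1, 2, and 6 above more concrete, here is a minimal Python sketch. It is not the healdata-utils implementation. The function names (`parse_path`, `parse_value`, `set_path`, `unflatten_row`) and the example row are hypothetical. It also assumes that a cell containing "=" holds "|"-separated key=value pairs, and that a cell containing only "|" is a list:

```python
import re

# Splits "standardsMappings[0].item.url" into
# ["standardsMappings", 0, "item", "url"].
_TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_path(column_name):
    """Turn a flattened csv column name into a list of keys and indices."""
    return [int(index) if index else name
            for name, index in _TOKEN.findall(column_name)]


def parse_value(raw):
    """Assumed heuristic: '=' marks an object, '|' marks a list."""
    if "=" in raw:
        result = {}
        for item in raw.split("|"):
            key, _, val = item.partition("=")
            result[key.strip()] = val.strip()
        return result
    if "|" in raw:
        return [item.strip() for item in raw.split("|")]
    return raw


def _child(node, key, default):
    """Return node[key], creating it (and padding lists) when missing."""
    if isinstance(key, int):
        while len(node) <= key:
            node.append(None)
        if node[key] is None:
            node[key] = default
        return node[key]
    return node.setdefault(key, default)


def set_path(target, path, value):
    """Write value into the nested dict/list structure described by path."""
    node = target
    for key, next_key in zip(path, path[1:]):
        # The next token decides whether the container is a list or a dict.
        node = _child(node, key, [] if isinstance(next_key, int) else {})
    last = path[-1]
    if isinstance(last, int):
        while len(node) <= last:
            node.append(None)
    node[last] = value


def unflatten_row(row):
    """Convert one csv row (column name -> cell text) into a nested record."""
    record = {}
    for column, raw in row.items():
        if raw is None or raw == "":
            continue  # empty cells are simply omitted from the json
        set_path(record, parse_path(column), parse_value(raw))
    return record


# Hypothetical row, for illustration only.
row = {
    "name": "pain_level",
    "enum": "1|2|3",
    "enumLabels": "1=Low|2=Medium|3=High",
    "standardsMappings[0].item.url": "https://example.org/cde/a",
    "standardsMappings[1].item.id": "B-2",
}
record = unflatten_row(row)
```

One caveat with this value sniffing: a plain string that happens to contain "=" or "|" (for example a url with a query string) would be misread, so a real implementation would probably want to consult the json schema for each field's type rather than inspect the cell text.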
