Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch remaining R5 Turtle RDF generation to TurtleParser implementation #148

Merged
merged 13 commits into from
Dec 17, 2024

Conversation

tmprd
Copy link
Contributor

@tmprd tmprd commented Oct 4, 2023

Fixes these issues and also missing type annotations on profiles (ex. https://www.hl7.org/fhir/actordefinition.profile.ttl.html)

This simply switches some of the example generators to the R5+ TurtleParser instead of the RdfParser used before R5.

@dbooth-boston
Copy link

@tmprd could you please update this to the latest base branch, and then let's see if we can get @grahamegrieve 's attention to merge it into main.

@markiantorno
Copy link
Collaborator

If you update to the latest so we can merge without conflicts, I will assist in getting this through the pipeline. @balhoff

@dotasek fyi

@dotasek dotasek self-requested a review February 20, 2024 16:37
Copy link
Collaborator

@dotasek dotasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is no longer producing this file: baseFileName + ".ttl"

It does look like the parser has been switched, but was it intended to still output that file?

@tmprd
Copy link
Contributor Author

tmprd commented Feb 25, 2024

It looks like this is no longer producing this file: baseFileName + ".ttl"

It does look like the parser has been switched, but was it intended to still output that file?

Thanks for your review @dotasek Expanding the changes shows where this file is produced now: https://github.com/HL7/kindling/pull/148/files#diff-0b45ecac1a10eb5075600a7cb0da87b7531c1c2795f7d2519db5a50e1d440a42R4495-R4506

I double checked the turtle files that were generated from this, and it only outputs files for R5. Is that expected? That is what I intended. It's not clear to me how different versions of FHIR are published as part of this process, and whether its possible to update older versions from this code.

@tmprd tmprd requested a review from dotasek February 25, 2024 17:30
@dbooth-boston
Copy link

I think @ericprud figured out how versions are handled. Eric, can you answer this?

@ericprud
Copy link

ericprud commented Mar 9, 2024

I think @ericprud figured out how versions are handled. Eric, can you answer this?

If the Q is how do I handle version selection in APIs, I've been making an API call distinction between RDVch (coincident with DSTU3 and R4) and rdvCh (coincident with R5). My switches have been separate but I think the only one that it might make sense to keep separate is hoist (as an override to the 3,4-style or 5-style).

@dbooth-boston
Copy link

@dotasek , unless the R5 build process is also expected to produce R4 examples, then I believe from @tmprd 's response (above) that the R5 turtle examples you mentioned are being generated.

@tmprd
Copy link
Contributor Author

tmprd commented Apr 15, 2024

Will Publisher.java be used to republish R4 or below? This code suggests Publisher.java is used only for R4B, R5, and R6 (so far).

private String pidRoot() {
if (VersionUtilities.isR4BVer(page.getVersion().toCode())) {
return "hl7.fhir.r4b";
} else if (VersionUtilities.isR5Ver(page.getVersion().toCode())) {
return "hl7.fhir.r5";
} else {
return "hl7.fhir.r6";
}
}

If Publisher.java is to be used for R4 or below, then I can add a condition to ensure that this RDF generator is for R5 and above. If not, you can merge this. Currently, this RDF generator is used for any version other than R4B, which is logic used by serializeResource():
void serializeResource(Resource r, String baseFileName, String description, String pageType, String crumbTitle, WorkGroup wg, boolean showCanonical, boolean showTtl) throws Exception {
if (VersionUtilities.isR4BVer(page.getVersion().toCode())) {
So we are currently doing no worse than anything that uses serializeResource() or resource2Json(), but I just thought of this as an extra precaution.

@dotasek
Copy link
Collaborator

dotasek commented Apr 17, 2024

@tmprd kindling is normally only used to generate the version of FHIR currently in development (at this juncture, R6).

There was a special case of this with R5 and R4B being developed in parallel, but this was an exception.

@grahamegrieve is this correct, or am I missing some details regarding publishing existing versions of the spec?

@dotasek
Copy link
Collaborator

dotasek commented Apr 18, 2024

@tmprd

This does use the generator, but the code you're pointing to doesn't create the necessary file. To replicate this, run kindling to publish FHIR. Look in the publish folder for this document:

publish/definition.ttl.html

Open the document in a web browser. There is a link that says Raw Turtle. It is meant to link to definition.ttl, which is missing. This doesn't happen with the master branch; definition.ttl is generated.

I put a breakpoint in the master branch of kindling to verify, and this happens for 1460 files. All that needs to be done is for that returned string to be written to the appropriate file. The original code has a FileOutputStream already created in that place. You would just have to use PrintWriter or something to actually write out to it.

Copy link
Collaborator

@dotasek dotasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure that output of resource2Ttl is output to appropriate file.

I would also rename that method to convertResourceToTtl.

@tmprd
Copy link
Contributor Author

tmprd commented Apr 19, 2024

@tmprd

This does use the generator, but the code you're pointing to doesn't create the necessary file. To replicate this, run kindling to publish FHIR. Look in the publish folder for this document:

publish/definition.ttl.html

Open the document in a web browser. There is a link that says Raw Turtle. It is meant to link to definition.ttl, which is missing. This doesn't happen with the master branch; definition.ttl is generated.

I put a breakpoint in the master branch of kindling to verify, and this happens for 1460 files. All that needs to be done is for that returned string to be written to the appropriate file. The original code has a FileOutputStream already created in that place. You would just have to use PrintWriter or something to actually write out to it.

Thank you! Actually I noticed it has to be written using compose() method, so I've made changes to fix it using that. This matches what is used elsewhere for resource examples:

ParserBase tp = Manager.makeParser(page.getWorkerContext(), FhirFormat.TURTLE);
tp.compose(exe, new FileOutputStream(Utilities.path(page.getFolders().dstDir, prefix + n + ".ttl")), OutputStyle.PRETTY, null);

This uses org.hl7.fhir.r5.elementmodel.TurtleParser instead of org.hl7.fhir.r4.formats.RdfParser.java.

@dotasek dotasek self-requested a review April 22, 2024 13:19
Copy link
Collaborator

@dotasek dotasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes now appear to generate the expected files.

Before signing off on this, I'd like someone to review the differences between the previous ttl generation results, and the ones from this PR. I ran the publish process on both, and packaged them in the attached zip.

ttl-publish-diff.zip

@dbooth-boston
Copy link

Per the above request, I'm looking at the diffs between the old and new, to verify that they all look correct. To help, I've uploaded a diff of only the old and new fhir.ttl ontology file here:
ont-diffs.txt

I will post again here after I've finished my comparison.

@ericprud
Copy link

ericprud commented Sep 25, 2024

Fixes:

  1. missing fhir:text/fhir:div
  2. missing fhir:nodeRole [fhir:treeRoot]
  3. missing ^^xsd:anyURI on:
    a. */fhir:url/fhir:v
    b. fhir:system
    c. fhir:constraint/fhir:source
  4. missing fhir:link on fhir:baseDefinition, fhir:source
  5. remove duplicates of list elements in e.g. fhir:contact, fhir:status, fhir:date, fhir:jurisdiction, fhir:publisher...
  6. missing ^^xsd:dateTime on fhir:date
  7. missing ^^xsd:boolean on fhir:experimental, fhir:abstract, fhir:isModifier, fhir:isSummary
  8. missing ^^xsd:nonNegativeInteger on */fhir:element/fhir:min (/'max')

Issues:

  • aesthetic:
    1. adds spurious comment character after top-level predicate-object pairs (i.e. those with the resource as the subject).
  • semantic:
    1. adds xsd:anyURI datatype to some fhir:codes/fhir:v objects, but not xsd:discriminator
    2. adds xsd:anyURI datatype to fhir:extension/fhir:value/fhir:id objects,e.g. fhir:type
    3. adds spurious fhir:binding [ ] ; to codeableconcept.profile.ttl; no analog in JSON
    4. still missing ^^xsd:nonNegativeInteger on */fhir:element/fhir:max (when it's not '*' (issue?))

@tmprd
Copy link
Contributor Author

tmprd commented Sep 25, 2024

Fixes:

  1. missing fhir:text/fhir:div
  2. missing fhir:nodeRole [fhir:treeRoot]
  3. missing ^^xsd:anyURI on:
    a. */fhir:url/fhir:v
    b. fhir:system
    c. fhir:constraint/fhir:source
  4. missing fhir:link on fhir:baseDefinition, fhir:source
  5. remove duplicates of list elements in e.g. fhir:contact, fhir:status, fhir:date, fhir:jurisdiction, fhir:publisher...
  6. missing ^^xsd:dateTime on fhir:date
  7. missing ^^xsd:boolean on fhir:experimental, fhir:abstract, fhir:isModifier, fhir:isSummary
  8. missing ^^xsd:nonNegativeInteger on */fhir:element/fhir:min (/'max')

Issues:

  • aesthetic:

    1. adds spurious comment character after top-level predicate-object pairs (i.e. those with the resource as the subject).
  • semantic:

    1. addes xsd:anyURI datatype to some fhir:codes/fhir:v objects, but not xsd:discriminator
    2. addes xsd:anyURI datatype to fhir:extension/fhir:value/fhir:id objects,e.g. fhir:type
    3. still missing ^^xsd:nonNegativeInteger on */fhir:element/fhir:max (when it's not '*' (issue?))

Thanks for checking this Eric!
Just to clarify for everyone, this pull request simply switches the Turtle parser and serializer in Kindling from the old implementation to the new TurtleParser within the org.hl7.fhir.core project.
I believe all remaining issues need to be fixed there.

For example, the comment characters seem to be added in the org.hl7.fhir.utilities.turtle.Turtle.commitSection() method. I added an issue for this: w3c/hcls-fhir-rdf#150

@dbooth-boston
Copy link

@tmprd , thank you so much for your work on this, and sorry it took so long to get the results better vetted. (Thanks @ericprud for that!) I know I said I would give up on further vetting them if I had not gotten to it earlier, but I just couldn't bring myself to do that. I kept thinking that I'd find a way. And thankfully, @tmprd came to the rescue with a better vetting approach. Thanks all!

@dbooth-boston
Copy link

@dotasek I believe we have adequately checked the diffs and verified that this PR is correct to merge. Can you go ahead and merge it, or is there anything else still needed? Github says that it still has "Changes requested", but I am assuming that it is referring to your request for us to check the diffs, which is now done.

@dotasek dotasek self-requested a review December 17, 2024 01:19
Copy link
Collaborator

@dotasek dotasek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a run of this, I can see that no ttl files are missing or added. As the diff was approved, this looks good to me.

@dotasek dotasek merged commit 6e7649d into HL7:main Dec 17, 2024
5 checks passed
@dbooth-boston
Copy link

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

missing datatype for FHIR primitive types missing Narrative.div duplication of some root RDF properties
5 participants