ROBOT query --update strips prefixes when output is not .owl (RDF/XML) #1172

Open
allenbaron opened this issue Dec 1, 2023 · 3 comments

@allenbaron
Contributor

When robot query performs an update operation, it drops prefixes if the output is .ofn or .omn, but not if it is .owl (these are the only formats I tested). This seems to be the same issue as #1101, except that it affects update queries and wasn't fixed by PR #1106 (it still happens in 1.9.5). The doid-edit.owl input file is in .ofn format. It happens with every SPARQL update query I've tried, including one with empty DELETE and INSERT templates (see bottom).

Prefixes dropped

.ofn output loses prefixes:

robot query -i doid-edit.owl --update fix_whitespace.rq -o tmp.ofn \
    && mv tmp.ofn doid-edit.owl

Chaining convert doesn't help:

robot \
    query -i doid-edit.owl --update fix_whitespace.rq \
    convert -o tmp.ofn \
    && mv tmp.ofn doid-edit.owl

Separate convert doesn't help (for .ofn or .omn):

robot query -i doid-edit.owl --update fix_whitespace.rq -o tmp.omn \
    && robot convert -i tmp.omn -o doid-edit.owl --format ofn \
    && rm tmp.omn

Result: [image]

Prefixes Preserved

.owl output preserves prefixes:

robot query -i doid-edit.owl --update fix_whitespace.rq -o tmp.owl \
    && robot convert -i tmp.owl -o doid-edit.owl --format ofn \
    && rm tmp.owl

Using --add-prefixes also works (my current workaround):

robot --add-prefixes prefixes.json \
    query -i doid-edit.owl --update fix_whitespace.rq -o tmp.ofn \
    && mv tmp.ofn doid-edit.owl
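
To check whether a given run lost its prefixes, one option is to count the Prefix(...) declarations in the Functional Syntax file before and after the update. The following is a minimal sketch and not part of ROBOT: count_prefixes and prefixes_dropped are hypothetical helper names, and the sketch assumes each declaration starts at the beginning of its own line in the .ofn file.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ROBOT) for spotting dropped prefixes
# in OWL Functional Syntax (.ofn) files.

# Print the number of lines that begin with "Prefix(".
count_prefixes() {
    awk '/^Prefix\(/ { n++ } END { print n + 0 }' "$1"
}

# Exit 0 if the second file declares fewer prefixes than the first.
prefixes_dropped() {
    before=$(count_prefixes "$1")
    after=$(count_prefixes "$2")
    [ "$after" -lt "$before" ]
}
```

With these defined, `prefixes_dropped doid-edit.owl tmp.ofn && echo "prefixes were dropped"` could be run before the mv step to flag an affected output.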

SPARQL queries

fix_whitespace.rq:

# remove extra whitespace from ALL strings (e.g. in defs, xrefs, labels, etc.)
#  -> removes 2+ spaces, spaces before commas or periods, and spaces at beginning or end of string
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

DELETE { ?s ?p ?o . }
INSERT { ?s ?p ?new_o . }
WHERE {
    ?s ?p ?o .
    FILTER( datatype(?o) = xsd:string )
    BIND( 
        REPLACE(
            REPLACE(?o, " (,) *| +", "$1 "),
            " (\\.)| +$|^ +", "$1"
        ) AS ?new_o
    )
}

Empty SPARQL update query:

DELETE {  }
INSERT {  }
WHERE {
    ?s a owl:Class .
}

prefixes.json file:

{
  "@context": {
    "obo": "http://purl.obolibrary.org/obo/",
    "oboInOwl": "http://www.geneontology.org/formats/oboInOwl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "xml": "http://www.w3.org/XML/1998/namespace",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "terms": "http://purl.org/dc/terms/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "doid": "http://purl.obolibrary.org/obo/doid#"
  }
}
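
A prefixes.json like the one above could also be generated from the input file instead of written by hand. This is a rough sketch under stated assumptions, not a ROBOT feature: ofn_prefixes_to_json is a hypothetical helper, it assumes the input .ofn file still has its Prefix(...) declarations one per line, and it skips the default (unnamed) prefix.

```shell
#!/bin/sh
# Hypothetical helper (not part of ROBOT): build a JSON-LD context from the
# Prefix(...) declarations of an OWL Functional Syntax file.
ofn_prefixes_to_json() {
    printf '{\n  "@context": {\n'
    # Turn "Prefix(name:=<iri>)" into a JSON key/value pair.
    # The default prefix "Prefix(:=<iri>)" has no name and does not match.
    sed -n 's/^Prefix(\([^:][^:]*\):=<\(.*\)>)$/    "\1": "\2"/p' "$1" |
        sed '$!s/$/,/'   # add a comma after every entry except the last
    printf '  }\n}\n'
}
```

Running `ofn_prefixes_to_json doid-edit.owl > prefixes.json` before the update would then give a file to pass to --add-prefixes.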
@jamesaoverton
Member

Thanks for pointing to #1106, which uses isPrefixOWLOntologyFormat() to check whether a format should use prefixes. That should be correct. In this case robot query is converting the input ontology to Turtle, loading it into Jena, running SPARQL, converting back to Turtle, and reading it back into OWLAPI. I guess that the format of the input ontology is being lost along the way. If I'm right, then the prefixes won't be preserved for RDFXML format either, but we might be setting decent prefixes in that case.

Do you (or anyone reading this) have time to dig into this issue? I have some big deadlines coming up.

@allenbaron
Contributor Author

I'd love to help more but I don't have sufficient expertise with Java (or sufficient familiarity with the internal workings of ROBOT/OWLAPI) to delve into this. My apologies.

@matentzn
Contributor

It seems @souzadevinicius is interested in looking at this, but it's possibly quite a complex issue. We will see.
