Issue 19 analysis note
Context: in COOS, GSBPM is modelled as a SKOS taxonomy whose concepts are instances of coos:ActivityCategory, itself a subClassOf skos:Concept. GSIM, being a data model, is captured as a set of OWL classes and properties. The question is how to express that a given GSBPM subProcess can (but need not) take instances of given GSIM classes as input. The difficulty is to align SKOS-world entities with OWL-world entities.
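Variant 1: each GSBPM subProcess is declared as a defined OWL class that carries a formal restriction on prov:used.
# declaration of a defined class under coos:SubProcess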
<http://id.unece.org/def/coos/SelectSample_SubProcess> rdf:type owl:Class ;
rdfs:subClassOf coos:SubProcess ;
# formal equivalence with the GSBPM ActivityCategory value
owl:equivalentClass [ rdf:type owl:Restriction ;
owl:onProperty dcterms:type ;
owl:hasValue <http://id.unece.org/activities/subProcess/4.1>
] ;
# Expresses that every member of this class may only have instances of GSIM class A or B as values of its prov:used property
rdfs:subClassOf [
rdf:type owl:Restriction ;
owl:onProperty prov:used ;
owl:allValuesFrom [
rdf:type owl:Class ;
owl:unionOf ( coos:A
coos:B
)
];
]
.
Advantages:
- Formal OWL sets are defined
- The restriction can be interpreted by an OWL reasoner and is as formal as it can be
Disadvantages:
- This requires writing one new defined OWL class for each GSBPM type we want to describe. Basically, we are duplicating the GSBPM taxonomy as OWL classes. Note: if GSBPM had been modelled as OWL classes (subclasses of coos:ActivityCategory), we would not have this problem
- The expression of the constraint is "hard" (yes/no): instances of such subProcesses with values outside of A or B would be incorrect or yield reasoning errors. In practice, however, subProcesses may well take something other than GSIM entities as input.
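To make the "hard" nature of this constraint concrete, here is a small data sketch. The ex: namespace, ex:selectSample2024, ex:Memo and the owl:disjointWith axiom are assumptions added for illustration and are not part of COOS. Under OWL's open-world semantics a reasoner would only report an inconsistency when the value is known not to be an A or a B, which is why the disjointness axiom is included; without it, the reasoner would simply infer that ex:memo is an instance of A or B.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix coos: <http://id.unece.org/def/coos#> .
# Hypothetical namespace, for illustration only
@prefix ex: <http://example.org/> .

# Classified as a SelectSample_SubProcess through the owl:hasValue equivalence
ex:selectSample2024 dcterms:type <http://id.unece.org/activities/subProcess/4.1> ;
    prov:used ex:memo .

# The input is something other than an A or a B
ex:memo a ex:Memo .

# Assumed axiom: without it, no inconsistency would be detected
ex:Memo owl:disjointWith coos:A , coos:B .
```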
Variant 2: in this variant the formal restriction on prov:used is replaced by a "loose" annotation with no formal semantics.
# declaration of a defined class under coos:SubProcess
<http://id.unece.org/def/coos/SelectSample_SubProcess> rdf:type owl:Class ;
rdfs:subClassOf coos:SubProcess ;
# formal equivalence with the GSBPM ActivityCategory value
owl:equivalentClass [ rdf:type owl:Restriction ;
owl:onProperty dcterms:type ;
owl:hasValue <http://id.unece.org/activities/subProcess/4.1>
] ;
# Annotation indicating that input "includes" (but is not restricted to) those 2 classes
coos:inputIncludes coos:A, coos:B ;
.
The semantics of coos:inputIncludes is inspired by schema:domainIncludes / schema:rangeIncludes (and also by dcam:domainIncludes, see https://www.dublincore.org/specifications/dublin-core/dcmi-terms/#http://purl.org/dc/dcam/domainIncludes).
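One possible way to declare such a property for this variant is as an OWL annotation property, which matches the intent of carrying no formal semantics. This is only a sketch: the declaration, label and comment below are assumptions, not text taken from COOS.

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix coos: <http://id.unece.org/def/coos#> .

# Assumed declaration: an annotation property is ignored by OWL reasoners
coos:inputIncludes a owl:AnnotationProperty ;
    rdfs:label "input includes"@en ;
    rdfs:comment "Indicates a class whose instances are possible, but not the only possible, inputs of the subject."@en .
```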
Advantages:
- Formal OWL sets are defined
- The restriction is easy to assert
- The restriction is "loose" (instances of other classes are allowed)
Disadvantages:
- This cannot be interpreted by an OWL reasoner
- This requires writing one new defined OWL class for each GSBPM type we want to describe. Basically, we are duplicating the GSBPM taxonomy as OWL classes. Note: if GSBPM had been modelled as OWL classes (subclasses of coos:ActivityCategory), we would not have this problem
Variant 3: in this variant the link between GSBPM and GSIM is expressed at the SKOS level, directly on the GSBPM concepts.
# remember coos:ActivityCategory is subClassOf skos:Concept
<http://id.unece.org/activities/subProcess/4.1> a coos:ActivityCategory ;
coos:inputIncludes coos:A, coos:B ;
# coos:outputIncludes ...
.
Advantages:
- The restriction is easy to assert
- The coos:inputIncludes property would be a subPropertyOf skos:mappingRelation. This would implicitly make the GSIM classes instances of skos:Concept, as per the range definition of skos:semanticRelation, which is absolutely not a problem
- The restriction is "loose" (instances of other classes are allowed)
Disadvantages:
- This cannot be interpreted by an OWL reasoner
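The sub-property idea mentioned in the advantages can be sketched as follows. The declaration of coos:inputIncludes is an assumption; the SKOS axioms quoted in the comments come from the SKOS vocabulary itself, and the entailments listed at the end are what an RDFS reasoner would be expected to derive.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix coos: <http://id.unece.org/def/coos#> .

# Assumed declaration for this variant
coos:inputIncludes a rdf:Property ;
    rdfs:subPropertyOf skos:mappingRelation .

# Already stated by the SKOS vocabulary:
#   skos:mappingRelation rdfs:subPropertyOf skos:semanticRelation .
#   skos:semanticRelation rdfs:domain skos:Concept ; rdfs:range skos:Concept .

# Given this assertion on a GSBPM concept:
<http://id.unece.org/activities/subProcess/4.1> coos:inputIncludes coos:A .

# an RDFS reasoner would entail:
#   <http://id.unece.org/activities/subProcess/4.1> skos:mappingRelation coos:A .
#   coos:A a skos:Concept .
```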
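Variant 4: the constraint is expressed as a SHACL shape with a SPARQL-based target, kept outside of the ontology.
@prefix sh: <http://www.w3.org/ns/shacl#>.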
@prefix prov: <http://www.w3.org/ns/prov#>.
@prefix coos: <http://id.unece.org/def/coos#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix myshape: <http://id.unece.org/def/coos_shapes#>.
myshape:TaskInsideSubProcess_4_1 a sh:NodeShape ;
sh:target [
# SPARQL-based target (SHACL Advanced Features)
a sh:SPARQLTarget ;
sh:select """
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX coos: <http://id.unece.org/def/coos#>
SELECT ?this
WHERE {
# Any task inside a subprocess of type 4.1
?this a coos:Task .
?this (dcterms:isPartOf|^dcterms:hasPart)* ?sp_41 .
?sp_41 dcterms:type <http://id.unece.org/activities/subProcess/4.1>
}
"""
] ;
sh:property [
# SHOULD have instances of A or B as value for prov:used
sh:path prov:used ;
sh:or (
[sh:class coos:A]
[sh:class coos:B]
) ;
# This is just a warning
sh:severity sh:Warning ;
]
.
Advantages:
- This is decoupled from the actual OWL ontology and placed in another separate file
- Constraint is expressed as a Warning, not a strong Violation
- This is SHACL-interpretable and can be given to a SHACL validator (e.g. https://shacl-play.sparna.fr/play/)
Disadvantages:
- Requires writing and maintaining one rule per subProcess type
- Link between GSBPM and GSIM is hidden in the SHACL file and is not part of the GSBPM taxonomy
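A small data sketch for trying out this variant, assuming a validator that supports SPARQL-based targets from the SHACL Advanced Features. The ex: namespace and all ex: resources are hypothetical, and coos:A is the same placeholder class as in the shape. With such data, ex:task2 would be expected to be reported with a warning, while ex:task1 would not.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix coos: <http://id.unece.org/def/coos#> .
# Hypothetical namespace, for illustration only
@prefix ex: <http://example.org/> .

# A subProcess typed with the GSBPM concept 4.1
ex:sp a coos:SubProcess ;
    dcterms:type <http://id.unece.org/activities/subProcess/4.1> .

# Task inside the subProcess, using an instance of A: no warning expected
ex:task1 a coos:Task ;
    dcterms:isPartOf ex:sp ;
    prov:used ex:frame .
ex:frame a coos:A .

# Task inside the subProcess, using something else: warning expected
ex:task2 a coos:Task ;
    dcterms:isPartOf ex:sp ;
    prov:used ex:memo .
ex:memo a ex:Memo .
```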
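Variant 5: this is a combination of variants 3 and 4. The link is asserted on the GSBPM concepts, as in variant 3, and checked by a single generic SHACL constraint, in the spirit of variant 4.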
In the OWL file:
# remember coos:ActivityCategory is subClassOf skos:Concept
<http://id.unece.org/activities/subProcess/4.1> a coos:ActivityCategory ;
coos:inputIncludes coos:A, coos:B ;
# coos:outputIncludes ...
.
And in the SHACL file:
@prefix sh: <http://www.w3.org/ns/shacl#>.
@prefix prov: <http://www.w3.org/ns/prov#>.
@prefix coos: <http://id.unece.org/def/coos#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix coosshapes: <http://id.unece.org/def/coos_shapes#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
# 1. Declare a constraint component
coosshapes:CheckCoosInputIncludesComponent
a sh:ConstraintComponent ;
# declare the parameter to use to trigger that constraint
sh:parameter [
sh:path coosshapes:isIncludedInput ;
sh:name "is included input" ;
sh:description "Set to true to verify that the value has a class indicated by the coos:inputIncludes annotation on the dcterms:type of this instance" ;
sh:datatype xsd:boolean
] ;
sh:labelTemplate "The input does not have an expected type" ;
# link to validator
sh:propertyValidator coosshapes:CheckCoosInputIncludesValidator .
# 2. The corresponding validator
coosshapes:CheckCoosInputIncludesValidator
a sh:SPARQLSelectValidator ;
sh:message "{$PATH} is not of one of the expected inputs" ;
# The SPARQL query
sh:select """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX coos: <http://id.unece.org/def/coos#>
SELECT ?this ?value WHERE {
# "?this" will be bound to the node being checked, in our case most often the task or subProcess
# "?PATH" will be replaced by the value of sh:path, in our case most often prov:used
?this ?PATH ?value .
# Trigger a violation if none of the classes of the value (including super-classes)
# can be found as a value of coos:inputIncludes on the dcterms:type of the task/subprocess.
# The type lookup sits inside NOT EXISTS so that a value is accepted as soon as
# one of its classes matches; a value without any rdf:type is reported too.
FILTER NOT EXISTS {
?this dcterms:type ?gsbpmConcept .
?gsbpmConcept coos:inputIncludes ?valueClass .
?value rdf:type/rdfs:subClassOf* ?valueClass .
}
# This is just so that if the parameter is set to false in the shape,
# the validation will not be triggered
FILTER($isIncludedInput)
}""" .
# 3. Define our shape with this constraint component
coosshapes:SubProcess a sh:NodeShape ;
# shapes apply to all coos:SubProcess
sh:targetClass coos:SubProcess ;
sh:property [
# This rule will trigger a simple Warning, this is not a strong Violation
sh:severity sh:Warning ;
# on property prov:used (this will be used as the value of ?PATH variable)
sh:path prov:used ;
# checks that the rdf:type of values are correct according to our rule
coosshapes:isIncludedInput true ;
]
.
Advantages:
- This is a combination of variants 3 and 4
- Link between GSBPM and GSIM is really part of the GSBPM taxonomy
- Constraint is expressed as a Warning, not a strong Violation
- This is SHACL-interpretable and can be given to a SHACL validator (e.g. https://shacl-play.sparna.fr/play/)
Disadvantages:
- none :-)
This can be tested with the following test data, which will trigger one warning (copy and paste the SHACL and the data into the form at https://shacl-play.sparna.fr/play/validate):
@prefix prov: <http://www.w3.org/ns/prov#>.
@prefix coos: <http://id.unece.org/def/coos#>.
@prefix dcterms: <http://purl.org/dc/terms/>.
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <http://exemple.fr#>.
# Some example data
<http://id.unece.org/activities/subProcess/4.1> a skos:Concept ;
coos:inputIncludes coos:Dataset, coos:Product ;
.
# This one is correct
ex:myCorrectSubProcess a coos:SubProcess ;
dcterms:type <http://id.unece.org/activities/subProcess/4.1> ;
prov:used ex:myDataset ;
.
ex:myDataset a coos:Dataset .
# This one is incorrect
ex:myINCorrectSubProcess a coos:SubProcess ;
dcterms:type <http://id.unece.org/activities/subProcess/4.1> ;
prov:used ex:somethingElse ;
.
ex:somethingElse a ex:AnotherType .
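For orientation, the validation report for this test data would be expected to look roughly like the sketch below. This is an assumption about the shape of the output, not a captured result: validators differ in the details they include (messages, source shape, blank node labels). Note that, per the SHACL specification, sh:conforms is false as soon as any validation result is produced, even when its severity is only sh:Warning.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix coosshapes: <http://id.unece.org/def/coos_shapes#> .
@prefix ex: <http://exemple.fr#> .

[] a sh:ValidationReport ;
    # Any result, even a Warning, makes sh:conforms false
    sh:conforms false ;
    sh:result [
        a sh:ValidationResult ;
        sh:resultSeverity sh:Warning ;
        # The subProcess whose input has an unexpected type
        sh:focusNode ex:myINCorrectSubProcess ;
        sh:resultPath prov:used ;
        sh:value ex:somethingElse ;
        sh:sourceConstraintComponent coosshapes:CheckCoosInputIncludesComponent
    ] .
```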