Align LD vs plain-json Qualified relationships representations. #8

Open
Fak3 opened this issue Nov 28, 2019 · 3 comments

Comments

@Fak3

Fak3 commented Nov 28, 2019

In short, the currently documented way of representing qualified relationships, as a flattened Relationship+Entity object, creates challenges for Linked Data consumers.

Doc on the relationship class says:

API Serialisation Note
The properties of association classes are added as extra properties of the target class during serialisation. So, in the example above, the JSON schema would define an object TransportMovementParty as an array property of TransportMovement and the properties of the array object would be Role (from the relationship class) and Identification, Name (from Party class).

In essence, it says that the relationship attributes are flattened onto the attributes of the related object, so that both are mixed together in one json object.

When I started to look into it, I found a problem with mapping such plain json to the json-ld data model. In short, the @context magic of json-ld is not almighty, especially for flattened representations of mixed entities. In this case it does not allow building a correct set of triples that represents the relationship in the linked data way without either losing the (presumably important) relationship semantics (the qualifier) or moving properties to an inappropriate place. Below I touch on how qualified relationships can be represented in json-ld, and on the important differences between two cases: a simple qualifier with a limited set of values, and unbounded qualifiers.

Example

Using the currently documented approach, the plain json serialization looks as follows.
Let's consider the example model from this uml file. Here TransportMeans has a meansParty relationship to TransportParty with a relationship qualifier, role. So an example Vessel serialized as a json response could be:

Example 1
{
"identification": "http://maersk.com/vessels/01",
"meansParty": [
  {
    "identification": "http://maersk.com",
    "name": "Maersk",
    "role": "owner"
  }
 ]
}

As described in the uml file, the relationship qualifier role takes a value picked from a fixed enumeration known in advance: [owner, operator, etc...].
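The flattening rule quoted above can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the actual generator: the function name `flatten_relationship` is hypothetical, and the check for clashing keys is an assumption added to show where flattening can become ambiguous.

```python
# Hypothetical helper illustrating the documented flattening rule.
def flatten_relationship(relationship: dict, party: dict) -> dict:
    """Merge relationship qualifiers and party attributes into one JSON object."""
    # Assumption: a key present on both sides cannot be flattened unambiguously.
    overlap = relationship.keys() & party.keys()
    if overlap:
        raise ValueError(f"ambiguous keys after flattening: {sorted(overlap)}")
    return {**party, **relationship}


party = {"identification": "http://maersk.com", "name": "Maersk"}
relationship = {"role": "owner"}

# Rebuilds the shape of Example 1.
vessel = {
    "identification": "http://maersk.com/vessels/01",
    "meansParty": [flatten_relationship(relationship, party)],
}
```

After flattening, nothing in the resulting object says which keys came from the Party and which from the relationship; that lost distinction is what the rest of this issue is about.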

Representing qualified relationships in json-ld

Simple case

Single qualifier

I will first describe the most straightforward and obvious way to handle the case when there is a single relationship qualifier with a limited set of allowed values, known by the modeller upfront (as in the example model, where the qualifier is limited by an enumeration).
The vocabulary can be adequately modeled in rdf as just a property, optionally based on a more generic one, i.e.

# This vocabulary should be placed at http://edi3.org/vocab
# Define RDF property for specific, qualified relation of two entities.
ownedBy a rdf:Property.
# (optional) Define base property, for generic relationships.
meansParty a rdf:Property.
# (optional) State that the specific property is based on the generic one.
ownedBy rdfs:subPropertyOf meansParty.

(In reality we would probably also want to specify the properties' domain and range and build an adequate hierarchy of them.)
The plain json example given previously will then be properly represented as a single triple:

<http://maersk.com/vessels/01> <ownedBy> <http://maersk.com>.

To match the plain json example, let's also add one triple, assigning a name to the organization:

<http://maersk.com> <name> "Maersk".

and the corresponding json-ld will be:

Example 2
{
"@context": {"identification": "@id", "owner": "https://edi3.org/vocab/#ownedBy"},
"identification": "http://maersk.com/vessels/01",
"owner": [
  { 
    "identification": "http://maersk.com", 
    "name": "Maersk"
  }
 ]
}

Side note: In this json-ld I have introduced an inline context for clarity. In practice, once we publish our standards and LD vocabulary, we will also build a static json-ld context and place it at a well-known location on our site. We would recommend that users of the edi3 standard reuse that context, in the simplest case by adding just @context: https://edi3.org/contexts/v1. This will be enough to magically map plain json responses to LD triples for those clients who care about it.

In the json-ld example you can see how its structure slightly differs from the plain json one. In the plain json we use the generic relationship, meansParty, as a key (property) of the root object instead of the more specialized (qualified) one, owner, and additionally have the key-value pair role: owner (the relationship qualifier) flattened onto the object representing the Maersk org.
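To make the structural difference concrete, here is a minimal Python sketch that rewrites the Example 1 shape into the Example 2 shape (without the @context). The function name `qualifier_to_key` and the `ALLOWED_ROLES` set are assumptions for illustration; only owner and operator are named in the enumeration above, so the set is deliberately incomplete.

```python
import copy

# Assumption: the two roles named in the enumeration; the real list is longer.
ALLOWED_ROLES = {"owner", "operator"}


def qualifier_to_key(resource: dict, relation: str = "meansParty",
                     qualifier: str = "role") -> dict:
    """Move each related object under a key named after its qualifier value."""
    result = {k: copy.deepcopy(v) for k, v in resource.items() if k != relation}
    for related in resource.get(relation, []):
        related = dict(related)          # do not mutate the caller's data
        role = related.pop(qualifier)    # e.g. "owner"
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        result.setdefault(role, []).append(related)
    return result


plain = {
    "identification": "http://maersk.com/vessels/01",
    "meansParty": [
        {"identification": "http://maersk.com", "name": "Maersk", "role": "owner"}
    ],
}
converted = qualifier_to_key(plain)
```

The conversion needs to know which key is the qualifier and which values it may take. That is exactly the knowledge a json-ld @context cannot express.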

It would be nice if the json-ld spec had some @context magic for reconciling the two approaches, but as of the current version 1.1 it does not. One (undesirable) route we could take is to have only a base, generic property for the relation between those objects in the json-ld vocabulary, ignoring role: owner and losing that information, which I believe will be important for LD-capable consumers.

Multiple qualifiers limited by a fixed set of values

With multiple qualifiers, for example role (owner/operator) and shared (yes/no), we could in theory model the LD vocabulary the same way as in the simple case described above, but the number of qualified properties to introduce will be M*N, in our example 4 in total: exclusivelyOwnedBy, exclusivelyOperatedBy, sharedOwnedBy, sharedOperatedBy. When the number of qualifiers and their possible values is too high, introducing so many specialized properties may be undesirable, and it would be worth considering modelling the relationship as a first-class entity, similar to the case of unlimited relationship qualifiers described below.
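The M*N growth is easy to see by enumerating the combinations. The sketch below is illustrative only: the mapping from qualifier values to name fragments (for instance shared=no to "exclusively") is an assumption chosen to reproduce the four property names listed above.

```python
from itertools import product

# Assumed mapping from each qualifier value to a fragment of the property name.
qualifiers = {
    "shared": {"no": "exclusively", "yes": "shared"},
    "role": {"owner": "OwnedBy", "operator": "OperatedBy"},
}


def specialised_properties(qualifiers: dict) -> list:
    """Return one property name per combination of qualifier values."""
    fragments = [list(values.values()) for values in qualifiers.values()]
    return ["".join(parts) for parts in product(*fragments)]


names = specialised_properties(qualifiers)
print(names)
```

Each extra qualifier multiplies the length of this list by its number of allowed values, so the vocabulary grows quickly.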

Complex cases

  • The qualifier value is not limited to a fixed set of possible values, e.g. dateStarted (type: Date)
  • Too many qualifiers (probably rare case)

I don't actually know if we will encounter such a case in reality, so there is (a little) hope that we can save this headache for another day.

It is possible to model this as LD by introducing a first-class entity for the relationship, to which we can attach qualifiers.
For example, let's consider a relationship with two qualifiers: role (owner/operator/...) and dateStarted (type: Date). The vocabulary to model this would include a class for the Relationship, two properties that link the relationship entity to the Party and TransportMeans entities, and two properties for the qualifiers:

# This vocabulary should be placed at http://edi3.org/vocab
TransportMeansPartyRelationship a rdfs:Class .

party a rdf:Property ;
        rdfs:range Party .

transportMeans a rdf:Property ;
        rdfs:range TransportMeans .

partyRole a rdf:Property ;
        rdfs:range xsd:string .

dateStarted a rdf:Property ;
         rdfs:range xsd:date .

I also added rdfs:range restrictions to the properties to clarify their purpose. In reality we would probably want to restrict the range of partyRole to fixed values instead of plain strings, restrict the domain of some of the properties, and maybe have a base class for the relationship, etc.

An example of such a relationship between two entities is represented with five triples in total. I used a blank node identifier for the relationship, which is the standard way in Turtle to represent an entity that does not need a globally unique identifier. But if it were actually important to have a url for the relationship itself, then instead of _:b1 it could have a value like http://maersk.com/vessels/1/parties (for example, this could allow modifying the relationship by performing a request to the resource at this url):

# State that _:b1 is a relationship.
_:b1 a TransportMeansPartyRelationship ;
    # Add links for both entities to this relationship.
    party <http://maersk.com> ;
    transportMeans <http://maersk.com/vessels/01> ;
    # Add qualifiers to this relationship. 
    partyRole "owner" ;
    dateStarted "2019-01-23" .

To match the plain json example, let's also add one triple, assigning a name to the organization:

<http://maersk.com> <name> "Maersk".
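As a rough sketch of how a producer could derive these triples, the Python below builds the five relationship triples as plain tuples. The helper name `relationship_triples` is hypothetical, the namespace is the one used in the json-ld contexts in this issue, the vessel identifier is the one from Example 1, and representing triples as string tuples is a simplification (no datatypes or literal/IRI distinction).

```python
VOCAB = "https://edi3.org/vocab/#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def relationship_triples(means_id: str, party_id: str, qualifiers: dict,
                         node: str = "_:b1") -> list:
    """Describe one qualified relationship as (subject, predicate, object) tuples."""
    triples = [
        # State that the node is a relationship.
        (node, RDF_TYPE, VOCAB + "TransportMeansPartyRelationship"),
        # Link both entities to the relationship.
        (node, VOCAB + "party", party_id),
        (node, VOCAB + "transportMeans", means_id),
    ]
    # Attach every qualifier to the relationship node itself.
    triples += [(node, VOCAB + prop, value) for prop, value in qualifiers.items()]
    return triples


triples = relationship_triples(
    "http://maersk.com/vessels/01",
    "http://maersk.com",
    {"partyRole": "owner", "dateStarted": "2019-01-23"},
)
```

Adding a further qualifier only adds one more triple on the relationship node. That is the advantage over the M*N property approach.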

There are multiple ways to represent the same example in json-ld. Here I used the reverse-property feature of the json-ld context to keep the structure closer to the initial plain json example:

Example 3
{
"@context": {
  "identification": "@id", 
  "meansParty": {
     "@reverse": "https://edi3.org/vocab/#transportMeans", 
     "@type": "https://edi3.org/vocab/#TransportMeansPartyRelationship"
  },
  "role": "https://edi3.org/vocab/#partyRole"
},
"identification": "http://maersk.com/vessels/01",
"meansParty": [
  { 
    "party": {
       "identification": "http://maersk.com", 
       "name": "Maersk"
    },
    "role": "owner",
    "dateStarted": "2019-01-23"
  }
 ]
}

Note the structural difference between this example and the plain json one: here the name and identification of the party are nested under party. The magic of @context cannot reconcile this difference with our plain json (i.e. by flattening the party onto the relationship object) without changing the underlying triple structure. We may have to give something up if we strictly hold on to that plain json structure. Namely, in the LD representation we can allow properties of a Party to be declared on the TransportMeansPartyRelationship itself. In this example we can move the name property, replacing the triple <http://maersk.com> <name> "Maersk" . with _:b1 <name> "Maersk" . This (questionable) trick lets us leave the plain json unchanged as it is now (we only have to adjust the json-ld context a bit). I suspect that in the example given here we can get away with this almost painlessly, but I am not sure it will not hurt us in more complex examples. If we go this way, we should probably ask potential users of our standards how they feel about semantic shifts like this: how much it changes the meaning of the received data, and how much more complex processing becomes in each such case.

So if we were to insist on leaving the plain json for qualified relationships unchanged, we should accept stretches we introduce:

  1. In the LD model, even the simple single-qualified relationship will be represented as a first-class entity, with qualifiers and linked entities attached to it, instead of the more compelling single-triple way described in the simple case above.
  2. Allow some properties which belong to the Entity to be declared on the Relationship. This may surprise someone looking at the data.

Both points should be considered in terms of how much more complex we will make the life of LD-capable servers/clients in favor of plain-json ones.

Proposed solution: reconciling by changing the plain-json structure

Simple case

In the simple case already described above, when there is a single relationship qualifier with a limited set of allowed values known by the modeller upfront, we could change the uml serialization rules so that plain json matches the structure already given in Example 2. Namely, the value of the relationship qualifier becomes the relationship key on the root object, i.e. the json key on the root object for the relationship should be owner or operator (or any other allowed role from the enumeration) instead of meansParty.

Complex case

If we ever encounter it, we can resolve it by changing the uml serialization rules so that plain json matches the structure already given in Example 3, i.e. instead of flattening the properties onto the relationship, nest the properties that belong to the Entity under the entity key:

"party": {
       "identification": "http://maersk.com", 
       "name": "Maersk"
 },
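A minimal sketch of that rule change, as a Python function that converts one flattened relationship object into the nested form. The name `nest_party` is hypothetical, and `PARTY_KEYS` is an assumption based on the two Party attributes (identification, name) mentioned in the serialisation note; a real generator would take this list from the uml model.

```python
# Assumption: the attributes that belong to the Party class.
PARTY_KEYS = {"identification", "name"}


def nest_party(flat: dict, entity_key: str = "party") -> dict:
    """Split a flattened object into nested entity attributes plus qualifiers."""
    entity = {k: v for k, v in flat.items() if k in PARTY_KEYS}
    qualifiers = {k: v for k, v in flat.items() if k not in PARTY_KEYS}
    return {entity_key: entity, **qualifiers}


flat = {
    "identification": "http://maersk.com",
    "name": "Maersk",
    "role": "owner",
    "dateStarted": "2019-01-23",
}
nested = nest_party(flat)
```

The result has the same shape as the meansParty entry in Example 3, so the same document can be read as plain json or, with a context, as json-ld.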

At this point I am in favor of changing the plain json structure, as it does not seem to introduce much pain for either world. But let's discuss it and see if there are drawbacks.

@Fak3
Author

Fak3 commented Nov 28, 2019

cc @onthebreeze @nissimsan

@onthebreeze
Contributor

good work - but I'm going to need a bit of time to digest this.

@onthebreeze
Contributor

"So if we were to insist on leaving the plain json for qualified relationships unchanged, we should accept stretches we introduce.."

I would say that we do not insist on that. It's just an accident of the first-pass generation rules for plain JSON. We are doing this JSON-LD work so that we can discover these kinds of inconsistencies.

So, yes, I agree, let's change the plain JSON generation rules.


2 participants