Support sense-to-synset relations #957

Open
jmccrae opened this issue Jul 4, 2023 · 3 comments
Labels
release format: This issue refers to the WNDB or RDF export, so no changes will be made to this repository
Comments

@jmccrae
Member

jmccrae commented Jul 4, 2023

Sense-to-synset relations are useful and would help to solve issues such as #732.

This issue is to verify that we can support such relations and that all relevant tooling (validation, EWE, en-word.net) and release formats (esp. WNDB) work with such relations.

Note that there is no chance that WNDB will support this, as it is a legacy format. In this case, we will simply export these as sense relations referring to the first member of the target synset.
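
A minimal sketch of the WNDB fallback described above, assuming a hypothetical in-memory model in which each synset lists its member sense ids in order. The `Synset` class, the `downgrade_to_sense_relation` function, and all ids are illustrative assumptions, not part of any existing tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """Hypothetical synset record: an id plus its member sense ids, in order."""
    id: str
    members: list[str] = field(default_factory=list)


def downgrade_to_sense_relation(source_sense: str, rel_type: str,
                                target_synset: Synset) -> tuple[str, str, str]:
    """Rewrite a sense-to-synset relation as a sense-to-sense relation
    that points at the first member of the target synset."""
    if not target_synset.members:
        raise ValueError(f"synset {target_synset.id} has no members")
    return (source_sense, rel_type, target_synset.members[0])


# Illustrative ids only; they are not taken from the real data files.
target = Synset("example-synset", ["example-sense-a", "example-sense-b"])
print(downgrade_to_sense_relation("example-source-sense", "example_rel", target))
```

The choice of the first member is the only information lost in this export: the relation still exists in WNDB, but it no longer records that it applies to every member of the target synset.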

@jmccrae jmccrae added the release format label Jul 4, 2023
@jmccrae jmccrae added this to the 2023 Release milestone Jul 4, 2023
@rhdunn

rhdunn commented Jul 6, 2023

The source/target word numbers in a WNDB pointer are 1-based. Technically, this allows for nn00 for sense-to-synset and 00nn for synset-to-sense pointer relationships. The wndb docs mention 0000 being used for semantic relations (synset-to-synset). When discussing lexical relations (sense-to-sense), they state that word numbers start at 1.

The question is then how WNDB tools will handle these.
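
To make the field layout concrete, here is a small sketch that decodes a source/target field as two two-digit hexadecimal word numbers and names the resulting pointer kind. The function name `classify_pointer` is an assumption for illustration, and the `nn00` and `00nn` cases are the hypothetical encodings discussed above, not something the wndb docs define.

```python
def classify_pointer(source_target: str) -> tuple[str, int, int]:
    """Split a 4-hex-digit WNDB source/target field and name the pointer kind.

    The first two digits are the source word number and the last two the
    target word number; word numbers are 1-based, so 00 is read here as
    "the whole synset".
    """
    if len(source_target) != 4:
        raise ValueError("source/target must be exactly four hex digits")
    source = int(source_target[:2], 16)
    target = int(source_target[2:], 16)
    if source == 0 and target == 0:
        kind = "synset-to-synset"
    elif source and target:
        kind = "sense-to-sense"
    elif source:
        kind = "sense-to-synset"   # nn00: hypothetical, not described by the wndb docs
    else:
        kind = "synset-to-sense"   # 00nn: hypothetical, not described by the wndb docs
    return kind, source, target


for value in ("0000", "0101", "0300", "000a"):
    print(value, classify_pointer(value))
```

A reader that only distinguishes `0000` from everything else would treat the two mixed forms as lexical pointers with a word number of 0, which is where existing tools may misbehave.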

It is interesting to note that WordNet Search shows lexical relationships as sense-to-synset relationships, not as sense-to-sense relationships (e.g. http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=1&o3=&o4=&s=peripherally&i=4&h=11000#c). This is displayed incorrectly according to the format definition. I don't know if there are any examples from WordNet 3.1 where sense-to-synset is intended.

@jmccrae
Member Author

jmccrae commented Jul 7, 2023

Yes, that is interesting. We could try to use codes like 00nn for this; however, my fear is that it would break a large number of tools that do not expect such a code.

There are certainly relations from Princeton WordNet that could be modelled as sense-to-synset. For example, 'scallion' is linked to 'United States', but the relation could equally apply to the other members of that synset (USA, America, etc.).

@rob-ross

rob-ross commented Feb 15, 2025

Well, I think there already is a relationship between Sense and Synset: it is called "membership", and it's already modeled in the XML, e.g.:

<Sense id="oewn-position__1.15.03.." synset="oewn-08639776-n">

whereby a particular Sense of a lemma is mapped to a Synset. That's clearly a Sense->Synset relationship, right?

For this request I presume you are talking about adding the ability in SenseRelation entries such as:

<SenseRelation relType="derivation" target="oewn-position__2.38.00.."/>

to have "target" be a Synset instead of a Sense?

Has there been any discussion about how to implement this yet? You could try to encode the target entity type in the format of the id. The value of target is the id of another entity, currently only a Sense; in the database world we'd call that a foreign key to Sense. If a Sense id were formatted clearly differently from a Synset id, it would be obvious which entity target refers to. I instinctively don't like this. It smells bad to me.

I suppose backwards compatibility precludes us from changing the attribute name "target". I assume all existing code expects a "target" attribute in a SenseRelation and a SynsetRelation.

Otherwise we could remove "target" and introduce two new attributes, "senseFK/sense_fk" and "synsetFK/synset_fk". In converting existing data files you'd just need to replace all occurrences of the string "target" in a SenseRelation with "sense_fk" and in a SynsetRelation with "synset_fk", representing the status quo with regard to referencing another entity.

Although this would be easy to change in the data files, I assume it would be untenable because it would break all existing code.
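
For illustration, the data-file side of that rename could look roughly like the sketch below, which uses Python's standard `xml.etree.ElementTree`. The attribute names `sense_fk` and `synset_fk` are the hypothetical ones from this comment, the wrapper element and the pairing of ids with relation types are made up for the example, and a real conversion would also need to preserve the DOCTYPE and formatting of the files.

```python
import xml.etree.ElementTree as ET

# Hypothetical attribute names from the proposal above; not part of the current DTD.
RENAMES = {"SenseRelation": "sense_fk", "SynsetRelation": "synset_fk"}


def rename_targets(root: ET.Element) -> int:
    """Rename `target` on every SenseRelation/SynsetRelation; return the count."""
    changed = 0
    for tag, new_name in RENAMES.items():
        for rel in root.iter(tag):
            value = rel.attrib.pop("target", None)
            if value is not None:
                rel.set(new_name, value)
                changed += 1
    return changed


# Illustrative fragment; the wrapper element is only there to hold two relations.
sample = ET.fromstring(
    '<Wrapper>'
    '<SenseRelation relType="derivation" target="oewn-position__2.38.00.."/>'
    '<SynsetRelation relType="hypernym" target="oewn-08639776-n"/>'
    '</Wrapper>'
)
print(rename_targets(sample))
print(ET.tostring(sample, encoding="unicode"))
```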

So my next idea is to introduce a new attribute on SynsetRelation and SenseRelation to indicate to which entity the "target" value refers.

1.

<!ATTLIST SynsetRelation
    target IDREF #REQUIRED
    targetFK (synset|sense) #IMPLIED
    relType ...
        
OR

2.
      
<!ATTLIST SynsetRelation
    target IDREF #REQUIRED
    targetFK (synset|sense) "synset"
    relType ...

I'm not 100% sure you can have an enumerated value type and also make it optional. But if so, example 1 would be the ideal definition, as this change would at first affect only the DTD and no existing code or XML data files.

Example 2 would add a new attribute to SynsetRelation entities when parsing the XML files. Well-written code shouldn't break on this addition, and the new attribute could be ignored. The default value would also keep the status quo: a target refers to the same entity type as its parent entity.

Similarly, for SenseRelation we would have the following, with the targetFK default changed so that the target keeps pointing to the same entity type:


<!ATTLIST SenseRelation
    target IDREF #REQUIRED
    targetFK (synset|sense) #IMPLIED
    relType ...

OR

<!ATTLIST SenseRelation
    target IDREF #REQUIRED
    targetFK (synset|sense) "sense"
    relType ...
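
On the consuming side, a reader could resolve the entity type roughly as in the following sketch. It assumes the hypothetical `targetFK` attribute proposed above and applies the default itself, so it behaves the same for either DTD variant and for files that predate the attribute. The function name `target_kind` and the second sample element are illustrative only.

```python
import xml.etree.ElementTree as ET

# Default target kind when `targetFK` is absent: the same kind as the parent relation.
DEFAULT_KIND = {"SenseRelation": "sense", "SynsetRelation": "synset"}


def target_kind(relation: ET.Element) -> str:
    """Return "sense" or "synset" for a relation element.

    `targetFK` is the hypothetical attribute proposed above; when it is
    missing, fall back to the status quo for that element type.
    """
    kind = relation.get("targetFK", DEFAULT_KIND[relation.tag])
    if kind not in ("sense", "synset"):
        raise ValueError(f"unexpected targetFK value: {kind!r}")
    return kind


# An existing-style relation, and an illustrative one that targets a synset.
old_style = ET.fromstring(
    '<SenseRelation relType="derivation" target="oewn-position__2.38.00.."/>'
)
new_style = ET.fromstring(
    '<SenseRelation relType="derivation" targetFK="synset" target="oewn-08639776-n"/>'
)
print(target_kind(old_style), target_kind(new_style))
```

Code that ignores `targetFK` entirely would keep working on existing data, but it would misread any relation whose target is of the other entity type.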
