import of sense alignments using temporary tables sometimes fails #129

Open
judithek opened this issue May 7, 2015 · 0 comments

judithek commented May 7, 2015

For example, when using GermaNetWiktionaryDeAlignment and adapting it to query a database containing OntoWiktionary instead of Wiktionary (different Uby sense IDs, but the same original sense IDs).

Problems:

  1. All pairs in SenseAxis are wrong.
  2. For some strange reason, the import script also pairs senses that are both from OntoWiktionary.

example:

ad 1)
-> Misskredit, Blockflöte

<SenseAxis id="GN9_OntoWktDE_16" 
senseOne="GN_Sense_18545" senseTwo="OntoWktDE_sense_53381" senseAxisType="monolingualSenseAlignment"/>

-> Therapieform, Blockhaus

ad 2)

ChM: One crucial problem is that OntoWiktionary != Wiktionary. So far, we have kept the old 2011 dump version of Wiktionary around, mainly because we haven't replaced the original word sense alignment I created in 2011 with a newer one based on DWSA. OntoWiktionary, however, makes use of a 2013 dump and uses a different JWKTL version. Thus, the original sense IDs are NOT compatible. This should explain why all SenseAxis pairs are wrong. (Sorry, I could have raised this earlier, but I thought that, with the new alignment framework, we had created new OntoWiktionary-specific alignments.)

This of course does not explain why in some cases two OntoWiktionary senses are aligned. I cannot say much about that, but perhaps a lexicon check is missing? It is possible that an original sense ID of OntoWiktionary matches an original ID from a different resource. It is therefore crucial to check the lexicon (or rather, the external system identifier). If that's not the issue, then there is of course the chance of a major bug in the software. I did not check any source code before filling in this textarea...
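To make the suspected missing lexicon check concrete, here is a minimal sketch in Python. It is not Uby code: the record layout, the external system names, and all IDs are invented for illustration. It only shows how an index keyed on the original ID alone becomes ambiguous as soon as two resources share an ID, while a key of (external system, original ID) stays unique.

```python
from collections import defaultdict

# Hypothetical external references: (uby_sense_id, external_system, original_id).
# All names and IDs are invented; they are not taken from a real Uby database.
EXTERNAL_REFS = [
    ("GN_Sense_1", "GermaNet_sense", "42"),
    ("OntoWktDE_sense_1", "OntoWiktionary_sense", "42"),
    ("OntoWktDE_sense_2", "OntoWiktionary_sense", "77"),
]


def index_by_id_only(refs):
    """Index on the original ID alone: IDs shared across resources collide."""
    index = defaultdict(list)
    for uby_id, _system, original_id in refs:
        index[original_id].append(uby_id)
    return index


def index_by_system_and_id(refs):
    """Index on (external system, original ID): one sense per key."""
    index = {}
    for uby_id, system, original_id in refs:
        index[(system, original_id)] = uby_id
    return index


loose = index_by_id_only(EXTERNAL_REFS)
strict = index_by_system_and_id(EXTERNAL_REFS)

# Original ID "42" exists in both resources, so the loose lookup is ambiguous.
print(loose["42"])
# The strict lookup names the resource and resolves to exactly one sense.
print(strict[("GermaNet_sense", "42")])
```

Whether the real import resolves IDs this loosely is exactly the open question here; the sketch only shows why the check matters if it does.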

JEK:

"OntoWiktionary, however, makes use of a 2013 dump and uses a different JWKTL version. Thus, the original sense IDs are NOT compatible."
I am aware of that. Yet the alignment of the original sense IDs still appears to be (mostly, I guess) valid iff it is imported via the Uby API, which checks for the original sense ID AND the external system. I have not found a wrong alignment yet when hand-picking arbitrary pairs and looking them up via their MonolingualExternalRefs.

However, the import via temporary tables fails as described. This might indeed be caused by not checking for externalSystem, but the database that I used for looking up original sense IDs contained only GermaNet, WordNet and OntoWiktionary, and no other lexicon.
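As a way to test this hypothesis, here is a small self-contained sketch using Python's sqlite3 module. The schema (`external_ref`, `alignment_tmp`), the column names, and all IDs are assumptions made up for this example and are not the actual Uby tables or the import script's queries. It shows that a temporary-table join on the original ID alone can pair two OntoWiktionary senses if a source ID happens to also exist in OntoWiktionary, even with only a few lexicons in the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and data, invented for illustration only.
CREATE TABLE external_ref (
    sense_id TEXT, external_system TEXT, external_reference TEXT
);
INSERT INTO external_ref VALUES
    ('GN_Sense_1',        'GermaNet',       '100'),
    ('OntoWktDE_sense_1', 'OntoWiktionary', '100'),
    ('OntoWktDE_sense_2', 'OntoWiktionary', '200');

-- Temporary table holding one alignment as (source original ID, target original ID).
CREATE TEMP TABLE alignment_tmp (source_ref TEXT, target_ref TEXT);
INSERT INTO alignment_tmp VALUES ('100', '200');
""")

# Join on the original ID only: every sense carrying that ID matches.
UNCHECKED = """
SELECT s.sense_id, t.sense_id
FROM alignment_tmp a
JOIN external_ref s ON s.external_reference = a.source_ref
JOIN external_ref t ON t.external_reference = a.target_ref
ORDER BY s.sense_id, t.sense_id
"""

# Join on the original ID AND the external system of each side.
CHECKED = """
SELECT s.sense_id, t.sense_id
FROM alignment_tmp a
JOIN external_ref s
  ON s.external_reference = a.source_ref AND s.external_system = ?
JOIN external_ref t
  ON t.external_reference = a.target_ref AND t.external_system = ?
ORDER BY s.sense_id, t.sense_id
"""

unchecked_pairs = conn.execute(UNCHECKED).fetchall()
checked_pairs = conn.execute(CHECKED, ("GermaNet", "OntoWiktionary")).fetchall()

print(unchecked_pairs)  # includes a pair where both senses are from OntoWiktionary
print(checked_pairs)    # only the GermaNet -> OntoWiktionary pair
```

If the real import queries turn out to filter on the external system already, this explanation is ruled out and the cause has to be elsewhere.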
