when does SUMO match exactly the root? #5

Open
vcvpaiva opened this issue Jun 5, 2017 · 5 comments
Comments

@vcvpaiva
Contributor

vcvpaiva commented Jun 5, 2017

In the example:

A baby is laughing .

1 A a _ DT Definite=Ind 2 det 2:det U
2 baby baby _ NN Number=Sing 4 nsubj 4:nsubj n09827683:0.3974235736313493,n09827519:0.06167156876081363,n09918554:0.02073727521829126,n09828216:0.009570097613152126,n09827363:0.0052531796421179805,n01322221:0.0032179779444162413,n00796767:0.002126327189859401|HumanBaby=
3 is be _ VBZ Person=3|Tense=Pres 4 aux 4:aux v02604760:0.374466717142241,v02616386:0.11868943406506324,v02655135:0.05115870391683428,v02603699:0.026794267881151218,v02749904:0.020302773581458825,v02664769:0.004231483591539528,v02620587:0.011527470971813582,v02445925:0.008782138965198173,v02697725:0.006057304179186997,v02268246:0.003126857039036189,v02614181:0.004562086622892762,v02744820:0.0025535259373098106,v02702508:9.444234663786115E-4|Entity+
4 laughing laugh _ VBG _ 0 root _ v00031820:1|Laughing=
5 . Definite=Ind _ . _ 4 punct 2:det U

The root matches the SUMO concept exactly (Laughing=), which is a good sign that the representation is sound.
In general, a SUMO concept marked with an equals sign is a good sign. How many roots do we have like this?
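
One way to get that count is sketched below. It is only a sketch under assumptions: each token line has ten whitespace-separated columns as in the example above, the dependency relation is in column 8, and the SUMO mapping is the last `|`-separated piece of column 10. The function name `exact_root_matches` is made up for illustration, and the sense list on the "baby" line is shortened.

```python
# Sample rows copied from the example above (sense list for "baby" shortened).
SAMPLE = """\
1 A a _ DT Definite=Ind 2 det 2:det U
2 baby baby _ NN Number=Sing 4 nsubj 4:nsubj n09827683:0.3974235736313493|HumanBaby=
4 laughing laugh _ VBG _ 0 root _ v00031820:1|Laughing=
5 . Definite=Ind _ . _ 4 punct 2:det U
"""


def exact_root_matches(conllu_text):
    """Yield (form, sumo) for root tokens whose SUMO mapping ends in '='.

    Assumes ten whitespace-separated columns per token line; any other
    line (blank lines, comments, raw sentences) is skipped.
    """
    for line in conllu_text.splitlines():
        cols = line.split()
        if len(cols) != 10 or not cols[0].isdigit():
            continue
        form, deprel, misc = cols[1], cols[7], cols[9]
        if deprel != "root":
            continue
        # Assumed layout of the last column: "<senses>|<SUMO mapping>".
        sumo = misc.rsplit("|", 1)[-1]
        if sumo.endswith("="):
            yield form, sumo


if __name__ == "__main__":
    matches = list(exact_root_matches(SAMPLE))
    print(len(matches), matches)
```

Running it over the full file instead of `SAMPLE` would give the number of roots with an exact mapping, provided the file really follows the layout assumed here.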

@vcvpaiva
Contributor Author

vcvpaiva commented Jun 9, 2017

More generally, can we get counts like those in
https://github.com/own-pt/rte-sick/blob/master/expanded/conllu/all.conllu.root.txt?

@kkalouli
Owner

Please find all unique roots in the stats folder.

@vcvpaiva
Contributor Author

@kkalouli I think it would be better to include the part of speech for each root.
That way we can compare the number of copulas with the number of roots that are nouns or adjectives, and see whether anything that is neither is being treated as a root.

In any case, I found these roots, which do not look right to me:
singing 34
front 11
brushing 11
dancing 9
folding 9
rock 8
silent 8
climbing 8
naked 7
drunk 7
diving 7
surround 7
parking 6
grazing 6
empty 6
biking 6
pacing 6
person 6
landing 6
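
A tally of roots together with their part of speech could be produced along these lines. This is a rough sketch, not the script used for the stats folder: it assumes the same ten-column layout as the example in this issue, with the tag in column 5, and the name `root_pos_counts` and the two "singing" rows are hypothetical, written only to have something to count.

```python
from collections import Counter

# First and last rows come from the example in this issue; the two
# "singing" rows are hypothetical and only illustrate a noun-tagged root.
ROWS = """\
4 laughing laugh _ VBG _ 0 root _ v00031820:1|Laughing=
5 singing singing _ NN _ 0 root _ _
6 singing singing _ NN _ 0 root _ _
2 baby baby _ NN Number=Sing 4 nsubj 4:nsubj _
"""


def root_pos_counts(conllu_text):
    """Count (lower-cased form, tag) pairs over all root tokens."""
    counts = Counter()
    for line in conllu_text.splitlines():
        cols = line.split()
        # Keep only ten-column token lines whose relation is "root".
        if len(cols) == 10 and cols[0].isdigit() and cols[7] == "root":
            counts[(cols[1].lower(), cols[4])] += 1
    return counts


if __name__ == "__main__":
    for (form, tag), n in root_pos_counts(ROWS).most_common():
        print(form, tag, n)
```

With the tag next to each form, entries such as a noun-tagged "singing" stand out right away.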

@vcvpaiva
Contributor Author

vcvpaiva commented Jun 12, 2017

For example, for "singing & noun" I found:
There is no clown singing and people are not dancing.
There is no clown singing.
A costumed performer is singing and people are dancing.
In these sentences the dependency analysis is wrong: "singing" should have been tagged as a verb.

@kkalouli
Owner

Please find the new list including the pos tags in the stats folder.
