
Dependency parsing: is there some way to discourage multiple nsubj dependents? #1340

Open
nschneid opened this issue Jan 30, 2024 · 8 comments

@nschneid

A very weird English tree produced by Stanza 1.6.0 in the demo:

My cousin my extremely rude colleague admired last year chewed the chicken enthusiastically.

[image: dependency parse from the demo]

In UD, no word is allowed to have multiple (plain) nsubj dependents. But "admired" has two.

Is there a recommended alternate training or decoding method in Stanza that could avoid this sort of problem?

@AngledLuffa
Collaborator

AngledLuffa commented Jan 30, 2024 via email


stale bot commented Jan 21, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 21, 2025

stale bot commented Jan 31, 2025

This issue has been automatically closed due to inactivity.

@stale stale bot closed this as completed Jan 31, 2025
@AngledLuffa AngledLuffa reopened this Feb 22, 2025
@stale stale bot removed the stale label Feb 22, 2025
@AngledLuffa
Collaborator

stalebot notwithstanding, i find myself wondering if this is actually possible

so here's a brief summary of what the dependency parser does to find a graph: first it calculates the weights of all possible edges, then it finds the "minimum spanning arborescence" for that graph using the Chu-Liu/Edmonds algorithm.
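As an illustration of that objective (not of Chu-Liu/Edmonds itself, which handles cycle contraction efficiently), here's a toy sketch that brute-forces the highest-scoring arborescence. The score matrix and token names are invented for illustration; biaffine parsers typically maximize edge scores, which is equivalent to minimizing negated costs:

```python
from itertools import product

def _reaches_root(parent, d):
    """Follow head pointers from d; True iff we hit ROOT (0) without a cycle."""
    seen = set()
    while d != 0:
        if d in seen:
            return False
        seen.add(d)
        d = parent[d]
    return True

def best_tree(scores):
    """Exhaustively find the highest-scoring dependency tree.

    scores[h][d] is the score of an edge from head h to dependent d;
    index 0 is the artificial ROOT.  Chu-Liu/Edmonds finds the same
    optimum in polynomial time; this brute force is toy-sized only.
    """
    n = len(scores)  # number of tokens including ROOT
    best, best_heads = float("-inf"), None
    # every non-root token picks one head
    for heads in product(range(n), repeat=n - 1):
        parent = {d: h for d, h in zip(range(1, n), heads)}
        # reject assignments that are not trees rooted at 0
        if not all(_reaches_root(parent, d) for d in parent):
            continue
        total = sum(scores[h][d] for d, h in parent.items())
        if total > best:
            best, best_heads = total, parent
    return best_heads, best

# hypothetical scores for a 3-token sentence plus ROOT
scores = [
    [0, 1, 2, 9],   # ROOT -> token 1, 2, 3
    [0, 0, 1, 1],
    [0, 8, 0, 1],
    [0, 5, 7, 0],
]
heads, total = best_tree(scores)
print(heads, total)   # maps each dependent to its chosen head
```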

we could theoretically special-case nsubj: if the parser produces two (or more) nsubj dependents for a particular node, it systematically replaces the cost of each of those edges except one with infinity, reruns the algorithm for each choice, and whichever rerun scores best becomes the chosen graph.

such constraints would not really generalize to every possible validation error, but having two nsubj is clearly a problem.

reasonable? not worth the effort? any other suggestions on how to approach this, given the brief summary of our dependency parser at the top?
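A minimal sketch of that rerun strategy, using a brute-force stand-in for the real decoder; the score matrix and the `labels` table marking which candidate edges would be labelled nsubj are invented (the real parser scores labels separately):

```python
import copy
from itertools import product

NEG_INF = float("-inf")

def _reaches_root(parent, d):
    seen = set()
    while d != 0:
        if d in seen:
            return False
        seen.add(d)
        d = parent[d]
    return True

def decode(scores):
    """Stand-in for the real Chu-Liu/Edmonds decode: brute-force best tree."""
    n = len(scores)
    best, best_parent = NEG_INF, None
    for heads in product(range(n), repeat=n - 1):
        parent = {d: h for d, h in zip(range(1, n), heads)}
        if not all(_reaches_root(parent, d) for d in parent):
            continue
        total = sum(scores[h][d] for d, h in parent.items())
        if total > best:
            best, best_parent = total, parent
    return best_parent, best

def fix_double_nsubj(scores, labels):
    """If the unconstrained tree gives some head two or more nsubj
    dependents, re-decode once per offending edge with the other nsubj
    edges of that head banned, and keep the best-scoring rerun."""
    parent, _ = decode(scores)
    nsubj_deps = {}
    for d, h in parent.items():
        if labels.get((h, d)) == "nsubj":
            nsubj_deps.setdefault(h, []).append(d)
    offenders = {h: ds for h, ds in nsubj_deps.items() if len(ds) > 1}
    if not offenders:
        return parent
    h, deps = next(iter(offenders.items()))   # one offending head; loop if several
    best, best_parent = NEG_INF, parent
    for keep in deps:                          # keep one nsubj edge, ban the rest
        banned = copy.deepcopy(scores)
        for d in deps:
            if d != keep:
                banned[h][d] = NEG_INF
        cand, score = decode(banned)
        if cand is not None and score > best:
            best, best_parent = score, cand
    return best_parent

# toy example: head 2 initially takes both 1 and 3 as nsubj
scores = [
    [0, 1, 10, 1],
    [0, 0, 0, 4],
    [0, 9, 0, 8],
    [0, 3, 0, 0],
]
labels = {(2, 1): "nsubj", (2, 3): "nsubj"}
fixed = fix_double_nsubj(scores, labels)
print(fixed)   # token 3 is re-attached elsewhere
```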

@nschneid
Author

One way to do this would be to use a decoding algorithm that allows higher-order constraints (that take into account pairs of edges), e.g. via an ILP as in TurboParser.

we could theoretically special case nsubj so that: if the parser produces two (or more) nsubj for a particular node, it systematically replaces each cost except one with infinity, then reruns the algorithm. whichever wins the new search is the chosen graph.

I don't think it is theoretically optimal/sufficient to assign, one at a time, infinite cost to just the nsubj edges predicted in the original parse. Removing one of those edges could force a restructuring of the tree in the next best parse, and then there could be multiple nsubj edges for a different predicate. But this strategy may be a good enough band-aid in practice.
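For comparison, a decoder with the higher-order constraint built in would enforce "at most one nsubj dependent per head" inside the search itself, rather than by rerunning after the fact. A toy exhaustive version under invented scores and edge labels (an ILP decoder would optimize this same constrained objective efficiently):

```python
from itertools import product

def _reaches_root(parent, d):
    seen = set()
    while d != 0:
        if d in seen:
            return False
        seen.add(d)
        d = parent[d]
    return True

def best_constrained_tree(scores, labels):
    """Exhaustive decode enforcing the UD constraint directly:
    no head may take more than one nsubj dependent."""
    n = len(scores)
    best, best_parent = float("-inf"), None
    for heads in product(range(n), repeat=n - 1):
        parent = {d: h for d, h in zip(range(1, n), heads)}
        if not all(_reaches_root(parent, d) for d in parent):
            continue
        # count nsubj edges per head; skip trees that violate the cap
        counts = {}
        for d, h in parent.items():
            if labels.get((h, d)) == "nsubj":
                counts[h] = counts.get(h, 0) + 1
        if any(c > 1 for c in counts.values()):
            continue
        total = sum(scores[h][d] for d, h in parent.items())
        if total > best:
            best, best_parent = total, parent
    return best_parent, best

# hypothetical scores where the unconstrained optimum has two nsubj on head 2
scores = [
    [0, 1, 10, 1],
    [0, 0, 0, 4],
    [0, 9, 0, 8],
    [0, 3, 0, 0],
]
labels = {(2, 1): "nsubj", (2, 3): "nsubj"}
tree, total = best_constrained_tree(scores, labels)
print(tree, total)
```

Because the constraint is checked during the search, the result is guaranteed optimal among valid trees, which the one-edge-at-a-time rerun cannot promise.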

@AngledLuffa
Collaborator

via an ILP as in TurboParser.

Sorry, is that integer linear programming? I'm not at all familiar with TurboParser.

Removing one of those edges could force a restructuring of the tree in the next best parse, and then there could be multiple nsubj edges for a different predicate.

Agreed. It could also be the case that two nodes in the initial parse each have two nsubj; in that case we'd have to go through both nodes anyway. Still, it might adequately handle the most common error case.

@AngledLuffa
Collaborator

actually it's even worse than what i suggested above: by the time of tree decoding, the model has already thrown away information on 2nd-best arcs. there also isn't anything that lets the arcs interact, in the sense that choosing nsubj for one arc makes something else more likely to be obj, for example

if you have a suggestion for a replacement algorithm (one that isn't hugely slower than our current one), i'm happy to look into it. i can also ask around in the theory department next week. i haven't found anything obvious by googling, and this particular problem never came up in my algorithms classes...

@nschneid
Author

nschneid commented Feb 22, 2025

I would not look for the 2nd-best edge - I would parse the entire sentence again, blocking one edge at a time.

The theory people will probably tell you to use an integer linear programming parser instead. https://www.cs.cmu.edu/~nasmith/LSP/ explains different structured prediction algorithms in detail.
