LearningFromEuroparl
alexr edited this page Jan 18, 2013
- provided by Els
- We apparently don't need to do sentence alignment ourselves!! Convenient.
- We might want to do word alignment. We're going to have to, at least, figure out which sentences are good training data.
- TODO(alexr)
- We're going to have to do this to get training data at all. What's the best/easiest aligner to use on Europarl?
This would be an interesting argument against treating WSD as a separate task in MT at all; what if we got better results by just calling an MT system on the input text? "Oh no, Joshua does better than your carefully-crafted classifiers!"
There's a lot more text in the full Europarl v7 corpus than what we get in the sentence-aligned intersections...
So maybe we could sentence-align all the available text, train on that, and get the best answers we can; then, if the top answer isn't one of the senses in the intersection used by Els, fall back to the best-scoring sense that is.
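A minimal sketch of that fallback step, assuming the system trained on the larger corpus gives us a score for each candidate translation and that we have the set of senses Els uses for the target word. The function name `best_allowed_sense`, the scores, and the example words are all made up for illustration; nothing here comes from an actual system.

```python
def best_allowed_sense(scored, allowed):
    """Return the highest-scoring candidate that is in `allowed`.

    scored:  dict mapping candidate translation -> score (higher is better)
    allowed: set of senses in the inventory we must answer with
    Returns None if no candidate is in the inventory.
    """
    # Rank all candidates from best to worst score.
    ranked = sorted(scored.items(), key=lambda kv: kv[1], reverse=True)
    # Walk down the ranking until we hit a sense the inventory permits.
    for sense, _score in ranked:
        if sense in allowed:
            return sense
    return None


# Hypothetical scores from a model trained on the full corpus.
scores = {"rive": 0.5, "banque": 0.3, "bord": 0.2}
# Hypothetical sense inventory for this word.
inventory = {"banque", "bord"}

print(best_allowed_sense(scores, inventory))
```

If the top-scoring answer is already in the inventory it is returned unchanged, so the fallback only kicks in when the larger model prefers a sense Els doesn't use.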