alexr edited this page Mar 6, 2013 · 1 revision

Where do we get the edge penalties?

Possibilities:

  • They are joint probabilities: for example, the probability of the pair (es1, de1) out of all the label pairs.
  • Alternatively, they are conditional probabilities: P(de|es) for the message from the German node to the Spanish one. "Here's how bad it would be, for me, if you picked Spanish word es1".
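
To make the two options above concrete, here is a minimal sketch of how each could be estimated from co-occurrence counts. The counts, the function names, and the choice to turn a probability into a penalty with a negative log are all assumptions made for illustration; nothing in these notes fixes them.

```python
import math
from collections import Counter

# Hypothetical counts of (Spanish label, German label) pairs seen together.
pair_counts = Counter({("es1", "de1"): 3, ("es1", "de2"): 1, ("es2", "de2"): 4})


def joint_prob(es, de, counts):
    """P(es, de): this pair's share of all observed label pairs."""
    total = sum(counts.values())
    return counts[(es, de)] / total


def conditional_prob(de, es, counts):
    """P(de | es): this pair's share of the pairs containing `es`."""
    es_total = sum(c for (e, _), c in counts.items() if e == es)
    return counts[(es, de)] / es_total


def penalty(p):
    """Assumed mapping from probability to penalty: negative log."""
    return -math.log(p)


print(joint_prob("es1", "de1", pair_counts))
print(conditional_prob("de1", "es1", pair_counts))
```

Note that `conditional_prob` divides by zero when `es` never occurs in the counts, which is exactly the unseen-word problem.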

Question: what do you do for an edge penalty if one of the words is unseen? (This only comes up if we're using conditional probabilities.)

  • What if the Spanish word is unseen, in that de->es message example?
    • Could we back off to P(de)? Given no other knowledge, just use what we think is the probability of that German word without the Spanish context.
    • Alternatively, should we give an unseen Spanish word a large penalty? "I will not be happy if you pick that word I've never seen before".
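
The two unseen-word options above could be sketched as follows. This is only an illustration: the counts, the `UNSEEN_PENALTY` value, the `strategy` parameter, and the negative-log penalty are hypothetical choices, and the handling of a seen Spanish word paired with a German word it never co-occurred with is an extra assumption the notes do not address.

```python
import math
from collections import Counter

# Hypothetical counts of (Spanish label, German label) pairs seen together.
pair_counts = Counter({("es1", "de1"): 3, ("es1", "de2"): 1, ("es2", "de2"): 4})

UNSEEN_PENALTY = 20.0  # assumed "large" penalty; would need tuning


def edge_penalty(de, es, counts, strategy="backoff"):
    """Penalty for the de->es message, with two unseen-word strategies."""
    es_total = sum(c for (e, _), c in counts.items() if e == es)
    if es_total == 0:
        # The Spanish word was never seen, so P(de | es) is undefined.
        if strategy == "backoff":
            # Option 1: back off to the context-free P(de).
            de_total = sum(c for (_, d), c in counts.items() if d == de)
            if de_total == 0:
                return UNSEEN_PENALTY  # German word is unseen too
            return -math.log(de_total / sum(counts.values()))
        # Option 2: flat large penalty for a word never seen before.
        return UNSEEN_PENALTY
    pair = counts[(es, de)]
    if pair == 0:
        # Assumption: a never-observed pair also gets the large penalty.
        return UNSEEN_PENALTY
    return -math.log(pair / es_total)


print(edge_penalty("de2", "es9", pair_counts, strategy="backoff"))
print(edge_penalty("de2", "es9", pair_counts, strategy="large"))
```

One difference between the two: with backoff, an unseen Spanish word can end up cheaper than a seen one whenever the German word is common overall, whereas the flat penalty always makes it expensive.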