alexr edited this page Mar 6, 2013 · 5 revisions

word features

  • the word form itself (this is more important than you think)
  • the tag of the word
  • the lemma of the word (how could this help?)

The intuition here is that "Bank" is different from "banks", which is different from "bank": capitalization and number carry information that the lemma alone throws away.
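A minimal sketch of the word features listed above, assuming each token already comes with a form, a tag, and a lemma from some upstream tagger/lemmatizer. The function name `word_features` and the `name=value` feature strings are made up for illustration, not something the project necessarily uses.

```python
def word_features(form, tag, lemma):
    """Binary features describing the target word itself.

    Hypothetical helper: the feature naming scheme is an assumption.
    """
    return {
        "form=" + form: 1,    # surface form, case preserved
        "tag=" + tag: 1,      # tag of the word
        "lemma=" + lemma: 1,  # lemma, shared across inflections
    }


# "Bank", "banks" and "bank" share a lemma feature but not a form feature.
for form, tag in [("Bank", "NNP"), ("banks", "NNS"), ("bank", "NN")]:
    print(sorted(word_features(form, tag, "bank")))
```

Keeping the form and the lemma side by side lets the classifier back off to the lemma when a particular form is rare, while still seeing the distinction when it matters.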

context features

  • bag-of-words features? (skipping these for the English case for the moment; Els uses bag of words for the )

  • within a window of size 3 on either side...

    • word form
    • word lemma
    • POS tag
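The window features above could be extracted roughly like this. It is only a sketch: it assumes a sentence is a list of `(form, lemma, tag)` tuples, and the name `context_features` and the position-stamped feature strings are illustrative choices rather than the project's actual code.

```python
def context_features(tokens, index, window=3):
    """Form, lemma and POS tag for each token within `window` of `index`.

    `tokens` is assumed to be a list of (form, lemma, tag) tuples.
    """
    feats = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue  # the target word gets its own features
        pos = index + offset
        if pos < 0 or pos >= len(tokens):
            continue  # window runs off the edge of the sentence
        form, lemma, tag = tokens[pos]
        feats["form[%+d]=%s" % (offset, form)] = 1
        feats["lemma[%+d]=%s" % (offset, lemma)] = 1
        feats["tag[%+d]=%s" % (offset, tag)] = 1
    return feats


sentence = [("I", "I", "PRP"), ("went", "go", "VBD"), ("to", "to", "TO"),
            ("the", "the", "DT"), ("bank", "bank", "NN")]
for name in sorted(context_features(sentence, 4)):
    print(name)
```

Stamping each feature with its offset keeps "the word before" distinct from "the word after"; dropping the offset from the feature name would turn this into a bag of words over the window instead.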

If we had more time, we'd also use a chunker/parser. Maybe see if that helps next time.

Consider the problem: do we want to add the words together with their tags as features? (We're doing that.) Can saw that including just the tags as separate features actually hurt performance on a WSD system last semester.
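One way to make that choice easy to test is to put the bare-tag features behind a switch. The sketch below assumes tokens are `(form, tag)` pairs; `tagged_context_features` and its `include_bare_tags` flag are hypothetical names introduced only for this example.

```python
def tagged_context_features(tokens, index, window=3, include_bare_tags=False):
    """Joint word/tag features for the window, optionally plus tags alone.

    `tokens` is assumed to be a list of (form, tag) pairs.
    """
    feats = {}
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    for pos in range(lo, hi):
        if pos == index:
            continue  # skip the target word itself
        form, tag = tokens[pos]
        offset = pos - index
        # word and tag fused into a single feature
        feats["form/tag[%+d]=%s/%s" % (offset, form, tag)] = 1
        if include_bare_tags:
            # the tag on its own, as a separate feature
            feats["tag[%+d]=%s" % (offset, tag)] = 1
    return feats


sentence = [("the", "DT"), ("river", "NN"), ("bank", "NN"), ("flooded", "VBD")]
print(sorted(tagged_context_features(sentence, 2)))
print(sorted(tagged_context_features(sentence, 2, include_bare_tags=True)))
```

Training once with the flag off and once with it on, on the same data, would show whether the separate tag features help or hurt for this task.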
