mxpiotrowski/postagger.el

postagger: Emacs interface to a part-of-speech tagger

postagger provides an interface to a part-of-speech tagger, currently MBT (Memory-based tagger generator and tagger). The primary interface is the function postagger-tag-sentence, which feeds the current sentence to the tagger and attaches the POS tags returned by the tagger as text properties to the word forms of the sentence.
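To check what the tagger attached to a word, one can inspect the text properties at point. The names of the properties postagger uses are not stated here, so the following sketch simply prints every property it finds; the command name my-postagger-show-properties is hypothetical and not part of postagger.

```elisp
;; Hypothetical helper, not part of postagger: show all text
;; properties of the character at point in the echo area.
(defun my-postagger-show-properties ()
  "Display the text properties at point."
  (interactive)
  (message "%S" (text-properties-at (point))))
```

The built-in describe-char command (C-u C-x =) lists the same properties in a help buffer.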

I wrote this code in 2009 for the LingURed project. It is highly experimental and was originally written for XEmacs; the current version is a quick port to GNU Emacs that still requires the obsolete levents package.

Put this in your .emacs to make the functions of this library available:

(require 'postagger)

The settings files for Mbt are language-dependent. Specify the settings file for each language you use in the variable postagger-settings-files. The settings file is selected according to the current language environment, so use the function set-language-environment to set it correctly.
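As a rough illustration, the language environment can be set and checked from your .emacs as follows. The choice of "German" is only an example, and the sketch assumes postagger is already loaded so that postagger-settings-files is known to customize; the format of that variable's value is not shown because it is not documented here.

```elisp
;; Select the language environment (standard Emacs function).
;; Note that this also changes coding-system preferences.
(set-language-environment "German")

;; This standard variable holds the name of the active environment.
(message "Language environment: %s" current-language-environment)

;; Open the customize buffer for the settings files
;; (assumes postagger has been loaded with `require').
(customize-variable 'postagger-settings-files)
```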

postagger provides a customize interface to set all relevant options.

postagger is not really useful by itself; it is intended as infrastructure for linguistically supported editing functions. In such a setup, postagger-tag-sentence would probably be run automatically from a hook.
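One possible way to run the tagger from a hook is sketched below. It is an assumption-laden example, not something postagger ships: the function my-postagger-tag-after-sentence-end is hypothetical, and whether postagger-tag-sentence finds the intended sentence when point is just after the final punctuation has not been verified.

```elisp
;; Hypothetical helper, not part of postagger: tag the sentence
;; whenever a sentence-ending character has just been typed.
(defun my-postagger-tag-after-sentence-end ()
  "Call `postagger-tag-sentence' after typing `.', `!' or `?'."
  (when (and (fboundp 'postagger-tag-sentence)
             (memq last-command-event '(?. ?! ??)))
    (save-excursion
      (postagger-tag-sentence))))

;; Install the helper buffer-locally in text-mode buffers.
(add-hook 'text-mode-hook
          (lambda ()
            (add-hook 'post-self-insert-hook
                      #'my-postagger-tag-after-sentence-end nil t)))
```

post-self-insert-hook is available in GNU Emacs 24 and later; an idle timer would be an alternative if tagging on every sentence end turns out to be too slow.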

However, postagger-tag-sentence can be called interactively. For testing you may want to bind postagger-tag-sentence to a key combination, e.g., C-c p:

(define-key text-mode-map [(control c) p] 'postagger-tag-sentence)

If you’re using AUCTeX, you may want to add:

(add-hook 'LaTeX-mode-hook
          (lambda ()
            ;; Bind C-c p in LaTeX buffers once LaTeX-mode-map exists.
            (define-key LaTeX-mode-map [(control c) p]
              'postagger-tag-sentence)))

If you’re using Gnus, you may also want to add:

(with-eval-after-load 'message  ; message-mode-map is defined only after message.el is loaded
  (define-key message-mode-map [(control c) p] 'postagger-tag-sentence))

Todo

  • Should we use a buffer instead of the global postagger-output variable?
  • Should we specify a sentinel for the process?
  • One could also use a transaction queue for communicating with the process. Would this be better?
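To make the first two questions concrete, here is a minimal sketch of a subprocess that collects its output in a process buffer and reports exits through a sentinel. It does not reflect how postagger currently works: all my-tagger names are hypothetical, and cat stands in for the tagger because the real Mbt invocation is not shown in this document. make-process requires GNU Emacs 25 or later.

```elisp
;; Hypothetical sentinel: report when the process is no longer running.
(defun my-tagger-sentinel (process event)
  "Report that PROCESS has stopped; EVENT is the status string."
  (unless (process-live-p process)
    (message "Process %s: %s" (process-name process) event)))

;; Start a stand-in process.  With no filter given, Emacs inserts
;; the process output into the process buffer.
(defvar my-tagger-process
  (make-process :name "my-tagger"
                :buffer (generate-new-buffer " *my-tagger-output*")
                :command '("cat")   ; stand-in, NOT the real Mbt command
                :connection-type 'pipe
                :sentinel #'my-tagger-sentinel))

;; Send one line, wait up to a second for output, then read the buffer.
(process-send-string my-tagger-process "This is a sentence .\n")
(accept-process-output my-tagger-process 1)
(with-current-buffer (process-buffer my-tagger-process)
  (buffer-string))
```

A buffer keeps partial output together across several filter calls, which a single global string variable has to handle by hand. A transaction queue (tq-create and tq-enqueue from the built-in tq library) would additionally pair each question with its answer.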
