
Sensetion Mode

Alexandre Rademaker edited this page Dec 16, 2019 · 3 revisions

Glosstag Corpus

This section is only relevant if you want to generate data from the glosstag corpus. You’ll need SBCL and Quicklisp installed.

The script at utils/convert-glosstag.lisp converts the plist format we previously used into JSON.

Example:

CL-USER> (load "utils/convert-glosstag.lisp")

CL-USER> (main (directory "~/glosstag/data/*.plist") "~/sensetion.el/data/mongo/corpus.json")
NIL

You can create a project for the glosstag corpus like so:

(sensetion-make-project
 :name "glosstag"
 :backend (sensetion-make-mongo
	     :db "sensetion"
	     :synset-collection "synsets"
	     :document-collection "glosstag")
 :display-meta-data-fn (lambda (sent _)
			   (let ((metas (sensetion--sent-meta sent)))
			     (format "(%s) %s | "
				     (map-elt metas 'pos nil #'eq)
				     (propertize
				      (string-join
				       (map-elt metas 'terms nil #'eq)
				       ", ")
				      'face 'bold)))))

Dictionary

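The dictionary can be generated from the corpus JSON using jq. The command below reads glosstag.json; if your converted corpus has a different name (`corpus.json` in the example above), substitute that path: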

jq -c '{_id: .meta.keys[0], lexname: .doc_id, pos: .meta.pos, keys: .meta.keys, terms: (.meta.terms | map(gsub(" "; "_"))), definition: .text}' glosstag.json > dict.json
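For readers without jq, the same field mapping can be sketched in Python. This is only an illustration of what the filter above does, not part of sensetion itself: the function names `to_dict_entry` and `convert` are hypothetical, and the sketch assumes the corpus file holds one JSON document per line. Unlike jq, it raises `KeyError` on a missing field instead of emitting null.

```python
import json


def to_dict_entry(doc):
    """Map one corpus document to a dictionary entry (mirrors the jq filter)."""
    meta = doc["meta"]
    return {
        "_id": meta["keys"][0],  # the first key serves as the entry id
        "lexname": doc["doc_id"],
        "pos": meta["pos"],
        "keys": meta["keys"],
        # same effect as gsub(" "; "_"): every space becomes an underscore
        "terms": [term.replace(" ", "_") for term in meta["terms"]],
        "definition": doc["text"],
    }


def convert(lines):
    """Yield one compact JSON string per non-empty input line (like jq -c)."""
    for line in lines:
        line = line.strip()
        if line:
            entry = to_dict_entry(json.loads(line))
            yield json.dumps(entry, ensure_ascii=False, separators=(",", ":"))
```

To use it, iterate over `convert(f)` for an open corpus file `f` and write each yielded string, followed by a newline, to dict.json.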