Sensetion Mode
Alexandre Rademaker edited this page Dec 16, 2019 · 3 revisions
This section is only relevant if you want to generate data from the glosstag corpus. You'll need SBCL and Quicklisp installed.
The script at `utils/convert-glosstag.lisp` converts the corpus from the plist format we previously used to JSON.
Example:
CL-USER> (load "utils/convert-glosstag.lisp")
CL-USER> (main (directory "~/glosstag/data/*.plist") "~/sensetion.el/data/mongo/corpus.json")
NIL
You can create a project for the glosstag corpus like so:
(sensetion-make-project
 :name "glosstag"
 :backend (sensetion-make-mongo
           :db "sensetion"
           :synset-collection "synsets"
           :document-collection "glosstag")
 :display-meta-data-fn (lambda (sent _)
                         (let ((metas (sensetion--sent-meta sent)))
                           (format "(%s) %s | "
                                   (map-elt metas 'pos nil #'eq)
                                   (propertize
                                    (string-join
                                     (map-elt metas 'terms nil #'eq)
                                     ", ")
                                    'face 'bold)))))
The dictionary can be generated from `corpus.json`:
cat corpus.json | jq -c '{_id: .meta.keys[0], lexname: .doc_id, pos: .meta.pos, keys: .meta.keys, terms: .meta.terms | map(gsub(" "; "_")), definition: .text}' > dict.json
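To make the field mapping in the jq filter above easier to follow, here is a minimal Python sketch of the same per-document transformation. It assumes each corpus document has the `doc_id`, `text`, and `meta` (`pos`, `keys`, `terms`) fields that the filter reads. The function name `to_dict_entry` and the sample record are hypothetical and only serve as illustration; they are not part of sensetion.

```python
import json


def to_dict_entry(doc):
    """Build one dictionary entry from one corpus document.

    Mirrors the jq filter field by field. Unlike jq, which yields null
    for a missing field, this sketch raises KeyError.
    """
    meta = doc["meta"]
    return {
        "_id": meta["keys"][0],        # first sense key becomes the id
        "lexname": doc["doc_id"],
        "pos": meta["pos"],
        "keys": meta["keys"],
        # like gsub(" "; "_"): replace every space in each term
        "terms": [term.replace(" ", "_") for term in meta["terms"]],
        "definition": doc["text"],
    }


# Hypothetical record, made up for illustration (not real glosstag data).
sample = {
    "doc_id": "noun.example",
    "text": "an illustrative definition",
    "meta": {
        "pos": "n",
        "keys": ["key-1", "key-2"],
        "terms": ["hot dog", "frank"],
    },
}

entry = to_dict_entry(sample)
print(json.dumps(entry))  # one compact JSON object per line, like jq -c
```

The only field that is rewritten rather than copied is `terms`, where spaces inside multi-word terms become underscores.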