Skip to content

Commit

Permalink
added support for wiki synonyms
Browse files Browse the repository at this point in the history
  • Loading branch information
hubisan committed Apr 28, 2024
1 parent e5a88fd commit a9042e2
Show file tree
Hide file tree
Showing 2 changed files with 98 additions and 56 deletions.
35 changes: 32 additions & 3 deletions README.org
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,9 @@ Retrieve definitions (meanings) and synonyms for German words with Emacs.
** TODOS

*** NEXT Add synonyms from Wiktionary

There is already some code in the file that sort of works.
Also make this work if no synonyms where found on Openthesaurus.

*** TODO Add tests to cover more

Expand Down Expand Up @@ -146,10 +148,37 @@ In any case please check out the [[./CONTRIBUTING.org::*Contributing][contributi

**** Openthesaurus

The text returned can contains additional information in parentheses. Examples:
The text returned can contains additional information in parentheses.

Examples:

#+BEGIN_EXAMPLE
- aufsetzen (Schreiben, Kaufvertrag, ...)
- errichten (Testament, Patientenverfügung, ...)
- (die) Probe aufs Exempel

This information is removed, when reading from the minibuffer. Else it is not removed and inserted into the buffer
#+END_EXAMPLE

This information is removed, when reading from the minibuffer. Else it is not removed and inserted into the buffer.

**** Wiktionary

It appears that when composing synonyms on Wiktionary, users have the a lot of freedom to formulate the text. Therefore, I avoid parsing the synonyms into a list which is used when reading from the minibuffer. Similar to OpenThesaurus, the synonyms are inserted into an Org buffer exactly as they are formulated.

Examples of texts used (word 'geben'):

#+BEGIN_EXAMPLE
- abtreten, reichen, übertragen, vermachen
- aushändigen, hinreichen, in die Hand drücken, übergeben, überlassen, überreichen
gehoben: darbieten, darreichen, zukommen/zuteilwerden lassen
oft gehoben: reichen
bildungssprachlich: präsentieren
umgangssprachlich: langen, rüberwachsen lassen
Papierdeutsch: verabreichen; Papierdeutsch veraltend: verabfolgen
- schenken, gewähren, zum Geschenk machen, zustecken
schweizerisch: vergaben
gehoben: bedenken, beglücken, stiften, zukommen/zuteilwerden lassen
umgangssprachlich: spendieren
gehoben oder ironisch angedeihen lassen
leicht scherzhaft: verehren
veraltet: zueignen
#+END_EXAMPLE
119 changes: 66 additions & 53 deletions woerterbuch.el
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ You most likely only want to change this, if you want to change the number of
newlines.
Format is called with three parameters:
- Stars to start a heading
- Text of the heading
- Text of the heading (see also other customization variables)
- The content"
:type 'string)

Expand Down Expand Up @@ -116,26 +116,31 @@ Format is called with one parameters:
:type 'string)

(defcustom woerterbuch-synonyms-add-synonyms-from-wiktionary nil
"If non-nil synoyms taken from Wiktionary are added (if in org buffer).
"If non-nil synoyms taken from Wiktionary are added.
This is only the case when using a function that displays the synonyms in an
org-buffer. If reading from the minibuffer the synonyms are not added.
The synonyms are added below those of Openthesaurus. The synonyms are not added
if reading from minibuffer."
:type 'string)

(defcustom woerterbuch-process-timeout 5
"Number of seconds to wait for the process to return output."
:type 'integer)

(defcustom woerterbuch-process-python-programm "python3"
"The name of the command to start the python process"
:type 'integer)
(defcustom woerterbuch-quit-window-key-binding "C-c C-k"
"Key binding to use for `quit-window' in the woerterbuch buffer.
If set to nil no key binding is set."
:type 'string)

;;; Major-Mode & Key Bindings

(defvar woerterbuch-mode-map
(let ((map (make-sparse-keymap)))
;; Not needed and makes it possible to not require `org-mode'.
;; (set-keymap-parent map org-mode-map)
(when woerterbuch-quit-window-key-binding
(define-key map (kbd woerterbuch-quit-window-key-binding) 'quit-window))
map))

(define-derived-mode woerterbuch-mode org-mode "Woerterbuch"
"Major mode for displaying woerterbuch buffer.")

(define-key woerterbuch-mode-map (kbd "C-c C-k") 'quit-window)

;;; Global Variables

(defconst woerterbuch--package-directory
Expand Down Expand Up @@ -465,8 +470,7 @@ If TO-KILL-RING is non-nil it is added to the kill ring instead."
(let* ((url (format woerterbuch--synonyms-openthesaurus-api-url
(url-hexify-string (string-trim word))))
(buffer (url-retrieve-synchronously url t)))
(if (not buffer)
(error "Could not retrieve synonyms")
(when buffer
(with-current-buffer buffer
(goto-char (point-min))
(re-search-forward "^$")
Expand Down Expand Up @@ -519,15 +523,17 @@ word is returned as it can differntiate from the WORD used as
parameter when a baseform is used to retrieve the synonyms.
Returns nil if no synonyms are retrieved."
(let* ((raw-synonyms (woerterbuch--synonyms-retrieve-raw word))
(baseform (woerterbuch--synonyms-baseform raw-synonyms)))
(baseform (when raw-synonyms
(woerterbuch--synonyms-baseform raw-synonyms))))
;; If a baseform was found use that to retrieve the synonyms.
(when baseform
(setq raw-synonyms (woerterbuch--synonyms-retrieve-raw baseform)))
(let* ((synonyms (woerterbuch--synonyms-to-list raw-synonyms))
(synonyms (if clean
(woerterbuch--synonyms-clean-text synonyms)
synonyms)))
(cons (or baseform word) synonyms))))
(when raw-synonyms
(let* ((synonyms (woerterbuch--synonyms-to-list raw-synonyms))
(synonyms (if clean
(woerterbuch--synonyms-clean-text synonyms)
synonyms)))
(cons (or baseform word) synonyms)))))

(defun woerterbuch--synonyms-to-string (synonyms)
"Convert the list of SYNONYMS to a string.
Expand Down Expand Up @@ -559,7 +565,7 @@ synonyms."
(text (if synonyms
(format "%s\n" (woerterbuch--synonyms-to-string synonyms))
(format woerterbuch-synonyms-no-matches-text-format
word-used)))
(or word-used word))))
;; Add a heading if needed.
(synonyms-string
(if with-heading
Expand All @@ -582,7 +588,7 @@ Returns nil if no synonym was selected."
(completing-read "Select synonym: " synonyms-sorted nil t))
(user-error "No synonyms found for %s" word)))

(defun woerterbuch--synonyms-wiktionary-in-org-mode-syntax (word)
(defun woerterbuch--synonyms-wiktionary-retrieve-as-string (word)
"Get a list of synonyms from wiktionary for WORD in org-mode syntax.
The WORD needs to be in baseform."
(let* ((url (format woerterbuch--synonyms-wiktionary-url
Expand All @@ -594,37 +600,40 @@ The WORD needs to be in baseform."
(progn
(set-buffer-multibyte t)
(goto-char (point-min))
(when-let* ((start
(1+ (re-search-forward
"\\(>Synonyme:</p>\\|>Sinnverwandte Wörter:</p>\\)")))
(end (search-forward "</dl>"))
(dom (libxml-parse-html-region start end))
(text (dom-texts dom))
;; Change the leading [1] to - for org-mode.
(text-cleaned
(replace-regexp-in-string "\\[[^]]+]" "-" text))
;; Replace spaces with one space.
(text-cleaned
(replace-regexp-in-string " +" " " text-cleaned))
;; Remove space before punctuation.
(text-cleaned
(replace-regexp-in-string "\\( \\)[,:;.]" "" text-cleaned
nil nil 1))
;; Remove space at end of line.
(text-cleaned
(replace-regexp-in-string " $" "" text-cleaned))
;; Remove remarks with Siehe auch
(text-cleaned
(replace-regexp-in-string
"\\(; siehe auch:.*;\\|; siehe auch:.*$\\)" ""
text-cleaned))
;; Second line and following have a space at the beginning.
(text-cleaned (replace-regexp-in-string "^ -" "-"
text-cleaned))
;; Add spaces at the beginning if not starting with -.
(text-cleaned (replace-regexp-in-string "^[^-]" " "
text-cleaned)))
text-cleaned))
(when-let*
((found (re-search-forward
"\\(>Synonyme:</p>\\|>Sinnverwandte Wörter:</p>\\)"
nil t))
(start (1+ found))
(end (search-forward "</dl>"))
(dom (libxml-parse-html-region start end))
(text (dom-texts dom))
;; Change the leading [1] to - for org-mode.
(text-cleaned
(replace-regexp-in-string "\\[[^]]+]" "-" text))
;; Replace spaces with one space.
(text-cleaned
(replace-regexp-in-string " +" " " text-cleaned))
;; Remove space before punctuation.
(text-cleaned
(replace-regexp-in-string "\\( \\)[,:;.]" "" text-cleaned
nil nil 1))
;; Remove space at end of line.
(text-cleaned
(replace-regexp-in-string " $" "" text-cleaned))
;; Remove remarks with Siehe auch
(text-cleaned
(replace-regexp-in-string
"\\(; siehe auch:.*;\\|; siehe auch:.*$\\)" ""
text-cleaned))
;; Second line and following have a space at the beginning.
(text-cleaned (replace-regexp-in-string "^ -" "-"
text-cleaned))
;; Add spaces at the beginning if not starting with -.
(text-cleaned (replace-regexp-in-string "^[^-]" " "
text-cleaned)))
(format "\n[[%1$s][Wiktionary:]]\n\n%3$s"
url word text-cleaned)))
(kill-buffer buffer))))))

;;;###autoload
Expand Down Expand Up @@ -666,8 +675,12 @@ openthesaurus."
(interactive "sWort: \nP")
(let* ((word-and-synonyms (woerterbuch--synonyms-retrieve-as-string
word with-heading))
(word-used (car-safe word-and-synonyms))
(synonyms (cdr-safe word-and-synonyms)))
;; TODO If Wiktonary should be added add this to the string.
(when woerterbuch-synonyms-add-synonyms-from-wiktionary
(when-let* ((wiki-synonyms
(woerterbuch--synonyms-wiktionary-retrieve-as-string word-used)))
(setq synonyms (concat synonyms wiki-synonyms))))
(save-excursion
(woerterbuch--org-insert synonyms with-heading))))

Expand Down

0 comments on commit a9042e2

Please sign in to comment.