This HACKING file describes the development environment.
Copyright (C) 2012 Swiss Library for the Blind, Visually Impaired and Print Disabled
Copying and distribution of this file, with or without modification, are permitted in any medium without royalty provided the copyright notice and this notice are preserved.
The prep tool is used as an aid to enrich DTBook XML with markup needed to render Braille. It is implemented as a plugin to the oXygen XML editor.
The current implementation of the plugin does the markup enrichment directly in the text and does not parse the XML. This has some problems in that we have to deal with gruesome regular expressions which try to match across the XML tags (see the whole business with the skipper and the interval tree, trying to hide the XML tags). So on the one hand this is a drawback. On the other hand it keeps us from having to synchronize the document tree with the text, as while we are manipulating the document tree the user might at the same time modify the text.
Deals with initializing the (Swing) Menu bars, tool bar, etc. It also makes sure that some menu items which are implicitly resetting the state of a document are wrapped, so that the prep tool can be notified of these events. These are the “Revert” and the “Save as” menus. Similarly when switching to another prep tool while the current one is still active we need to make sure that the user is warned and the state of the prep tool is reset.
Group a bunch of edits so that they can be undone with one undo
Store info such as current active prep tool, state of each prep tool or current cursor position. This is needed to reflect it in the tool bar but also for example to check if the user changed the cursor position behind the prep tools back.
Not needed in favor of the MenuPlugger
Allows you to do multiple replacements inside a Document (javax.swing.text.Document). This allows you to do a more intelligent modification of the text.
Implements basename
Implements search in the context of skippable regions. Also contains utility methods to handle xml start and end tags.
Simple data structure for region, i.e. start and end points.
Implements regions in a document which are immune to changes. This is done using the functionality of a swing document which implements this.
Manage meta info in the XML. Namely enables to put the information which prep tool has been run for this document into the meta data section of the document. This is not used currently but there is a pending feature request which is implemented using this functionality.
Helper methods for finding unbalanced parens
Utility functions to read version info from a properties file.
- https://github.com/bwagner/interval-tree
- as forked from https://github.com/dyoo/interval-tree
- https://github.com/bwagner/wordhierarchy
The following is a short description of each individual preptool.
Since there is an idea to implement the preptools using XSLT a discussion of feasibility of such a migration is included. The idea is to replace the preptools written in Java by one generic preptool that knows how to treat specially marked-up DTBook files. An external XSLT checks the xml and inserts special markes which contain both alternatives, e.g. for the vform it could be something along the lines of
<select> <proposed><vform>Sie</vform></proposed> <original>Sie</original> </select>
The preptool would simply go through all the select
elements and ask
the user whether to use the proposed change or keep the original.
- XSLT operates on xml instead of raw text
- the process becomes more clunky, as we have a call to the pipeline in the edit->preptool->check-cycle. particularly: Parens-PrepTool
Searches for a collection of vforms and allows user to verify (accept/reject). Offers two different patterns of matching vforms (presumably for old and new spelling) selectable via a check box.
Migration to XSLT should be possible. Challenge: how to handle user’s choice of pattern. The XSLT would have to be parametrized.
Finds balanced parens and wrongly oriented quotation marks. Also handles nested parens. Other than the usual prep tool this one just notifies the user of unbalanced parens and leaves the work up to the user to fix.
Unclear if this makes sense to migrate. While balanced parens could be inplemented the problem with nesting seems not such a great fit for XSLT.
Finds potential ordinal numbers and allows user to verify (accept/reject).
Migration to XSLT should be possible.
Finds potential roman numbers and allows user to verify (accept/reject).
Should be possible.
Finds potential measure numbers and allows user to verify (accept/reject).
Should be possible.
Finds potential abbreviations and allows user to verify (accept/reject).
Might be a bit hairy due to the complicated regexp but should be possible.
Finds potential urls/emails and allows user to verify (accept/reject).
Should be possible.
Finds faulty page breaks, i.e. page breaks that are the only element
between two p
, and allows user to verify (accept/reject) or to add a
class
attribute with value precedingemptyline
. In other words this
prep tool allows for three choices:
- Leave as is
- move inside
p
- move inside
p
and add classprecedingemptyline
Implementation in XSLT should be possible, however, it is not clear at the moment how the choice of three options would be handled.
Finds accents and allows user to verify (accept/reject). The difference to other preptools: once a word has been marked with a certain choice, the same choice is applied globally to all occurrences of the word in the text. This functionality would need to be implemented in the generic preptool.
The point of this is that you only get asked once. For all the other prep tools you are queried for each instance where a potential change is found. However in the case of accents the user wants to change all occurences of a certain word. This could probably be marked up in XSLT but would most likely require a dedicated tool in the editor. Not just the plain yes/no GUI.
There is an effort underway within the DAISY pipeline 2 project to create a set of what they call pre-processing steps which have a similar goal: To enrich XML with additional markup to make it ready for Braille generation. These steps will probably be based on XML technologies, i.e. XSLT and XProc and will also interact with the user, e.g. to confirm an enrichment. How this interaction will be and if it could be integrated in an oXygen editor plugin remains to be seen.