This HACKING file describes the development environment.

Copying and distribution of this file, with or without modification, are permitted in any medium without royalty provided the copyright notice and this notice are preserved.

Overview

The prep tool is used as an aid to enrich DTBook XML with markup needed to render Braille. It is implemented as a plugin to the oXygen XML editor.

Why edit the text as opposed to the document tree

The current implementation of the plugin does the markup enrichment directly in the text and does not parse the XML. This has some problems in that we have to deal with gruesome regular expressions which try to match across the XML tags (see the whole business with the skipper and the interval tree, trying to hide the XML tags). So on the one hand this is a drawback. On the other hand it keeps us from having to synchronize the document tree with the text, as while we are manipulating the document tree the user might at the same time modify the text.

Class Overview

PrepToolsPluginExtension.java

Deals with initializing the (Swing) Menu bars, tool bar, etc. It also makes sure that some menu items which are implicitly resetting the state of a document are wrapped, so that the prep tool can be notified of these events. These are the “Revert” and the “Save as” menus. Similarly when switching to another prep tool while the current one is still active we need to make sure that the user is warned and the state of the prep tool is reset.

OxygenEditGrouper.java

Group a bunch of edits so that they can be undone with one undo

DocumentMetaInfo.java

Store info such as current active prep tool, state of each prep tool or current cursor position. This is needed to reflect it in the tool bar but also for example to check if the user changed the cursor position behind the prep tools back.

PrepToolsValidatorPluginExtension.java

Not needed in favor of the MenuPlugger

AbstractPrepToolAction.java

PrepTool.java

PrepToolLoader.java

TrafficLight.java

VFormActionHelper.java

DocumentUtils.java

Allows you to do multiple replacements inside a Document (javax.swing.text.Document). This allows you to do a more intelligent modification of the text.

FileUtils.java

Implements basename

MarkupUtil.java

Implements search in the context of skippable regions. Also contains utility methods to handle xml start and end tags.

Match.java

Simple data structure for region, i.e. start and end points.

PositionMatch.java

Implements regions in a document which are immune to changes. This is done using the functionality of a swing document which implements this.

MetaUtils.java

Manage meta info in the XML. Namely enables to put the information which prep tool has been run for this document into the meta data section of the document. This is not used currently but there is a pending feature request which is implemented using this functionality.

ParensUtil.java

Helper methods for finding unbalanced parens

PropsUtils.java

Utility functions to read version info from a properties file.

RegionSkipper.java

TextUtils.java

VFormUtil.java

External Dependencies

https://github.com/bwagner/interval-tree
- as forked from https://github.com/dyoo/interval-tree
https://github.com/bwagner/wordhierarchy

Individual PrepTools

The following is a short description of each individual preptool.

XSLT Migration

Since there is an idea to implement the preptools using XSLT a discussion of feasibility of such a migration is included. The idea is to replace the preptools written in Java by one generic preptool that knows how to treat specially marked-up DTBook files. An external XSLT checks the xml and inserts special markes which contain both alternatives, e.g. for the vform it could be something along the lines of

<select>
  <proposed><vform>Sie</vform></proposed>
  <original>Sie</original>
</select>

The preptool would simply go through all the select elements and ask the user whether to use the proposed change or keep the original.

Pros:

XSLT operates on xml instead of raw text

Cons:

the process becomes more clunky, as we have a call to the pipeline in the edit->preptool->check-cycle. particularly: Parens-PrepTool

VForms

Searches for a collection of vforms and allows user to verify (accept/reject). Offers two different patterns of matching vforms (presumably for old and new spelling) selectable via a check box.

XSLT Migration

Migration to XSLT should be possible. Challenge: how to handle user’s choice of pattern. The XSLT would have to be parametrized.

Parens

Finds balanced parens and wrongly oriented quotation marks. Also handles nested parens. Other than the usual prep tool this one just notifies the user of unbalanced parens and leaves the work up to the user to fix.

XSLT Migration

Unclear if this makes sense to migrate. While balanced parens could be inplemented the problem with nesting seems not such a great fit for XSLT.

Ordinal

Finds potential ordinal numbers and allows user to verify (accept/reject).

XSLT Migration

Migration to XSLT should be possible.

Roman

Finds potential roman numbers and allows user to verify (accept/reject).

XSLT Migration

Should be possible.

Measure

Finds potential measure numbers and allows user to verify (accept/reject).

XSLT Migration

Should be possible.

Abbreviation

Finds potential abbreviations and allows user to verify (accept/reject).

XSLT Migration

Might be a bit hairy due to the complicated regexp but should be possible.

Url/Email

Finds potential urls/emails and allows user to verify (accept/reject).

XSLT Migration

Should be possible.

Pagebreak

Finds faulty page breaks, i.e. page breaks that are the only element between two p, and allows user to verify (accept/reject) or to add a class attribute with value precedingemptyline. In other words this prep tool allows for three choices:

Leave as is
move inside p
move inside p and add class precedingemptyline

XSLT Migration

Implementation in XSLT should be possible, however, it is not clear at the moment how the choice of three options would be handled.

Accent

Finds accents and allows user to verify (accept/reject). The difference to other preptools: once a word has been marked with a certain choice, the same choice is applied globally to all occurrences of the word in the text. This functionality would need to be implemented in the generic preptool.

XSLT Migration

The point of this is that you only get asked once. For all the other prep tools you are queried for each instance where a potential change is found. However in the case of accents the user wants to change all occurences of a certain word. This could probably be marked up in XSLT but would most likely require a dedicated tool in the editor. Not just the plain yes/no GUI.

Future work

There is an effort underway within the DAISY pipeline 2 project to create a set of what they call pre-processing steps which have a similar goal: To enrich XML with additional markup to make it ready for Braille generation. These steps will probably be based on XML technologies, i.e. XSLT and XProc and will also interact with the user, e.g. to confirm an enrichment. How this interaction will be and if it could be integrated in an oXygen editor plugin remains to be seen.

Files

HACKING.org

Latest commit

History

HACKING.org

File metadata and controls

Overview

Why edit the text as opposed to the document tree

Class Overview

PrepToolsPluginExtension.java

OxygenEditGrouper.java

DocumentMetaInfo.java

PrepToolsValidatorPluginExtension.java

AbstractPrepToolAction.java

PrepTool.java

PrepToolLoader.java

TrafficLight.java

VFormActionHelper.java

DocumentUtils.java

FileUtils.java

MarkupUtil.java

Match.java

PositionMatch.java

MetaUtils.java

ParensUtil.java

PropsUtils.java

RegionSkipper.java

TextUtils.java

VFormUtil.java

External Dependencies

Individual PrepTools

XSLT Migration

Pros:

Cons:

VForms

XSLT Migration

Parens

XSLT Migration

Ordinal

XSLT Migration

Roman

XSLT Migration

Measure

XSLT Migration

Abbreviation

XSLT Migration

Url/Email

XSLT Migration

Pagebreak

XSLT Migration

Accent

XSLT Migration

Future work