Once a valid task definition is available, annotators can start an annotation task by loading the definition with [File] > [New Task Definition].
In the bottom half of the interface, MAE generates tables corresponding to the tags defined in the DTD.
You'll notice that each extent tag type has been assigned a color.
These colors are randomly generated at load time and automatically assigned to visualize extent tags on the primary text for annotators.
Annotators can turn these colors on or off by clicking the color indicators, or (from v2.1) change a color by right-clicking.
You'll also notice an extra table, All Extents.
This table lists all extent tags that have been marked up; its uses are discussed below.
On the link-tag side, instead of using colors, MAE italicizes the text spans of the extent tags (arguments) related to each link tag type. This can also be toggled, using the checkbox on the tags.
MAE can load either a plain text file as the primary document for annotation or an existing XML annotation file.
To load a document, use [File] > [Open Document].
An existing XML file can be opened as annotation work only when its root node matches the name of the currently loaded task definition. Otherwise MAE opens the XML as plain text. This brings up two important points:
- MAE accepts any XML file with a matching task name as valid annotation work, even if the actual tag instances are invalid because of revisions to the specification or for some other reason. This WILL cause errors while running the program.
- If, for any reason, someone needs to open old annotation work with a newer DTD, the old files must be modified manually so that not only the task name but, more importantly, the contents of the annotation instances match the new DTD.
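The root-node check described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not MAE's own code; the function name `root_matches_task` and the task name `NounVerbTask` are hypothetical.

```python
import xml.etree.ElementTree as ET


def root_matches_task(xml_text: str, task_name: str) -> bool:
    """Return True if xml_text parses as XML and its root element is task_name."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML: a tool would have to treat it as plain text.
        return False
    return root.tag == task_name


# Hypothetical examples: only the root element name is compared.
matching = root_matches_task("<NounVerbTask><TEXT>hi</TEXT></NounVerbTask>", "NounVerbTask")
mismatch = root_matches_task("<OtherTask/>", "NounVerbTask")
```

A `True` result only means the task name matches. It says nothing about whether the tag instances inside the file are still valid against the current DTD, which is exactly the risk noted in the two points above.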
Also, to handle Unicode properly, MAE decodes all files as UTF-8. Make sure each file is encoded in UTF-8.
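One way to check a file before opening it is to try decoding its bytes as UTF-8. The Python sketch below illustrates that idea; the helper names are hypothetical and none of this is part of MAE. Re-encoding assumes the original encoding is already known, since it cannot be detected reliably.

```python
from typing import Optional


def decode_utf8_or_none(data: bytes) -> Optional[str]:
    """Return the decoded text if data is valid UTF-8, otherwise None."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def reencode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding into UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Example: Latin-1 bytes for "caf\u00e9" are not valid UTF-8 until re-encoded.
latin1_bytes = "caf\u00e9".encode("latin-1")
utf8_bytes = reencode_to_utf8(latin1_bytes, "latin-1")
```

To apply this to a real file, read it in binary mode (`open(path, "rb")`) and pass the bytes to `decode_utf8_or_none`.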
The document is shown in the top half of the interface.
To create a consuming tag, first select a text span with the mouse cursor. With the text span highlighted, right-click on the text to open a context menu of possible entity tag types. Selecting the desired type from the menu creates a tag instance, which immediately populates the corresponding table at the bottom.
MAE automatically generates the id value and fills in spans and text.
To create a consuming tag with a discontiguous text span, annotators must use a special mode, enabled by [Mode] > [Switch to discontiguous span selection mode].
In this "multi-span" mode, annotators can select text spans discontiguously, until they right-click to create a tag.
Span selection can be undone or cleared out using the context menu.
Creating a tag in multi-span mode makes MAE return to normal mode.
Alternatively, use [Mode] > [Return to normal mode] to exit the mode.
To create a non-consuming tag, annotators can use the [Tags] menu or the context menu on the document.
Non-consuming tags are mark-ups for hidden or omitted entities that are not anchored to any text span. Recall that not all extent tag types can be non-consuming; the task definition determines which tag types can be.
As with non-consuming tags, annotators can create a link tag using the [Tags] menu or the context menu.
However, link tags created this way have no arguments linked, so annotators need to associate arguments manually, one by one.
Alternatively, annotators can use another dedicated mode to create a link tag with its arguments specified. The "argument selection" mode is enabled by [Mode] > [Switch to argument selection mode].
In this mode, annotators select arguments by clicking on the text spans on which the arguments are anchored. Once they have selected enough extent tags, calling the context menu brings up the link creation menu, and selecting an item there opens the link creation dialog. Within the link creation dialog, annotators can specify arguments before they confirm the creation of a new link tag.
As in the "multi-span" mode, selections can be undone or cleared, the mode can be exited from the [Mode] menu, and creating a link tag automatically returns MAE to normal mode.
Annotators can select multiple rows from the table. The tables are also sortable column by column.
So, by using the All Extents table, an annotator can select multiple extent tags of different types: holding ctrl (on Windows and GNU/Linux) or ⌘/cmd (on Mac OS X) lets them select row by row.
When the selection is done, calling the context menu from the table brings up the link creation menu, just as in the "argument selection" mode.
When a new tag is created, MAE assigns default values to its attributes, if any are specified in the DTD. Annotators can then annotate attribute values using the table at the bottom. If an attribute has a closed set of possible values, MAE uses a drop-down menu for the attribute so that annotators can select from the valid values.
Tags to edit can be selected in two ways:
- select the text span on which the tags to edit are anchored (which means non-consuming and link tags cannot be selected this way) or
- select the rows corresponding to the tags to edit in the table.
Annotators can use the bottom table to edit any attribute value, except for the id attribute and any text attributes (text for extent tags, xxxText for link tags).
id attribute values are automatically generated when tags are created, to avoid duplicate ids.
text attributes are automatically updated whenever the span of the tag is updated.
Double-clicking a cell starts editing its value, if the cell is editable.
Otherwise the corresponding tag is highlighted in the document area.
(Note that the All Extents table is not editable at all.)
Annotators need to pay special attention when editing the spans attribute, since the value must be in a specific format.
Use an ASCII tilde (~) to connect the start and end offsets of a single span, and use a comma (,) to delimit discontiguous spans.
Also note that start offsets are inclusive, while end offsets are exclusive (e.g. 3~4 indicates the single character at offset 3).
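To make the format concrete, here is a small Python sketch that parses a spans value and pulls out the matching pieces of text. It follows only the rules stated above (tilde between offsets, comma between spans, inclusive start, exclusive end). The function names are hypothetical, and how MAE itself joins the pieces of a discontiguous span into the text attribute is not covered.

```python
from typing import List, Tuple


def parse_spans(spans: str) -> List[Tuple[int, int]]:
    """Parse a value such as '0~5,12~17' into [(0, 5), (12, 17)]."""
    result = []
    for part in spans.split(","):
        start, end = part.split("~")
        result.append((int(start), int(end)))
    return result


def span_pieces(document: str, spans: str) -> List[str]:
    """Return the text covered by each span (start inclusive, end exclusive)."""
    return [document[start:end] for start, end in parse_spans(spans)]


# Hypothetical document text used only for illustration.
doc = "Hello brave world"
single = span_pieces("abcdef", "3~4")   # one character, at offset 3
multi = span_pieces(doc, "0~5,12~17")   # a discontiguous span in two pieces
```

Because the end offset is exclusive, the offsets map directly onto Python slices, and the length of a span is simply end minus start.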
To manually set an argument of a link tag after it is created, annotators can
- directly edit the xxxId attribute value (MAE will validate that it is a valid ID) or
- use the context menu when an extent tag is selected, either in the text area or in the table.
The latter brings up a dialog to specify which link tag takes the extent tag, and as which argument.
Tags can be deleted only through the context menu, after selecting the desired tags either in the table or in the text.
MAE shows an asterisk (*) in front of the file name if there are unsaved changes.
Also, if a user tries to close the annotation work without saving changes, MAE alerts them.
However, MAE currently does not support auto-save or recovery, so save frequently.
Take a look at this [sample XML](https://github.com/keighrim/mae-annotation/blob/master/samples/miller.xml) for a full-fledged annotation sample saved from MAE.