Semantic DOM (SDOM) API & Search API #87

Open
svanteschubert opened this issue Apr 6, 2021 · 1 comment
Assignees
Labels
help wanted Extra attention is needed

Comments

@svanteschubert
Contributor

A high-level API is needed that abstracts from the implementation details of the XML DOM, as the Simple API and the org.odftoolkit.odfdom.doc API do, but that is also compatible with the new change (collaboration) approach:
https://tdf.github.io/odftoolkit/odfdom/operations/operations.html
(Every operation has to be mappable to one or more API calls, ideally in a generic way of naming and finding these methods.)
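
One possible reading of "a generic way of naming & finding these methods" is to name each API method exactly like its operation and look it up by reflection. The following is only a minimal sketch of that idea: `SketchDocumentApi`, `OperationDispatcher`, the operation names and the `start`/`text` argument keys are all hypothetical and not part of ODFDOM.

```java
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical high-level API: each public method is named like one operation. */
class SketchDocumentApi {
    /** Records the calls so the effect of a dispatched operation is observable. */
    final StringBuilder log = new StringBuilder();

    public void insertParagraph(Map<String, Object> args) {
        log.append("paragraph@").append(args.get("start")).append(';');
    }

    public void insertText(Map<String, Object> args) {
        log.append("text@").append(args.get("start"))
           .append('=').append(args.get("text")).append(';');
    }
}

/** Maps an operation name generically onto the API method of the same name. */
class OperationDispatcher {
    static void apply(Object api, String operationName, Map<String, Object> args) {
        try {
            // Naming convention (an assumption): operation name == method name.
            Method method = api.getClass().getMethod(operationName, Map.class);
            method.invoke(api, args);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                "No API method for operation: " + operationName, e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Operation failed: " + operationName, e);
        }
    }

    public static void main(String[] unused) {
        SketchDocumentApi api = new SketchDocumentApi();
        Map<String, Object> args = new LinkedHashMap<>();
        args.put("start", "[0]");
        apply(api, "insertParagraph", args);
        System.out.println(api.log);
    }
}
```

With such a convention, supporting a new operation would only mean adding one method; an operation without a matching method is rejected explicitly instead of being silently ignored.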

This new operation/change approach allows every ODT document to be transformed into an equivalent list of user operations/changes, as if the user had just created the document from top to bottom. Any ODFDOM user of the latest repository (or BETA) can currently create such a change list in JSON by calling the JAR with JDK >= 9 (tested with JDK 11):

java -jar odfdom-java-1.0.0-BETA1-jar-with-dependencies.jar <USER'S ODT>

An example can also be found here: [example](https://github.com/tdf/odftoolkit/blob/master/docs/docs/presentations/character-styles.odt) and JSON.

The reason for, and advantage of, switching from a final state (the zipped ODT document) to a more fine-grained user-change concept is being able to answer the most important question of collaboration, which is also the prerequisite for merging and synchronizing: "What have you changed?".

On top of this higher-level SDOM API there will additionally be a Search API to query the content of one or more documents, e.g.

  • return all visible text (to be translated)
  • return a reference to all tables
  • return the changes made within a period of time (e.g. during my vacation)
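
The three queries above could be sketched as simple filters over a change list. This is only an illustration under assumptions: the `Change` class, its fields (in particular the timestamp, which the issue does not specify) and the operation names are hypothetical, not the actual ODFDOM model.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

/** One entry of the change list; all fields are assumptions for illustration. */
final class Change {
    final String name;  // illustrative operation name, e.g. "insertText"
    final String text;  // visible text carried by the change, or null
    final Instant when; // assumed timestamp of the change

    Change(String name, String text, Instant when) {
        this.name = name;
        this.text = text;
        this.when = when;
    }
}

/** Hypothetical Search API expressed as queries over a list of changes. */
final class ChangeSearch {
    /** Returns all visible text, one line per text-carrying change. */
    static String visibleText(List<Change> changes) {
        return changes.stream()
                .filter(c -> c.text != null)
                .map(c -> c.text)
                .collect(Collectors.joining("\n"));
    }

    /** Returns all changes with the given operation name, e.g. all tables. */
    static List<Change> byName(List<Change> changes, String name) {
        return changes.stream()
                .filter(c -> c.name.equals(name))
                .collect(Collectors.toList());
    }

    /** Returns the changes made in the half-open interval [from, to). */
    static List<Change> between(List<Change> changes, Instant from, Instant to) {
        return changes.stream()
                .filter(c -> !c.when.isBefore(from) && c.when.isBefore(to))
                .collect(Collectors.toList());
    }
}
```

Because every query works on the same flat list, searching several documents would amount to running the same filters over each document's change list.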
@svanteschubert
Contributor Author

Every ODF user is aware of semantic entities such as a table, paragraph, image, character, etc.
Each of these semantic pieces known to users consists of more than one XML piece (i.e. XML node) described by the ODF XML grammar. In other words, the XML nodes described by the ODF XML grammar can be abstracted into larger puzzle pieces, which are already known to end users and, in general, exist in any rich-text file format.

Therefore, the idea is to define these semantic puzzle pieces on top of the ODF grammar.

First comes the pattern that identifies the beginning of a new semantic entity, e.g. the XML table:table element for the start of a table.
With this declarative approach, the aim is to generate a SAX parser that transforms an ODT into a sequence of equivalent changes. In general, the XML grammar is transformed into a set of method calls representing the possible user changes.
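
To illustrate the declarative idea, the start patterns can be kept as plain data and consumed by one generic SAX handler. This is a hand-written sketch, not generated code and not the existing ODFDOM parser; `PatternDrivenParser`, the `START_PATTERNS` table and the operation names in it are assumptions.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Generic SAX handler driven by a declarative table of start patterns. */
class PatternDrivenParser {
    /** Declarative part: start element (qualified name) -> operation name (illustrative). */
    static final Map<String, String> START_PATTERNS = Map.of(
            "table:table", "insertTable",
            "text:p", "insertParagraph");

    /** Returns the operation names in document order for the given XML. */
    static List<String> toOperations(String xml) {
        List<String> operations = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // The handler knows no ODF feature; it only consults the table.
                String operation = START_PATTERNS.get(qName);
                if (operation != null) {
                    operations.add(operation);
                }
            }
        };
        try {
            // The default factory is not namespace-aware, so qName is the
            // prefixed name exactly as written in the document.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
        return operations;
    }
}
```

Matching on the prefixed name keeps the sketch short; a real implementation would presumably match on namespace URI and local name, and would also carry positions and attributes into each operation.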
But there are more user changes, such as deletions or some modifications (e.g. insertColumn), that will never be created when transforming an existing ODT document into a list of changes. For this reason, some subsets of the XML have to be modifiable by operations, for instance "insertColumn()" on a table.
The idea is to define the XML change pattern of "insertColumn()" declaratively on top of the ODF XML grammar and to generate source code from it that performs the transformation from operations to ODT.
In the end, a bi-directional transformation from ODT to operations and back should be possible.
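
For the direction from operations back to XML, the change pattern of "insertColumn()" roughly means: add one table:table-column declaration and one table:table-cell per row at the same index. The sketch below writes that pattern by hand against the standard W3C DOM rather than generating it; `InsertColumnSketch` is hypothetical, and it deliberately ignores repeated columns and cells, covered cells, header rows and nested tables.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hand-written version of the XML change pattern behind insertColumn(). */
class InsertColumnSketch {
    static final String TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0";

    /** Parses XML into a namespace-aware DOM document. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    /** Adds a column declaration and one empty cell per row at the given index. */
    static void insertColumn(Element table, int index) {
        Document doc = table.getOwnerDocument();
        insertAt(table, "table-column", index,
                doc.createElementNS(TABLE_NS, "table:table-column"));
        // Simplification: this also visits rows of nested tables.
        NodeList rows = table.getElementsByTagNameNS(TABLE_NS, "table-row");
        for (int i = 0; i < rows.getLength(); i++) {
            insertAt((Element) rows.item(i), "table-cell", index,
                    doc.createElementNS(TABLE_NS, "table:table-cell"));
        }
    }

    /** Inserts newChild before the index-th matching child, else after the last one. */
    private static void insertAt(Element parent, String localName,
                                 int index, Element newChild) {
        int seen = 0;
        Node last = null;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            boolean matches = n.getNodeType() == Node.ELEMENT_NODE
                    && TABLE_NS.equals(n.getNamespaceURI())
                    && localName.equals(n.getLocalName());
            if (!matches) {
                continue;
            }
            if (seen == index) {
                parent.insertBefore(newChild, n);
                return;
            }
            seen++;
            last = n;
        }
        // Index beyond the existing siblings: keep the new node next to them.
        Node reference = (last != null) ? last.getNextSibling() : parent.getFirstChild();
        parent.insertBefore(newChild, reference);
    }
}
```

A declarative description of this pattern would need to capture the same two facts (which elements are added, and at which relative position) so that equivalent code could be generated for any target language.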

My goal is to replace the existing manually written feature-spaghetti code (every ODF feature in the same single SAX parser) within odfdom/src/main/java/org/odftoolkit/odfdom/changes with a generic version that keeps maintenance feasible in the future, even for multiple programming languages (generating Java first, but generating e.g. C++ or other languages such as Rust should be possible based on the same declarative approach).

@svanteschubert svanteschubert self-assigned this Apr 7, 2021
@svanteschubert svanteschubert added the help wanted Extra attention is needed label Apr 7, 2021