diff --git a/README.md b/README.md
index 444bbda..4266c01 100644
--- a/README.md
+++ b/README.md
@@ -32,10 +32,18 @@ Enabling these lightweight, transparent and declarative "logical layers" written
- [SchXSLT][schxslt] - ISO Schematron / community enhancements
- [XSpec][xspec] - XSpec - XSLT/XQuery unit testing
+As an alternative to Morgana, users are also invited to test [XML Calabash 3][xmlcalabash3]. At time of writing, this release is too new to be incorporated into the project, but appears promising as an alternative platform for everything demonstrated here.
+
These are open-source projects in support of W3C- and ISO-standardized technologies. Helping to install, configure, and make these work seamlessly, so users do not have to notice, is a goal of this project.
If this software is as easy, securable and performant as we hope to show, it might be useful not only to XML-stack developers but also to others who wish to cross-check their OSCAL data or software supporting OSCAL by comparison with another stack.
+### XProc testbed
+
+XProc developers, similarly, may be interested in this project as a testbed for performance and conformance testing.
+
+This deployment is also intended to demonstrate conformance to relevant standards and external specifications, not just to APIs and interfaces defined by tool sets.
+
### Projects -- current and conceived
See the [Projects folder](./projects/) for current projects. Projects now planned for deployment in this repository include:
@@ -48,18 +56,22 @@ See the [Projects folder](./projects/) for current projects. Projects now planne
- Find and demonstrate modeling or conformance issues in schemas or processors
- Conversely, demonstrate conformance of validators and design of models
- Showcase differences between valid and invalid documents, especially edge cases
- - [`oscal-import`](projects/oscal-import/) - produce OSCAL from PDF via HTML and NIST STS formats - a demonstration showing conversion of a 'high-touch' document into OSCAL, mapping its structures
+ - [`cprt-import`](projects/cprt-import/) - produce OSCAL from a raw JSON feed (not OSCAL) - demonstrating conversion of NIST CPRT [NIST SP 800-171](https://csrc.nist.gov/projects/cprt/catalog#/cprt/framework/version/SP_800_171_3_0_0/home) into OSCAL
+ - [`FM6-22-import`](projects/FM6-22-import/) - produce OSCAL from PDF via HTML and NIST STS formats - a demonstration showing conversion of a 'high-touch' document into OSCAL, namely US Army Field Manual 6-22 Chapter 4 "Developing Leadership", mapping its structures into STS and OSCAL formats
- `batch-validate` - validate OSCAL in batches against schemas and schema emulators
- `index-oscal` - produce indexes to information encoded in OSCAL
TODO: update this list
+
READERS: [anything to add?][repo-issues]
Applications in this repository may occasionally have general use outside OSCAL; users who believe any of its capabilities should be generalized and published separately are invited to [create a Github Issue][repo-issues].
### Organization
-Folders outside `projects` including `lib`, `smoketest`, `project-template`, `testing`, `icons` and (hidden) `.github` folders serve the repository as a whole; specific applications are all to be found among [projects](./projects).
+Folders outside `projects` including `lib`, `smoketest`, `project-template`, `testing`, `icons` and (hidden) `.github` folders serve the repository as a whole; specific applications are all to be found among [projects](./projects/).
+
+An exception is the [tutorial](./tutorial/), which is a project in its own right but also uses the other projects as its source, so it is kept apart from them as a "global" project.
[The `lib` directory](./lib) comes bare bones - it has only its readme, a configuration file and a couple of utility pipelines. This library is populated by the [installation script](./setup.sh), and (once the basic setup is done) by running the pipelines.
@@ -87,7 +99,7 @@ The software in this repository is at varying levels of maturity. Many styleshee
At the same time, the libraries we use (Morgana, Saxon and others) are themselves at various levels of maturity (Saxon in particular having been field-tested for over 20 years). And both particular initiatives and the code repository as a whole follow an incremental development model. Things left as good-enough-for-now are regarded as being good enough, until experience shows us it is no longer so. Punctuated equilibrium is normal. New contrivances are made of old and reliable parts.
-Assume the worst, hope for the best, and test.
+*Assume the worst, hope for the best, and test.*
Cloning the repository is encouraged and taken as a sign of success. So is any participation in testing and development activities.
@@ -119,7 +131,7 @@ Assuming 'TODO' items are addressed and these markers disappear, the git history
Innovations
-As of mid-2024, we believe some aspects of this initiative are innovative or unusual, even as it stands on foundations laid by others. Please let us know of relevant prior art, or independent invention, especially if it anticipates the work here.
+As of mid-2024, we believe some aspects of this initiative are innovative or unusual, even as it stands on foundations laid by others. Please let us know of relevant prior art, or independent invention, especially if it anticipates the work here. We hope some of these applications are "obvious" and, at least in conception, not as new as we think.
#### Pipelines for “self setup”
@@ -165,6 +177,8 @@ This makes cloning and further development easier.
## Where to start
+One way to start is to dive into the [Tutorial](tutorial/readme.md). This introduction to XProc does not assume prior XML expertise, only a willingness to learn.
+
OSCAL developers
@@ -196,7 +210,9 @@ An [XProc tutorial](tutorial/sequence/lesson-sequence.md) is offered on this sit
### Installation instructions
-Note: if you already have Morgana XProc III installed, you should be able to use it, appropriately configured, to run any pipeline in the repository. But local installation is also easy and clean.
+These instructions are needed only if you do not already have an XProc 3 engine such as Morgana or XML Calabash. If you already have XProc 3 support, consider using your available tooling, instead of or in addition to the runtime offered here.
+
+(Any bugs you find in doing so can be addressed and the entire repository "hardened" thereby -- one of the beneficial network effects of multiple implementations of a standard.)
*Platform requirements*: Java, with a `bash` shell for automated installation. Only Java is required if you can install manually.
@@ -280,11 +296,11 @@ See the [projects/](./projects/) directory with a list of projects - each should
Or jump to these projects:
+- [XProc Tutorial](tutorial/readme.md) provides step-by-step instructions and play-by-play commentary.
- [Schema Field Tests](./schema-field-tests) - Testing whether OSCAL schemas correctly enforce rules over data (with surprises)
- [OSCAL Profile Resolution](./profile-resolution) - converting an OSCAL profile (representing a baseline or overlay) into its catalog of controls
-- [./projects/oscal-import/](./projects/oscal-import/) - Produce OSCAL from a PDF source via HTML and XML conversions
+- Produce OSCAL from other data formats: from raw JSON source in [CPRT import](projects/cprt-import/); or from PDF source via HTML and XML conversions in [FM6-22 import](projects/FM6-22-import/)
-
Any XProc3 pipeline can be executed using the script `xp3.sh` (`bash`) or `xp3.bat` (Windows CMD). For example:
```bash
@@ -309,14 +325,14 @@ See the [House Rules](./house-rules.md) for more information.
Drag and drop (Windows only)
-Optionally, Windows users can use a batch file command interface, with drag-and-drop functionality in the GUI (graphical user interface, your 'Desktop').
+[Optionally, Windows users can use a batch file command interface](https://github.com/usnistgov/oscal-xproc3/discussions/18), with drag-and-drop functionality in the GUI (graphical user interface, your 'Desktop').
In the File Explorer, try dragging an icon for an XPL file onto the icon for `xp3.bat`. (Tip: choose a pipeline whose name is in all capitals, as in 'ALL-CAPS.xpl' — explanation below.)
Gild the lily by creating a Windows shortcut to the 'bat' file. This link can be placed on your Desktop or in another folder, ready to run any pipelines that happen to be dropped onto it. Renaming the shortcut and changing its icon are also options. Some icons for this purpose are provided [in the repository](./icons/).
-TODO: Develop and test [./xp3.sh](./xp3.sh) so it too offers this or equivalent functionality on \*nix or Mac platforms - AppleScript! - lettuce know 🥬 if you want or can do this
-
+TODO: Develop and test [./xp3.sh](./xp3.sh) (or scripts to come) so it too offers this or equivalent functionality on \*nix or Mac platforms - AppleScript! - lettuce know 🥬 if you want or can do this
+
## Testing
@@ -375,11 +391,23 @@ Morgana and Saxon both require Java, as detailed on their support pages. SchXSLT
See [THIRD_PARTY_LICENSES.md](./THIRD_PARTY_LICENSES.md) for more.
+As noted above, however, all software here also conforms to relevant openly published language specifications, and should deliver the same results, verifiably, using other software that follows the same specifications, including XProc and XSLT processors yet to be developed.
+
XProc 3.0 aims to be platform- and application-independent, so one use of this project will be to test and assess portability across environments supporting XProc.
## XProc platform acknowledgements
-With the authors of incorporated tooling, the many contributors to the XProc and XML stacks underlying this functionality are owed thanks and acknowledgement. These include Norman Walsh, Achim Berndzen and the developers of XProc versions 1.0 and 3.0; developers of embedded commodity parsers and processers such as Java Xerces, Trang, and Apache FOP (to mention only three); and all developers of XML, XSLT, and XQuery especially unencumbered and open-source. Only an open, dedicated and supportive community could prove capable of such a collective achievement.
+With the authors of incorporated tooling, the many contributors to the XProc and XML stacks underlying this functionality are owed thanks and acknowledgement. These include
+
+- [Henry Thompson](https://www.xml.com/pub/a/ws/2001/02/21/devcon1.html) and other pioneers of XML pipelining on a standards basis
+- Norman Walsh
+- Norm's fellow committee members and developers of XProc versions 1.0 and 3.0
+- Developers of embedded commodity parsers and processors such as Java Xerces, Trang, and Apache FOP (to mention only three)
+- All developers of XML, XSLT, and XQuery technologies and applications, especially unencumbered and open-source
+
+Only an open, dedicated and supportive community could prove capable of such a collective achievement.
+
+This work is dedicated to the memory of Michael Sperberg-McQueen and to all his students, past and future.
---
@@ -400,8 +428,8 @@ This README was composed starting from the [NIST Open Source Repository template
[oscal-xslt]: https://github.com/usnistgov/oscal-xslt
[oscal-cli]: https://github.com/usnistgov/oscal-cli
[xslt3-functions]: https://github.com/usnistgov/xslt3-functions
-
[xdm3]: https://www.w3.org/TR/xpath-datamodel/
+[xmlcalabash3]: https://github.com/xmlcalabash/xmlcalabash3
[xslt3]: https://www.w3.org/TR/xslt-30/
[xproc]: https://xproc.org/
[xproc-specs]: https://xproc.org/specifications.html
diff --git a/projects/FM6-22-import/PRODUCE_FM6-22-chapter4.xpl b/projects/FM6-22-import/PRODUCE_FM6-22-chapter4.xpl
index a89009c..bd23864 100644
--- a/projects/FM6-22-import/PRODUCE_FM6-22-chapter4.xpl
+++ b/projects/FM6-22-import/PRODUCE_FM6-22-chapter4.xpl
@@ -29,8 +29,9 @@
for demonstration or diagnostics -->
-
-
+
diff --git a/tutorial/PRODUCE-TUTORIAL-PREVIEW.xpl b/tutorial/PRODUCE-TUTORIAL-PREVIEW.xpl
index 36d1cc2..095151f 100644
--- a/tutorial/PRODUCE-TUTORIAL-PREVIEW.xpl
+++ b/tutorial/PRODUCE-TUTORIAL-PREVIEW.xpl
@@ -140,7 +140,7 @@ tr:nth-child(even) { background-color: gainsboro }
th { width: clamp(10em, auto, 40em) }
td { width: clamp(10em, auto, 40em); border-top: thin solid grey }
-section.unit { width: clamp(45ch, 50%, 75ch); padding: 0.8em; outline: thin solid black; margin: 0.6em 0em }
+section.unit { width: clamp(45ch, 100%, 75ch); padding: 0.8em; outline: thin solid black; margin: 0.6em 0em }
section.unit h1:first-child { margin-top: 0em }
.observer { background-color: honeydew ; grid-column: 2 }
.maker { background-color: seashell ; grid-column: 3 }
@@ -160,19 +160,22 @@ span.wordcount.over { color: darkred }
-
+
+
+
+
-
+
-
+
+
diff --git a/tutorial/punchlist.md b/tutorial/punchlist.md
index c714946..f2fa6fc 100644
--- a/tutorial/punchlist.md
+++ b/tutorial/punchlist.md
@@ -24,8 +24,8 @@ To add to the production pipeline, edit PRODUCE-TUTORIAL-MARKDOWN.xpl
- review phase:
- Commends Day? (week?) go through all the comments
consider factoring out into p:documentation / tooling
- - 101 sequence is inspection and observation (only)
- - 102 sequence is hands-on
+ - Observer sequence is inspection and observation (only)
+ - Maker sequence is hands-on
- all 'Goals' in sequence, all 'Resources' in sequence, etc
- where can we default e.g. `with-input` in place of `with-input[@port='source']` ? test all these ...
- Review and normalize usage of 'i', 'b', 'em' and other inline elements?
@@ -337,5 +337,20 @@ Note - in some places there may be 'road work' going on
Here we should start with a proposed visiting order?
+### XProc Synopsis
+
+- Input ports bound - `p:document` | `p:inline`
+  - top-level
+  - per step
+  - inlines
+- Output ports defined
+- Options defined
+- Imports
+
+At a glance:
+- all load and document/@href
+- all store/@href
+
+
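+A sketch of the shape such a synopsis would summarize – a minimal, hypothetical pipeline (file and option names invented):
+
+```xml
+<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
+   <!-- import of another pipeline -->
+   <p:import href="../lib/a-library.xpl"/>
+   <!-- input port bound at the top level with p:document (p:inline also possible) -->
+   <p:input port="source">
+      <p:document href="in/source.xml"/>
+   </p:input>
+   <!-- output port defined -->
+   <p:output port="result"/>
+   <!-- option defined -->
+   <p:option name="verbose" select="false()"/>
+   <!-- store/@href - writing a result away -->
+   <p:store href="out/result.xml"/>
+</p:declare-step>
+```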
diff --git a/tutorial/readme.md b/tutorial/readme.md
index d8ab1e4..7b2dcbd 100644
--- a/tutorial/readme.md
+++ b/tutorial/readme.md
@@ -4,7 +4,7 @@ This is work in progress towards an XProc 3.0 (and 3.1) tutorial or set of tutor
Coverage here is not a substitute for project documentation - the tutorial relies on projects in the repo for its treatments - but an adjunct to it for beginners and new users who wish for guidance and information on XProc that they are not likely to find for themselves.
-In its current form, only introductory materials are offered. The framework is easily extensible to cover more topics, and an XProc-based tutorial production system is part of the demonstration.
+In its current form, only the first introductory materials are offered. The framework is easily extensible to cover more topics, and an XProc-based tutorial production system is part of the demonstration. But the approach needs to be tested more before it is extended.
Tutorial exercises can be centered on OSCAL-oriented applications but the learning focus will be XProc 3.0/3.1.
@@ -20,7 +20,7 @@ Follow the tutorial by reading the files published in the repository, or by copy
First and foremost this is a "practicum" or *hands-on* introduction that encourages readers not only to follow along, but to try things out, practice and learn by interactive observation.
-Otherwise, the tutorial is designed to support multiple different approaches suitable for different learners and needs - both learning styles, and use cases ("user stories") as described below. Develop an approach that works for you by moving at your own speed and skipping, skimming or delving more deeply into topics and problems of interest.
+Otherwise, the tutorial is designed to support multiple different approaches suitable for different learners and needs - both learning styles and goals as described below. Develop an approach that works for your case by moving at your own speed and skipping, skimming or delving more deeply into topics and problems of interest.
Each topic ("Lesson") in a sequence offers a set of Lesson Units around a common problem area or theme, leveraging projects in the repository to provide problems and solutions with working pipelines to run and analyze.
@@ -53,9 +53,9 @@ To enable readers to cater to their own needs, the tutorial offers these **track
Since the different tracks are arranged along the same topics, the treatments are also suitable for groups who wish to work any or all tracks collaboratively.
-If you want a no-code experience, skip the Maker track and skim the Observer track, but do not skip looking at the code base, accepting that much will remain mysterious.
+If you want a no-code experience, read the Learner track, skip the Maker track and skim the Observer track. Keep in mind that you might have to run pipelines, if only to see their outputs.
-If security concerns preclude you from running locally, post us an Issue and the dev team will investigate options including a container-based distribution of some nature. The beauty and simplicity of 'bare bones' however is what recommends it to us and you.
+If for any reason you can't run XProc or Java, post us an Issue and the dev team will investigate options including a container-based distribution of some nature. The simplicity of 'bare bones' however recommends it to us and you.
### Observer Track
@@ -83,7 +83,7 @@ If you are a tactile learner with no patience for reading, you can skim through
In parallel with the other two tracks, the Learner track offers all readers more explanation and commentary, in greater depth and with more links.
-Note that the Learner track represents the views of one still learning, so it is subject to change and refinement - most especially if you find things in it that are in need of clarification or correction.
+Note that the Learner track itself represents the views of one still learning, so it is subject to change and refinement - most especially if you find things in it that are in need of clarification or correction.
### Easter eggs
@@ -134,11 +134,11 @@ See the top-level pipelines for current capabilities. At time of writing:
[PRODUCE-TUTORIAL-MARKDOWN.xpl](PRODUCE-TUTORIAL-MARKDOWN.xpl) produces a set of Markdown files, writing them to the `sequence` directory.
-[PRODUCE-TUTORIAL-TOC.xpl]() produces the [Tutorial Table of Contents](sequence/lesson-sequence.md)
+[PRODUCE-TUTORIAL-TOC.xpl](PRODUCE-TUTORIAL-TOC.xpl) produces the [Tutorial Table of Contents](sequence/lesson-sequence.md)
[PRODUCE-TUTORIAL-PREVIEW.xpl](PRODUCE-TUTORIAL-PREVIEW.xpl) produces a single [preview tutorial on one HTML page](tutorial-preview.html)
-[PRODUCE-PROJECTS-ELEMENTLIST.xpl] produces an [index to XProc elements appearing in pipelines](sequence/element-directory.md) under discussion - read about it in the lessons
+[PRODUCE-PROJECTS-ELEMENTLIST.xpl](PRODUCE-PROJECTS-ELEMENTLIST.xpl) produces an [index to XProc elements appearing in pipelines](sequence/element-directory.md) under discussion - read about it in the lessons
# Leave your tracks
diff --git a/tutorial/sequence/Lesson01/acquire_101.md b/tutorial/sequence/Lesson01/acquire_101.md
index cb709dd..3019804 100644
--- a/tutorial/sequence/Lesson01/acquire_101.md
+++ b/tutorial/sequence/Lesson01/acquire_101.md
@@ -54,7 +54,7 @@ After reading and reviewing these documents, perform the setup on your system as
After running the setup script, or performing the installation by hand, make sure you can run all the smoke tests successfully.
-As noted in the docs, if you happen already to have [Morgana XProc III](https://www.xml-project.com/morganaxproc-iiise.html), you do not need to download it again. Try skipping straight to the smoke tests. You can use a runtime script `xp3.sh` or `xp3.bat` as a model for your own, and adjust. Any reasonably recent version of Morgana should function if configured correctly, and we are interested if it does not.
+As noted in the docs, if you happen already to have [Morgana XProc III](https://www.xml-project.com/morganaxproc-iiise.html), you do not need to download it again. Try skipping straight to the smoke tests. You can use a runtime script `xp3.sh` or `xp3.bat` as a model for your own, and adjust. Any reasonably recent version of Morgana should function if configured correctly, and we are interested if it does not.
### Shortcut
@@ -96,7 +96,7 @@ Such a script itself must be “vanilla” and generic: it simply invoke
### When running from a command line
-As simple examples, these scripts show only one way of running XProc. Keep in mind that even simple scripts can be used in more than one way.
+As simple examples, these scripts show only one way of running XProc. Keep in mind that even simple scripts can be used in more than one way.
For example, a pipeline can be executed from the project root:
@@ -104,13 +104,13 @@ For example, a pipeline can be executed from the project root:
$ ./xp3.sh smoketest/TEST-XPROC3.xpl
```
-Alternatively, a pipeline can be executed from its home directory, for example if currently in the `smoketest` directory (note the path to the script):
+Alternatively, a pipeline can be executed from its home directory, for example if currently in the `smoketest` directory (note the path to the script):
```
$ ../xp3.sh TEST-XPROC3.xpl
```
-This works the same ways on Windows, with adjustments:
+This works the same ways on Windows, with adjustments:
```
> ..\xp3 TEST-XPROC3.xpl
@@ -120,7 +120,7 @@ This works the same ways on Windows, with adjustments:
Windows users (and others to varying degrees) can set up a drag-and-drop based workflow – using your mouse or pointer, select an XProc pipeline file and drag it to a shortcut for the executable (Windows batch file). A command window opens to show the operation of the pipeline. See the [README](../../README.md) for more information.
-It is important to try things out since any of these methods can be the basis of a workflow.
+It is important to try things out since any of these methods can be the basis of a workflow.
For the big picture, keep in mind that while the command line is useful for development and demonstration – and however familiar XProc itself may become to the developer – to a great number of people it remains obscure, cryptic and intimidating if not forbidding. Make yourself comfortable at the command line!
diff --git a/tutorial/sequence/Lesson01/acquire_102.md b/tutorial/sequence/Lesson01/acquire_102.md
index fefefd5..8581cae 100644
--- a/tutorial/sequence/Lesson01/acquire_102.md
+++ b/tutorial/sequence/Lesson01/acquire_102.md
@@ -9,7 +9,7 @@
## Goals
* Look at some pipeline organization and syntax on the inside
-* Success and failure invoking XProc pipelines: an early chance to “learn to die” gracefully (to use the gamers' idiom).
+* Success and failure invoking XProc pipelines: making friends with tracebacks.
## Prerequisites
diff --git a/tutorial/sequence/Lesson01/acquire_599.md b/tutorial/sequence/Lesson01/acquire_599.md
index 6023df5..0d6b6ea 100644
--- a/tutorial/sequence/Lesson01/acquire_599.md
+++ b/tutorial/sequence/Lesson01/acquire_599.md
@@ -6,22 +6,36 @@
# 599: Meeting XProc
+## Goals
+
+Offer some more context; help reduce the intimidation factor.
+
+XProc is not a simple thing, but it offers a way in. The territory is vast, but the sky is always above us.
+
## Resources
[A Declarative Markup Bibliography](https://markupdeclaration.org/resources/bibliography) is available on line for future reference on this theoretical topic.
## Some observations
-Because it is now centered on *pipelines* as much as on files and software packages, dependency management is different from other technologies including Java and NodeJS – how so?
+Because it is now centered on *pipelines* as much as on files and software packages, dependency management when using XProc differs from that of other technologies including Java and NodeJS – how so?
MorganaXProc-III is implemented in Scala, and Saxon is built in Java, but otherwise distributions including the SchXSLT and XSpec distributions consist mainly of XSLT. This is either very good (with development and maintenance requirements in view), or not good at all.
-Which is it, and what are the determining variables that tell you XProc is a good fit? How much of this is due to the high-level, abstracted nature of [4GLs](https://en.wikipedia.org/wiki/Fourth-generation_programming_language) including both XSLT 3.1 and XProc 3.0? Prior experience with XML-based systems and the problem domains in which they work well is probably a factor. How much are the impediments technical, and how much are they due to culture?
+If not using Morgana but another XProc engine (at time of writing, XML Calabash 3 has been published in alpha), there will presumably be analogous arrangements: contracts between the tool and its dependencies, software or components and capabilities bundled and unbundled.
+
+So does this work well, on balance, and what are the determining variables that tell you XProc is a good fit for data processing, whether high touch, or at scale? How much of this is due to the high-level, abstracted nature of [4GLs](https://en.wikipedia.org/wiki/Fourth-generation_programming_language) including both XSLT 3.1 and XProc 3.0? Prior experience with XML-based systems and the problem domains in which they work well is probably a consideration. How much are the impediments technical, and how much are they due to culture and perceptions?
+
+Will it always be that a developer determined to use XSLT will find a way, whereas a developer determined not to will find a way to refuse it? XProc in 2024 seems slow in adoption – maybe because everyone who would want it already has a functional equivalent in place.
+
+This being said, going forward the principle remains that we gain an implicit advantage when we find ways of exploiting technology opportunities that our peers and competitors have decided to neglect. In essence, by leaving XML, XSLT and XProc off the table, developers who choose not to use it may actually be giving easy money to developers who are able to adopt and exploit this externality, where it works.
+
+It's all about the tools. Find ways to support your open-source developers and the software development operations that offer free tools and services.
## Declarative markup in action
Considerable care is taken in developing these demonstrations to see to it that the technologies on which we depend, notably XProc and XSLT but not limited to these, are both nominally and actually conformant to externally specified standard technologies, i.e. XProc and XSLT respectively (as well as others), and reliant to the greatest possible extent on well-documented and accessible runtimes.
-It is a tall order to ask that any code base should be both easy to integrate and use with others, and at the same time, functionally complete and self-sufficient. Of these two, we are lucky to get one, even if we are thoughtful enough to limit ourselves to building blocks. Because the world is complex, we are always throwing in one or another new dependency, along with new rule sets. The approach enabled by XML and openly-specified supporting specifications is to work by making everything transparent as possible. We seek for clarity and transparency at all levels (so nothing is downloaded behind the scenes, for example) while also documenting as thoroughly as we can, including with code comments.
+Is it too much to expect that any code base should be both easy to integrate and use with others, and at the same time, functionally complete and self-sufficient? Of these two, we are lucky to get one, even if we are thoughtful enough to limit ourselves to building blocks. Because the world is complex, we are always throwing in one or another new dependency, along with new rule sets. The approach enabled by XML and openly-specified supporting specifications is to work by making everything as transparent as possible. We seek clarity and transparency at all levels (so nothing is downloaded behind the scenes, for example) while also documenting as thoroughly as we can, including with code comments.
Can any code base be fully self-explanatory and self-disclosing? Doubtful, even assuming those terms are meaningful. But one can try and leave tracks and markers, at least. We call it “code” with the hope and intent that it should be amenable to and rewarding of interpretation.
diff --git a/tutorial/sequence/Lesson02/walkthrough_101.md b/tutorial/sequence/Lesson02/walkthrough_101.md
index 3f43867..517531c 100644
--- a/tutorial/sequence/Lesson02/walkthrough_101.md
+++ b/tutorial/sequence/Lesson02/walkthrough_101.md
@@ -73,7 +73,7 @@ Each of the test pipelines exercises a simple sequence of operations. Open any X
The aim here is demystification. Understand the parts to understand the whole. Reading the element names also inscribes them in memory circuits where they can be recovered.
-### TEST-XPROC3
+### [TEST-XPROC3](../../../smoketest/TEST-XPROC3.xpl)
Examine the pipeline [TEST-XPROC3.xpl](../../../smoketest/TEST-XPROC3.xpl). It breaks down as follows:
@@ -83,7 +83,7 @@ Examine the pipeline [TEST-XPROC3.xpl](../../../smoketest/TEST-XPROC3.xpl). It b
When you run this pipeline, the `CONGRATULATIONS` document given in line will be echoed to the console, where designated outputs will appear if not otherwise directed.
-### TEST-XSLT
+### [TEST-XSLT](../../../smoketest/TEST-XSLT.xpl)
[This pipeline](../../../smoketest/TEST-XSLT.xpl) executes a simple XSLT transformation, in order to test that XSLT transformations can be successfully executed.
@@ -97,7 +97,7 @@ If your pipeline execution can't process the XSLT (perhaps Saxon is not installe
Errors in XProc are reported by the Morgana engine using XML syntax. Among other things, this means they can be captured and processed in pipelines.
-### TEST-SCHEMATRON
+### [TEST-SCHEMATRON](../../../smoketest/TEST-SCHEMATRON.xpl)
Schematron is a language used to specify rules to apply to XML documents. In this case a small Schematron is applied to a small XML.
@@ -105,7 +105,7 @@ Schematron is a language used to specify rules to apply to XML documents. In thi
* `p:validate-with-schematron` – This is an XProc step specifically for evaluating an XML document against the rules of a given Schematron. Like the `TEST-XPROC3` and `TEST-XSLT` pipelines, this one presents its own input, as a literal XML document given in the pipeline document (using `p:inline`). A setting on this step provides for it to throw an error if the document does not conform to the rules. The Schematron file provided as input to this step, [src/doing-well.sch](../../../smoketest/src/doing-well.sch), gives the rules (see the sketch after this list). This flexible technology enables easy testing of XML against rule sets defined either for particular cases in particular workflows, or for entire classes or sets of documents.
* `p:namespace-delete` – This step is used here as in the other tests for final cleanup of the information produced.
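
For orientation, a Schematron can be quite small. The following sketch shows the general shape only – it is invented for illustration and is not the content of `doing-well.sch`:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
   <sch:pattern>
      <!-- a rule fires in the context of any matching element -->
      <sch:rule context="patient">
         <!-- an assertion reports whenever its test is NOT satisfied -->
         <sch:assert test="feeling = 'well'">A patient should be feeling well.</sch:assert>
      </sch:rule>
   </sch:pattern>
</sch:schema>
```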
-### TEST-XSPEC
+### [TEST-XSPEC](../../../smoketest/TEST-XSPEC.xpl)
[XSpec](https://github.com/xspec/xspec) is a testing framework for XSLT, XQuery and Schematron. It takes the form of a vocabulary and a process (inevitably implemented in XSLT and XQuery) for executing queries, transformations, and validations, by running them over known inputs, comparing the results to expected results, and reporting the results of this comparison. XProc, built to orchestrate manipulations of XML contents, is well suited for running XSpec.
diff --git a/tutorial/sequence/Lesson02/walkthrough_102.md b/tutorial/sequence/Lesson02/walkthrough_102.md
index 557a035..36d2317 100644
--- a/tutorial/sequence/Lesson02/walkthrough_102.md
+++ b/tutorial/sequence/Lesson02/walkthrough_102.md
@@ -6,8 +6,6 @@
# 102: XProc fundamentals
-
-
## Goals
* More familiarity with XProc 3.0, with more syntax
@@ -22,13 +20,10 @@ You have done [Setup 101](../acquire/acquire_101.md), [Setup 102](../acquire/acq
Take a quick look *now* (and a longer look later):
-This tutorial's handmade [XProc links page](../../xproc-links.md)
-
-Also, the official [XProc.org dashboard page](https://xproc.org)
-
-Also, check out XProc index materials produced in this repository: [XProc docs](../../../projects/xproc-doc/readme.md)
-
-And the same pipelines you ran in setup: [Setup 101](../acquire/acquire_101.md).
+* This tutorial's handmade [XProc links page](../../xproc-links.md)
+* Also, the official [XProc.org dashboard page](https://xproc.org)
+* If interested, check out XProc index materials produced in this repository: [XProc docs](../../../projects/xproc-doc/readme.md)
+* In any case, the same pipelines you ran in setup: [Setup 101](../acquire/acquire_101.md).
## Learning more about XProc
@@ -55,7 +50,7 @@ XProc pipelines described in [the previous lesson unit](walkthrough_101.md) cont
* Yes, those conventions are enforced in the repository by [a Schematron](../../../testing/xproc3-house-rules.sch) that can be applied to any pipeline, both in development and when it is committed to the repository under CI/CD (continuous integration / continuous development). Assuming we take care to run our tests and validations, this does most of the difficult work maintaining consistency, namely detecting the inconsistency.
* Reassuring messages aside, no XSpec reports are actually captured by this XProc! With nothing bound to an output port, it *sinks* by default. That is because it is a smoke test, and we care only to see that it runs and completes without error. The inputs are all controlled, so we know what those reports say. Or we can find out.
-### PRODUCE-PROJECTS-ELEMENTLIST.xpl
+### PRODUCE-PROJECTS-ELEMENTLIST
The pipeline [PRODUCE-PROJECTS-ELEMENTLIST.xpl](../../PRODUCE-PROJECTS-ELEMENTLIST.xpl) has “real-world complexity”. Reviewing its steps can give a sense of how XProc combines simple capabilities into complex operations. Notwithstanding the title of this section, it is not important to understand every detail – knowing they are there is enough.
@@ -89,7 +84,7 @@ For newcomers to XML coding – you can “comment out” code in any XM
Text
```
-becomes
+becomes:
```
diff --git a/tutorial/sequence/Lesson02/walkthrough_219.md b/tutorial/sequence/Lesson02/walkthrough_219.md
index 36c4372..dab08e3 100644
--- a/tutorial/sequence/Lesson02/walkthrough_219.md
+++ b/tutorial/sequence/Lesson02/walkthrough_219.md
@@ -18,9 +18,9 @@ More in depth.
The same pipelines you ran in setup: [Setup 101](../acquire/acquire_101.md).
-Also, [XProc.org dashboard page](https://xproc.org)
+Also, [XProc.org dashboard page](https://xproc.org).
-Also, XProc index materials produced in this repository: [XProc docs](../../../projects/xproc-doc/readme.md)
+Also, XProc index materials produced in this repository: [XProc docs](../../../projects/xproc-doc/readme.md).
## XProc as XML
@@ -84,7 +84,7 @@ Initiated in 1996, XML continues to be generative in 2024.
## Snapshot history: an XML time line
-[TODO: complete this, or move it, or both]
+[TODO: complete this, or move it, or both] ...
| Year | Publication | Capabilities | Processing frameworks | Platforms |
| --- | --- | --- | --- | --- |
diff --git a/tutorial/sequence/Lesson02/walkthrough_301.md b/tutorial/sequence/Lesson02/walkthrough_301.md
index 1431747..414d030 100644
--- a/tutorial/sequence/Lesson02/walkthrough_301.md
+++ b/tutorial/sequence/Lesson02/walkthrough_301.md
@@ -8,12 +8,12 @@
## Goals
-* See how XProc supports software testing, including testing itself, supportive of a test-driven development
+* See how XProc supports software testing, including testing itself, supportive of test-driven development (TDD)
* Exposure to the configuration of the Github repository supporting dynamic testing on Pull Requests and releases, subject to extension
## Prerequisites
-**No prerequisites**
+You have made it this far.
## Resources
@@ -39,18 +39,21 @@ Both kinds of tests can be configured and executed using XProc. Pipelines here p
Specifically, tests that are run anytime a Pull Request is updated against the home repository serve to guard against accepting non-functional code into the repository code base.
The tests themselves are so far fairly rudimentary – while paying for themselves in the consistency and quality they help enforce.
-Pipelines useful for the developer
+
+### Pipelines useful for the developer:
* [VALIDATION-FILESET-READYCHECK.xpl](../../../testing/VALIDATION-FILESET-READYCHECK.xpl) runs a pre-check to validate that files referenced in FILESET Xprocs are in place
* [REPO-FILESET-CHECK.xpl](../../../testing/REPO-FILESET-CHECK.xpl) for double checking the listed FILESET pipelines against the repository itself - run this preventatively to ensure files are not left off either list inadvertently
* [RUN_XPROC3-HOUSE-RULES_BATCH.xpl](../../../testing/RUN_XPROC3-HOUSE-RULES_BATCH.xpl) applies House Rules Schematron to all XProcs listed in the House Rules FILESET - just like the HARDFAIL House Rules pipeline except ending gracefully with error reports
* [REPO-XPROC3-HOUSE-RULES.xpl](../../../testing/REPO-XPROC3-HOUSE-RULES.xpl) applies House Rules Schematron to all XProc documents in the repository
* [RUN_XSPEC_BATCH.xpl](../../../testing/RUN_XSPEC_BATCH.xpl) runs all XSpecs listed in the XSpec FILESET, in a single batch, saving HTML and JUnit test results
-Pipelines run under CI/CD
+
+### Pipelines run under CI/CD:
* [HARDFAIL-XPROC3-HOUSE-RULES.xpl](../../../testing/HARDFAIL-XPROC3-HOUSE-RULES.xpl) runs a pipeline enforcing the House Rules Schematron to every XProc listed in the imported FILESET pipeline, bombing (erroring out) if an error is found - useful when we want to ensure an ERROR condition comes back on an error reported by a *successful* Schematron run
* [RUN_XSPEC-JUNIT_BATCH.xpl](../../../testing/RUN_XSPEC-JUNIT_BATCH.xpl) runs all XSpecs listed in the XSpec FILESET, saving only JUnit results (no HTML reports)
-Additionally
+
+### Additionally:
* [FILESET_XPROC3_HOUSE-RULES.xpl](../../../testing/FILESET_XPROC3_HOUSE-RULES.xpl) provides a list of resources (documents) to be made accessible to importing pipelines
* [FILESET_XSPEC.xpl](../../../testing/FILESET_XSPEC.xpl) provides a list of XSpec files to be run under CI/CD
@@ -77,7 +80,7 @@ Demonstrating the capability is a more important goal, and XSpecs can and are ea
The [XSpec FILESET](../../../testing/FILESET_XSPEC.xpl) will show XSpecs run under CI/CD but not all XSpecs in the repository will be listed there.
-## XProc running under continuous integration
+## XProc running under continuous integration and development (CI/CD)
Any XProc pipelines designed, like the smoke tests or validations just described, to provide for quality checking over carefully maintained code bases, are natural candidates for running dynamically and on demand, for example when file change commits are made to git repositories under CI/CD (continuous integration / continuous deployment).
diff --git a/tutorial/sequence/Lesson02/walkthrough_401.md b/tutorial/sequence/Lesson02/walkthrough_401.md
index 37efdbd..aebacdd 100644
--- a/tutorial/sequence/Lesson02/walkthrough_401.md
+++ b/tutorial/sequence/Lesson02/walkthrough_401.md
@@ -4,34 +4,43 @@
>
> Save this file elsewhere to create a persistent copy (for example, for purposes of annotation).
-# 401: XSLT Forward and Back
+# 401: The XSLT Factor
-What is this XSLT?
+## Goals
-Read this page if you are a beginner, or an expert in XSLT, or if you plan never to use it.
+What is this XSLT? Read this page if you are a beginner in XSLT, or an expert, or if you plan never to use it.
-## Goals
+* If you don't know XSLT and do not care to, consider skimming this page to understand what XSLT is and what it does.
+* If you know XSLT or plan to learn it, read to understand something more about how it fits with XProc.
+* XQuery is also mentioned. Much of what is said about XSLT here applies to XQuery as well.
-* If you don't know XSLT, helps you understand what it is and what it does
-* If you know XSLT, understand something more about how it fits with XProc
+XSLT offers XProc a core capability. Even if not indispensable, what it brings is an important and frequently necessary part of what makes XProc able to address problems with real-world complexity that evolve – or are only revealed – over time. It would be unfair to introduce developers or proprietors of data processing systems to XProc without giving them some sense of XSLT and its uses and strengths.
## Prerequisites
-Possibly, you have run and inspected pipelines mentioned earlier, most especially [PRODUCE-PROJECTS-ELEMENTLIST.xpl](../../PRODUCE-PROJECTS-ELEMENTLIST.xpl), which contains `p:xslt` steps.
+You may have run and inspected pipelines mentioned earlier, such as [PRODUCE-PROJECTS-ELEMENTLIST](../../PRODUCE-PROJECTS-ELEMENTLIST.xpl), which contain `p:xslt` steps. (If not, `p:xslt` is pretty easy to find.)
+
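+For orientation only – a minimal sketch (with a hypothetical stylesheet path) of what a `p:xslt` invocation looks like:
+
+```xml
+<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
+   <p:input port="source"/>
+   <p:output port="result"/>
+   <!-- the transformation itself arrives on the step's 'stylesheet' port -->
+   <p:xslt>
+      <p:with-input port="stylesheet" href="src/a-transform.xsl"/>
+   </p:xslt>
+</p:declare-step>
+```
+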
+Possibly, you have also inspected XSLT files (standalone transformations or *stylesheets*), to be found more or less anywhere in this repository, especially directories named `src`, with the file suffix `xsl` by convention. (XSLT being a part of XSL.)
-Possibly, you have inspected XSLT files (standalone transformations or *stylesheets*), to be found more or less anywhere in this repository, especially directories named `src`.
+XSLT practitioners might consider reading this section to see what they agree with.
## Resources
-XSLT links!
+XSLT links! Absorbing these documents is not necessary, but you need to know they exist. They provide the basis and history of the XPath Data Model (XDM), the foundation of XProc.
### XSLT 1.0 and XPath 1.0
-* [XML Path Language (XPath) Version 1.0](https://www.w3.org/TR/1999/REC-xpath-19991116/) W3C Recommendation 16 November 1999
+This “Original Gangster” (OG) version is still available in browsers, and still capable, albeit not as general or powerful as it was to become.
+
+* [XML Path Language (XPath) Version 1.0](https://www.w3.org/TR/1999/REC-xpath-19991116/) W3C Recommendation 16 November 1999
* [XSL Transformations (XSLT) Version 1.0](https://www.w3.org/TR/xslt-10/) W3C Recommendation 16 November 1999
### XSLT 2.0 and XQuery 1.0
+With capabilities for grouping, better string processing (regular expressions), a more extensive type system aligned with XQuery, *temporary trees* (to reprocess results) and other needed features, XSLT 2.0 was widely deployed in document production back-ends, and used successfully within XProc 1.0.
+
+The only reason not to use it today is that XSLT 3.0/3.1 is available.
+
* [XSL Transformations (XSLT) Version 2.0 (Second Edition)](https://www.w3.org/TR/xslt20/) W3C Recommendation 30 March 2021 (Amended by W3C)
* [XQuery 1.0: An XML Query Language (Second Edition)](https://www.w3.org/TR/xquery-10/) W3C Recommendation 14 December 2010
* World Wide Web Consortium. *XQuery 1.0 and XPath 2.0 Data Model (XDM) (Second Edition)*. W3C Recommendation, 14 December 2010. See [http://www.w3.org/TR/xpath-datamodel/](https://www.w3.org/TR/xpath-datamodel/).
@@ -41,23 +50,25 @@ XSLT links!
### XSLT 3.0, XQuery 3.0, XPath 3.1
+The current generation of the language – although work progresses on XPath 4.0, which promises to be more capable than ever.
+
* [XSL Transformations (XSLT) Version 3.0](https://www.w3.org/TR/xslt-30/) W3C Recommendation 8 June 2017
* [Normative references](https://www.w3.org/TR/xslt-30/#normative-references) for XSLT 3.0 - data model, functions and operators, etc., including **XPath 3.1**
* [XQuery 3.0: An XML Query Language](https://www.w3.org/TR/xquery-30/) W3C Recommendation 08 April 2014
## XSLT: XSL (XML Stylesheet Language) Transformations
-XSLT has a long and amazing history to go with its checkered reputation. Its role in XProc is similarly ambiguous: in one sense it is an optional power feature: a nice-to-have. In another sense it can be regarded as foundational. XSLT is the reason to have XProc.
+XSLT has a long and amazing history to go with its checkered reputation. Its role in XProc is similarly ambiguous: in one sense it is an optional power feature: a nice-to-have. In another sense it can be regarded as foundational. One of the best reasons to have XProc is how easy it makes deploying and running XSLT.
-Chances are good that if you are not current on the latest XSLT version, you have little idea of what we are talking about, as it may have changed quite a bit (and even despite external appearances) since you last saw it. You may think you know it but you might have to reconsider.
+Chances are good that if you are not current on the latest XSLT version, you have little idea of what we are talking about, as despite appearances, it may have changed quite a bit since you last saw it. You may think you know it but you might have to reconsider.
Users who last used XSLT 1.0 and even 2.0, in particular, can consider their knowledge out of date until they have taken a look at XSLT 3.0.
-Moreover, within the context of XProc, experienced users of XSLT should also consider it may take a different form, as it is unburdened of some operations that better belong outside it - operations such as those provided by XProc itself. Within XProc, XSLT may often be simpler than out in systems where it has to do more work.
+Moreover, within the context of XProc, experienced users of XSLT should also consider it may take a different form, as it is unburdened of some operations that better belong outside it – often, operations such as those provided by XProc itself. Within XProc, XSLT may often be simpler than in systems where it has to do more work.
-Over time, the principle of pipelining, iterative amelioration (as it might be described) or “licking into shape” has been repeatedly demonstrated. Of course it proves easier to do a complicated task when broken into a series of simpler tasks. On Java alone, ways of deploying operations in sequence include at least [Apache Ant](https://ant.apache.org/), Apache Tomcat/[Cocoon](https://cocoon.apache.org/) (a web processing framework), XQuery (such as [BaseX](https://basex.org/) or [eXist-db](https://exist-db.org/exist/apps/homepage/index.html) engines) and XSLT ([Saxon](https://www.saxonica.com/documentation12/index.html#!functions/fn/transform)) to say nothing of batch scripts, shell scripts and “transformation scenarios” or the like, as offered by XML tools and toolkits.
+Over time, we have seen repeated demonstrations of the principle of pipelining, iterative amelioration (as it might be described) or “licking into shape” as applied to document processing. Of course it proves easier to do a complicated task when it is broken into a series of simpler tasks. On Java alone, ways of deploying transformations and modifications in sequences of steps include at least [Apache Ant](https://ant.apache.org/), Apache Tomcat/[Cocoon](https://cocoon.apache.org/) (a web processing framework), XQuery (using engines such as [BaseX](https://basex.org/) or [eXist-db](https://exist-db.org/exist/apps/homepage/index.html)) and XSLT itself ([Saxon](https://www.saxonica.com/documentation12/index.html#!functions/fn/transform)), to say nothing of batch scripts, shell scripts and “transformation scenarios” or the like, as offered by XML tools and toolkits.
-All can be disturbingly haphazard. In contrast to the varied stopgap solutions, XProc helps quite a bit by taking over from XSLT, to whatever extent necessary and useful, any aspects of processing that require any sort of interaction with the wider system. This way XSLT plays to its strengths, while XProc standardizes and simplifies how it works. On one hand, XProc enables XSLT when needed, while on the other XProc may enable us largely to do without it, offering both a useful feature set and the flexibility we need, but with less overhead, especially with regard to routine chores like designating sets of inputs and outputs, or sequencing operations. The principle of Least Power may well apply here: it saves our present and our future selves effort if we can arrange and manage to do fewer things less. XProc lets us do less.
+All this can be disturbingly haphazard. In contrast, XProc offers a single unified approach using a standard declarative vocabulary specifically for dealing with process orchestration and I/O (inputs and outputs, i.e. interfaces). Thus it helps quite a bit by taking over from XSLT, to whatever extent necessary and useful, all those aspects of processing that require any sort of interaction with the wider system. This way XSLT plays to its strengths, while XProc standardizes and simplifies how it works. Consequently, XProc enables XSLT when needed, on the one hand, while on the other XProc may enable us largely to do without it, as it *additionally* offers its own useful feature set for routine chores like designating sets of inputs and outputs, or sequencing operations. The principle of Least Power may well apply here: it saves our present and our future selves effort if we can arrange and manage to do fewer things less. XProc lets us do less.
With XProc and XSLT together, this effect is magnified: XSLT lets us write less XProc, and XProc lets us write less XSLT. Each lightens the lift for the other.
@@ -65,15 +76,15 @@ XProc lets us use XSLT when we must, but also keeps routine and simple things bo
### Reflecting on XSLT
-Programmers can think of XSLT as a domain-specific language (DSL) or fourth-generation language (4GL) designed for the purpose of manipulating data structures suitable for documents and messages as well as for structured data sets. As such, XSLT is highly generalized and abstract and can be applied to a very broad range of problems. Its main distinguishing feature among similar languages (which tend to be functional languages such as Scala and Scheme) is that it is optimized for use specifically with XML-based data formats, offering well-defined handling of information sets expressed in XML, while the language itself uses XML syntax, affording nice composability, reflection and code generation capabilities. XSLT's processing model is both broadly applicable, and workable in a range of environments including client software or within encapsulated, secure software configurations and deployments.
+Programmers can think of XSLT as a domain-specific language (DSL) or fourth-generation language (4GL) designed for the purpose of manipulating data structures suitable for documents and messages as well as for structured data sets. As such, XSLT is highly generalized and abstract and can be applied to a very broad range of problems. Its main distinguishing feature among similar languages (which tend to be functional languages such as Scala and Scheme) is that it is optimized for use specifically with XML-based data formats, offering well-defined handling of information sets expressed in XML, while the language itself uses XML syntax, affording nice composability, reflection and code generation capabilities. XSLT's processing model is both broadly applicable, and workable in a range of environments from widely distributed client software, to encapsulated (“containerized”), secure software configurations and deployments.
-If your XSLT is strong enough, you don't need XProc, or not much. But as a functional language, XSLT is best used in a functionally pure, “stateless” way that does not interact with the system: no “side effects”. This is related to its definitions of conformant processing (X inputs produce Y outputs) and the determinism, based in mathematical formalisms, that underlies its idea of conformance. However one cost of mathematical purity is that operations that do interact with stateful externalities – things such as reading and writing files – are not in XSLT's “comfort zone”. XSLT works by defining what a new structure A' should look like for any given structure A, using such terms as a conformant XSLT engine can then effectuate. But to turn an actual A into an actual A' we must first acquire A – or an effective surrogate thereof – and moreover make our A' available, in some form. XSLT leaves it up to its processor and “calling application” to handle this aspect of the problem – which they typically do by offering interfaces for an XSLT transformation's nominal *source* and (primary) *result*. Often, this gap has been bridged by extended functionality in processors. Does your processor read and parse XML files off the file system? Can it be connected to upstream data producers in different ways? Can it use HTTP `GET` and `PUT`? The answer may be Yes to any or all of these. Throughout its history, XSLT in later versions was also extended in this direction, with features such as the `collection()` function, `xsl:result-document`, `doc-available()` and other features we may not need if we are using XProc.
+If your XSLT is strong enough, you don't need XProc, or not much. But as a functional language, XSLT is best used in a functionally pure, “stateless” way that does not interact with the system: no “side effects”. This is related to its definitions of conformant processing (X inputs produce Y outputs) and the determinism, based in mathematical formalisms, that underlies its idea of conformance. However one cost of mathematical purity is that operations that do interact with stateful externalities – operations such as reading and writing files – are not in XSLT's “comfort zone”. XSLT works by defining what a new structure A' should look like for any given structure A, using such terms as a conformant XSLT engine can then effectuate. But to turn an actual A into an actual A' we must first acquire A – or an effective surrogate thereof – and then make our A' available, in some form. XSLT leaves it up to its processor and “calling application” to handle this aspect of the problem – which they typically do by offering interfaces for an XSLT transformation's nominal *source* and (primary) *result*. Often, this gap has been bridged by extended functionality in processors. Does your processor read and parse XML files off the file system? Can it be connected to upstream data producers in different ways? Can it use HTTP `GET` and `PUT`? The answer may be Yes to any or all of these. Throughout its history, XSLT in later versions was also extended in this direction, with features such as the `collection()` function, `xsl:result-document`, `doc-available()` and other features we may not need if we are using XProc.
-### Running XSLT without XProc
+Much of this can be set aside when using XSLT with XProc, making the XSLT simpler and easier.
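+
+For instance – a hypothetical sketch, not any particular pipeline in the repository – where standalone XSLT might reach for `xsl:result-document` to write out its results, under XProc the stylesheet can stay pure while the pipeline owns the I/O:
+
+```xml
+<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
+   <p:input port="source"/>
+   <p:output port="result"/>
+   <p:xslt>
+      <p:with-input port="stylesheet" href="src/make-report.xsl"/>
+   </p:xslt>
+   <!-- reading and writing files is the pipeline's business, not the XSLT's -->
+   <p:store href="out/report.xml"/>
+</p:declare-step>
+```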
-As a standard and an externally-specified technology, XSLT can in principle be implemented on any platform, but the leading XSLT implementation for some years has been Saxon, produced by Saxonica of Reading, England. Saxon has achieved market share and developer support on a record of strictly-conformant, performant applications, deployed as an open-source software product free for developers to use and integrate. (While doing this, Saxonica also has related product offerings including optimized processor for those who choose to support it.)
+### Running XSLT without XProc
-Download and run Saxon to apply XSLT to XML and other inputs, without XProc.
+XSLT can also be run without XProc, often to exactly the same ends. But as you start addressing more complex requirements, you might find yourself reinventing XProc wheels in XSLT...
## Using XSLT in XProc: avoiding annoyances
@@ -81,52 +92,58 @@ If you are an experienced XSLT user, congratulations! The power XProc puts into
There are a couple of small but potentially annoying considerations when embedding XSLT literals in your XProc code. They do not apply when your XSLT is called from out of line, acquired by binding to an input port or even `p:load`. If you acquire and even manipulate your XSLT without including literal XSLT code in your XProc, that eliminates the syntax-level clashes at the roots of both these problems.
-### Text and attribute value syntax in embedded XSLT
+### Namespaces in and for your XSLT
-XSLT practitioners know that within XSLT, in attributes and (in XSLT 3.0) within text (as directed), the curly brace signs `{` and `}` have special semantics as [attribute](https://www.w3.org/TR/xslt-30/#attribute-value-templates) or [text value templates](https://www.w3.org/TR/xslt-30/#text-value-templates). In the latter case, the operation can be controlled with an `xsl:expand-text` setting. When effective as template delimiters, these characters can be escaped and hidden from processing by doubling them: `{{` for `{` etc.
+[A subsequent Lesson Unit on namespaces in XProc](../oscal-convert/oscal-convert_350.md) may help newcomers or anyone mystified by XML namespaces. They are worth mentioning here because everything tricky in XProc regarding namespaces is doubly tricky with XSLT in the picture.
-XProc offers a similar feature for expanding expressions dynamically, indicated with a `p:expand-text` setting much like XSLT's.
+In brief: keep in mind XSLT has its own features for both configuring namespace-based matching on elements by name (such as `xpath-default-namespace`), and for managing namespaces in serialization (`exclude-namespace-prefixes`). In the XProc context, however, your XSLT will typically not be writing results directly, instead only producing the same kind of (XDM) tree as is emitted and consumed by other steps.
-Because they both operate, an XSLT author must take care to provide for the correct escaping (sometimes more than one level) or settings on either language's `expand-text` option. Searching the repository for the string value `{{` (two open curlies together) will turn up instances of this – or try [a worksheet XProc with XSLT embedded](../../worksheets/NAMESPACE_worksheet.xpl).
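+
+For example, an embedded or imported XSLT can match OSCAL elements without prefixing every name, by setting a default. A minimal sketch, assuming the OSCAL 1.0 namespace:
+
+```
+<xsl:stylesheet version="3.0"
+  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
+  xpath-default-namespace="http://csrc.nist.gov/ns/oscal/1.0">
+  <!-- 'catalog' and 'metadata/title' resolve in the OSCAL namespace, prefix-free -->
+  <xsl:template match="/catalog">
+    <xsl:copy-of select="metadata/title"/>
+  </xsl:template>
+</xsl:stylesheet>
+```
+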
+### Text and attribute value syntax in embedded XSLT
-### Namespaces in and for your XSLT
+If not yet conversant with XSLT, you can read more about this in an [upcoming Lesson Unit](../oscal-convert/oscal-convert_102.md) on data conversion.
-[A lesson unit on namespaces in XProc](../oscal-convert/oscal-convert_350.md) may help newcomers or anyone mystified by XML namespaces. They are worth mentioning here because everything tricky in XProc regarding namespaces is doubly tricky with XSLT in the picture.
+XSLT practitioners know that within XSLT, in attributes and (in XSLT 3.0) within text (as directed), the curly brace signs `{` and `}` have special semantics as [attribute](https://www.w3.org/TR/xslt-30/#attribute-value-templates) or [text value templates](https://www.w3.org/TR/xslt-30/#text-value-templates). In the latter case, the operation can be controlled with an `xsl:expand-text` setting. When effective as template delimiters, these characters can be escaped and hidden from processing by doubling them: `{{` for `{` etc.
-In brief: keep in mind XSLT has its own features for both configuring namespace-based matching on elements by name (such as `xpath-default-namespace`), and for managing namespaces in serialization (`exclude-namespace-prefixes`). In the XProc context, however, your XSLT will typically not be writing results directly, instead only producing the same kind of (XDM) tree as is emitted and consumed by other steps.
+XProc offers a similar feature for expanding expressions dynamically, indicated with a `p:expand-text` setting much like XSLT's.
+
+Because both templating mechanisms operate, an XSLT author must take care to provide for the correct escaping (sometimes more than one level) or settings on either language's `expand-text` option. Searching the repository for the string value `{{` (two open curlies together) will turn up instances of this – or skip ahead and try [a worksheet XProc with XSLT embedded](../../worksheets/NAMESPACE_worksheet.xpl).
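+
+For example, when an XSLT attribute value template is inlined in XProc, its braces must be doubled so they survive XProc's own expansion. A sketch (the `link` element is hypothetical):
+
+```
+<p:xslt>
+  <p:with-input port="stylesheet">
+    <xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
+      <!-- XProc expands {{@url}} to {@url}, which XSLT then reads as an AVT -->
+      <xsl:template match="link">
+        <a href="{{@url}}">
+          <xsl:value-of select="."/>
+        </a>
+      </xsl:template>
+    </xsl:stylesheet>
+  </p:with-input>
+</p:xslt>
+```
+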
## Learning XSLT the safer way
If setting out to learn XSLT, pause to read the following *short* list of things to which you should give early attention, in order:
-1. Namespaces in XML and XSLT: names, name prefixes, unprefixed names and the `xpath-default-namespace` setting (not available until XSLT 2.0)
+1. Namespaces in XML and XSLT: names, name prefixes, unprefixed names and the `xpath-default-namespace` setting (not available until XSLT 2.0).
-1. Templates and modes in XSLT: template matching, `xsl:apply-templates`, built-in templates, and using modes to configure default behaviors when no template matches
+1. Templates and modes in XSLT: template matching, `xsl:apply-templates`, built-in templates, and using modes to configure default behaviors when no template matches.
-1. XPath, especially absolute and relative location paths: start easy and work up
+1. XPath, especially absolute and relative location paths such as `/child::oscal:catalog` or `path/to/node[qualified(.)]`: start easy and work up.
-Understanding each of these will provide useful insights into XProc.
+Understanding each of these will also provide useful insights into XProc.
## XProc without XSLT?
-XProc does not require XSLT absolutely, even if XSLT is indispensable for some XProc libraries, including those in this repository.
+As noted, XProc does not require XSLT absolutely, even if XSLT is indispensable for some XProc libraries, including those in this repository.
How could we do without it?
* Using XQuery any time queries get complicated
-* Use XProc where possible, for example steps that support matches on patterns? E.g. `p:insert`, `p:label-elements` and `p:add-attribute`
-* Reliance on iterators and `p:viewport`
-* Much smarter (declarative, data-centric) HTML or other dialect in the application space?
+* Use XProc where possible, for example steps that support matches on patterns for XSLT-like functionality. Such steps include `p:insert`, `p:label-elements`, `p:add-attribute` and others (see the sketch after this list).
+* Similarly, relying on iterators and `p:viewport`
+* High-level design and refactoring: using a smarter (declarative, data-centric) HTML or other dialect in the application space to simplify transformation requirements?
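+
+For instance, adding or generating attributes needs no XSLT at all. A sketch (the match patterns here are hypothetical):
+
+```
+<!-- decorate matched elements in place, using native steps -->
+<p:add-attribute match="control" attribute-name="reviewed" attribute-value="true"/>
+<p:label-elements match="part" attribute="xml:id" label="concat('part-', $p:index)"/>
+```
+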
Chances are, there is a limit. One thing XSLT does better than almost any comparable technology is support generalized or granular mappings between vocabularies.
-So not only creating, but also consuming HTML, is the place we begin with XSLT. But since it is also very fine for other vocabulary mappings in the middle and back, it becomes indispensable almost as soon as it is available for use.
+Typically, the place we begin with XSLT is to create HTML for viewing from an XML source. But since it is also very fine for other vocabulary mappings in the middle and back, it becomes indispensable almost as soon as it is available for use.
-An XSLT that is used repeatedly can of course always be encapsulated as an XProc step.
+An XSLT that is used repeatedly can be encapsulated as an XProc step.
## XProc, XDM (the XML data model) and the standards stack
-Another critical consideration is whether and to what extent XProc and XSLT introduce unwanted dependencies, which make them strategically not a good choice (or not a good choice for everyone) at least in comparison to alternatives. These are standards in every way including nominally, emerging as the work of organizations such as W3C and ISO, while not escaping a reputation as “boutique” or “niche” technologies. Yet alternative models – whether offered by large software vendors and service providers, or by forests of Javascript libraries, or a bespoke stack using a developers' favorite flavor of Markdown or microformats – have not all fared very well either. Often scorned, XSLT has a reputation for projects migrating away from it as much as towards it. Yet look closely, and when problems arise, XSLT is never the issue by itself. (A project choosing not to use XSLT because of a lack of understanding or skills is something differet.) Often the question is, were you even using the right tool? It helps when your application is within the sweet spot of document processing at scale (and there is a sweet spot), but even this is not an absolute rule. Sometimes the question is, are you actually fitting the capabilities of the processing model to the problem at hand. Too often, that fit happens by accident. Too often, other considerations prevail and compromises are made - then the resulting system is blamed.
+Another critical consideration is whether and to what extent XProc and XSLT introduce unwanted dependencies, which make them strategically not a good choice (or not a good choice for everyone) at least in comparison to alternatives. These are standards in every way including nominally, emerging as the work of organizations such as W3C and ISO, while not escaping a reputation as “boutique” or “niche” technologies. Yet alternative approaches to software development – whether offered by large software vendors and service providers, or by forests of Javascript libraries, or a bespoke stack using a developer's favorite flavor of Markdown or microformats – have not all fared very well either. Often spurned or ignored, XSLT has a reputation for projects migrating away from it as much as towards it. Yet look closely, and when problems arise, XSLT is never the issue by itself. (A project not able to use XSLT because of a lack of understanding or skills is something different.) Often the question is, were you even using the right tool? XSLT's reputation suffers when people decide not to use it or to migrate away. But no one talks about all the systems that take advantage of it quietly.
+
+The *Golden Hammer* is an [anti-pattern](https://en.wikibooks.org/wiki/Introduction_to_Software_Engineering/Architecture/Anti-Patterns) – related to the **Silver Bullet** – but this does not make hammers superfluous. It helps when your application is within the sweet spot of XSLT and XProc's document processing at scale (and there is a sweet spot), but even this is not an absolute rule. Sometimes the question is, are you actually fitting the capabilities of the processing model to the problem at hand? Too often, that fit happens by accident. Too often, other considerations prevail and compromises are made – then the resulting system is blamed.
So where has XML-based processing been not only tenable but rewarding over the long term? Interestingly, its success is to be found often in projects that have survived across more than one system over time, that have grown from one system into another, and that have morphed and adapted and grown new limbs. In many cases, look at them today and you do not see the same system as you would have only five years ago.
+
+Systems achieve sustainability when they are not only stable, but adaptive. This is a fine balance, but one that can be found by an evolutionary process of development and experiment. XProc 3.0 and its supporting technologies show the results of such an evolution. The demonstration should be in its ease of use combined with capability and maintainability.
diff --git a/tutorial/sequence/Lesson03/oscal-convert_101.md b/tutorial/sequence/Lesson03/oscal-convert_101.md
index fe4a8e1..1e790c5 100644
--- a/tutorial/sequence/Lesson03/oscal-convert_101.md
+++ b/tutorial/sequence/Lesson03/oscal-convert_101.md
@@ -10,17 +10,17 @@
Learn how OSCAL data can be converted between JSON and XML formats, using XProc.
-Learn something about potential problems and limitations when doing this, and about how they can be detected, avoided, prevented or mitigated.
+Learn something about potential problems and limitations when doing this, and about how they can be detected and avoided, prevented or mitigated.
-Become familiar with the idea of generic conversions between syntaxes such as XML and JSON (not always possible), versus conversions capable of handling a single class or type of documents, such as OSCAL format conversions.
+Become familiar with the idea of generic conversions between syntaxes such as XML and JSON, versus conversions capable of handling a single class or type of documents, such as OSCAL – a limitation that can also provide, in a case such as OSCAL, for a fully defined mapping supporting lossless, error-free translation across syntaxes.
## Prerequisites
-You have succeeded in prior exercises, including tools installation and setup, and ready to run pipelines.
+Having succeeded in prior exercises, including tools installation and setup, you are ready to run pipelines.
## Resources
-This unit relies on the [oscal-convert project](../../../projects/oscal-convert/readme.md) in this repository, with its files. Like all projects in the repo, it aims to be reasonably self-contained and self-explanatory. The pipelines there (described below) provide rudimentary support for data conversions “in miniature” – infrastructures that might scale up.
+This unit relies on the [oscal-convert project](../../../projects/oscal-convert/readme.md) in this repository, with its files. Like all projects in the repo, it aims to be reasonably self-contained and self-explanatory. The pipelines there (described below) provide rudimentary support for data conversions, demonstrating simplicity and scalability.
As always, use your search engine and XProc resources to learn background and terminology.
@@ -32,94 +32,92 @@ The idea here is simple: run the pipeline and observe its behaviors, including n
### [GRAB-RESOURCES](../../../projects/oscal-convert/GRAB-RESOURCES.xpl)
-Like other pipelines with this name, run this to acquire resources. In this case, XSLTs used by other XProc steps are downloaded from their release page.
+Like other pipelines with this name, run this to acquire resources. In this case, XSLTs used by other XProc steps are downloaded from the OSCAL release page.
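+
+In outline, the acquisition pattern looks like this. A sketch – the URL and paths are illustrative, not the actual release locations:
+
+```
+<!-- fetch a converter XSLT over https, then cache it locally -->
+<p:load href="https://example.org/oscal-release/xslt/catalog_xml-to-json-converter.xsl"/>
+<p:store href="lib/catalog_xml-to-json-converter.xsl" message="Stored converter XSLT"/>
+```
+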
### [BATCH-JSON-TO-XML](../../../projects/oscal-convert/BATCH_JSON-TO-XML.xpl)
-This pipeline uses an input port to include a set of JSON documents, which it translates into XML using [generic semantics defined in XPath](https://www.w3.org/TR/xpath-functions-31/#json-to-xml-mapping). For each JSON file input, an equivalent XML file with the same filename base is produced, in the same place.
+This pipeline uses an XProc input port to include a set of JSON documents, which it translates into XML using [generic semantics defined in XPath](https://www.w3.org/TR/xpath-functions-31/#json-to-xml-mapping). For each JSON file input, an equivalent XML file with the same filename base is produced, in the same place.
As posted, the pipeline reads some minimalistic fictional data, which can be found in the system at the designated paths.
-It then uses a pipeline step defined in an imported pipeline, which casts the data and stores the result for each JSON file input. In the XProc source, the imported step can be easily identified by its namespace prefix, different from the prefix given to XProc elements – and what is more important, under the nominal control of its developer or sponsor.
+It then uses a pipeline step defined in an imported pipeline, which casts the data and stores the result for each JSON file input. In the XProc source, the imported step can be easily identified by its namespace prefix, different from the prefix given to XProc elements, as designated by the pipeline's developer or sponsor.
-Follow the `p:import` link (via its `href` file reference) to find the step that is imported. Recognize a step by its `type` given at the top of the XML (as is described in more depth [in a subsequent lesson unit](oscal-convert_201.md))
+Follow the `p:import` link (via its `href` file reference) to find the step that is imported. An imported step is invoked by using its `type` name, given at the top of the XML (as is described in more depth [in a subsequent lesson unit](oscal-convert_201.md)).
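+
+In miniature, the casting core might look like this – a sketch using the standard `p:cast-content-type` step (the project's imported step adds more, and we assume here that the cast preserves each document's base URI):
+
+```
+<p:for-each>
+  <!-- each JSON document becomes its generic (fn: vocabulary) XML representation -->
+  <p:cast-content-type content-type="application/xml"/>
+  <p:store href="{replace(base-uri(), '\.json$', '.xml')}" message="Storing {base-uri()}"/>
+</p:for-each>
+```
+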
### [BATCH-XML-TO-JSON](../../../projects/oscal-convert/BATCH_XML-TO-JSON.xpl)
This pipeline performs the inverse of the JSON-to-XML batch pipeline. It loads XML files and converts them into JSON.
-Note however how in this case, no guarantees can be made that any XML inputs will result in valid JSON. Many XML inputs will result in errors instead since only the XML vocabulary supporting JSON is considered well enough described for a comprehensive, rules-bound cast. Exception handling logic in the form of an XProc `p:try/p:catch` can be found in the imported pipeline step (which performs the casting).
+Note however how in this case, no guarantees can be made that any XML inputs will result in valid JSON. Unless an input uses the expected vocabulary, it will result in errors, as no comprehensive, rules-bound cast can be defined across arbitrary XML. To alleviate this problem, exception handling logic in the form of an XProc `p:try/p:catch` can be found in the imported pipeline step (which performs the casting).
-Additionally, this variant has a fail-safe (looks for `p:choose`) that prevents the production of JSON from files not named `*.xml` – strictly speaking, this is only a naming convention, but respecting it prevents unseen and unwanted name collisions. It does *not* defend against overwriting any other files that happen to be in place with the target name.
+Additionally, this variant has a fail-safe (look for `p:choose`) that prevents the production of JSON from files not named `*.xml` – strictly speaking, this is only a naming convention, but respecting it prevents unseen and unwanted name collisions. It does *not* defend against overwriting any other files that happen to be in place with the target name.
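+
+The guard-plus-recovery pattern, in miniature (a sketch; the project's step differs in detail):
+
+```
+<p:choose>
+  <p:when test="ends-with(base-uri(.), '.xml')">
+    <p:try>
+      <!-- this cast fails unless the XML uses the fn: vocabulary for JSON -->
+      <p:cast-content-type content-type="application/json"/>
+      <p:catch>
+        <p:identity message="Cast to JSON failed; emitting nothing">
+          <p:with-input><p:empty/></p:with-input>
+        </p:identity>
+      </p:catch>
+    </p:try>
+  </p:when>
+  <p:otherwise>
+    <p:identity message="Skipping: not named *.xml">
+      <p:with-input><p:empty/></p:with-input>
+    </p:identity>
+  </p:otherwise>
+</p:choose>
+```
+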
### [CONVERT-OSCAL-XML-DATA](../../../projects/oscal-convert/CONVERT-OSCAL-XML-DATA.xpl)
-The requirement that any XML to be converted must already be JSON-ready by virtue of conforming to a JSON-descriptive vocabulary, is obviously an onerous one. To achieve a clean, complete and appropriate recasting and relabeling of data, depending on its intended semantics, is .
+The requirement that any XML to be converted must already be JSON-ready, by virtue of conforming to a JSON-descriptive vocabulary, is obviously an onerous one. Achieving a clean, complete and appropriate recasting and relabeling of data depends on its intended semantics being defined in a way capable of a JSON expression.
-OSCAL solves this problem by defining its XML and JSON formats in parity, such that a complete bidirectional conversion can be guaranteed over data sets already schema-valid. The bidirectional conversions themselves can be performed implicitly or overtly by tools that read and write OSCAL, or they can be deployed as XSLT transformations, providing for the conversion to be performed by any XSLT engine, in principle (that supports the required version of the language).
+OSCAL solves this problem by defining its XML and JSON formats in parity, such that a complete bidirectional conversion can be guaranteed over data sets already schema-valid. The bidirectional conversions themselves can be performed implicitly or overtly by tools that read and write OSCAL, or they can be deployed as XSLT transformations, providing for the conversion to be performed by any XSLT 3.0 engine.
-XProc has Saxon for its XSLT, so it falls into the latter category. The XSLTs in question can be acquired from an OSCAL release, as shown in the [GRAB-RESOURCES](../../../projects/oscal-convert/GRAB-RESOURCES.xpl) pipeline.
+For XSLT 3.0, XProc has Saxon. The XSLTs in question can be acquired from an OSCAL release, as shown in the [GRAB-RESOURCES](../../../projects/oscal-convert/GRAB-RESOURCES.xpl) pipeline.
-This pipeline applies one of these XSLTS to a set of given OSCAL XML files, valid to the catalog model, to produce JSON.
-
-It works on any XML file that has `catalog` as its root element, in the OSCAL namespace. It does *not* provide for validation of the input against the schema, which is might, as a defense. Instead, the garbage-in/garbage-out (GIGO) principle is respected. If the process breaks, currently this must be discovered in the result.
+[CONVERT-OSCAL-XML-DATA](../../../projects/oscal-convert/CONVERT-OSCAL-XML-DATA.xpl) applies one of these XSLTs to a set of given OSCAL XML files, valid to the catalog model, to produce JSON. It works on any XML file that has `catalog` as its root element, in the OSCAL namespace. It does *not* provide for validation of the input against the schema: instead, the garbage-in/garbage-out (GIGO) principle is respected. This means that some pipelines will run successfully while producing defective outputs, which must be discovered in the result (via formal validation and other checks). An XProc pipeline with a validation step preceding the conversion would present such errors earlier.
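+
+Such a defensive variant might place a validation step just before the conversion. A sketch – the schema and stylesheet locations are hypothetical:
+
+```
+<!-- fail fast on invalid inputs, then convert -->
+<p:validate-with-xml-schema assert-valid="true">
+  <p:with-input port="schema" href="lib/oscal_catalog_schema.xsd"/>
+</p:validate-with-xml-schema>
+<p:xslt>
+  <p:with-input port="stylesheet" href="lib/catalog_xml-to-json-converter.xsl"/>
+</p:xslt>
+```
+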
The reverse pipeline is left as an exercise. Bring valid OSCAL JSON back into XML. Let us know if you have prototyped this and wish for someone to check your work!
### [CONVERT-OSCAL-XML-FOLDER](../../../projects/oscal-convert/CONVERT-OSCAL-XML-FOLDER.xpl)
-This pipeline performs the same conversion as [CONVERT-OSCAL-XML-DATA](../../../projects/oscal-convert/CONVERT-OSCAL-XML-DATA.xpl) with an important distinction: instead of designating its sources, it processes all XML files in a designated directory.
-
-## The playing field is the internet
-
-Keep in mind that XProc in theory, and your XProc engine in practice, may read its inputs using whatever protocols it supports, while the `file` and `http` protocols are required for conformance, and work as they do on the Worldwide Web.
-
-Of course, permissions must be in place to read files from system locations, or save files to them.
-
-But when authentication is configured or resources are openly available, using `http` to reach resources or sources can be a very convenient option.
-
-## More catalogs needed!
-
-As we go to press we have only one example OSCAL catalog to use for this exercise.
-
-Other valid OSCAL catalogs are produced from other projects in this repo, specifically [CPRT import](../../../projects/CPRT-import/) and [FM6-22-IMPORT](../../../projects/FM6-22-import/). Run the pipelines in those projects to produce more catalogs (in XML) useful as inputs here.
+This pipeline performs the same conversion as [CONVERT-OSCAL-XML-DATA](../../../projects/oscal-convert/CONVERT-OSCAL-XML-DATA.xpl) with an important distinction: instead of designating its sources, it processes all XML files in a designated directory.
## Working concept: return trip
-Here's an idea: a single pipeline that would accept either XML or JSON inputs, and produce both as outputs. Would that be useful?
+Note in this context that comparing the inputs and results of a round-trip conversion is an excellent way of determining, to some base level, the correctness and validity of your data set. While converting it twice cannot guarantee that anything in your data is “true”, if, having converted XML to JSON and back again to XML, the result looks the same as the original, you can be sure that your information is “correctly represented” in both formats.
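+
+XProc can even perform the comparison itself. A sketch (the `round-trip` step name is hypothetical):
+
+```
+<!-- raise an error if the round-trip result differs from the original -->
+<p:compare fail-if-not-equal="true">
+  <p:with-input port="source" href="data/catalog-model/xml/cat_catalog.xml"/>
+  <p:with-input port="alternate" pipe="result@round-trip"/>
+</p:compare>
+```
+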
-Note in this context that comparing the inputs and results of a round-trip conversion is an excellent way of determining, to some base level, the correctness and validity of your data set – as an encoded representation of *something* (expressed in OSCAL), even if not a truthful representation of *anything* (whether expressible in OSCAL or not).
+Here's an idea: a single pipeline that would accept either XML or JSON inputs, and produce either, or both, as outputs. Would that be useful?
## What is this XSLT?
If your criticism of XProc so far is that it makes it look easy when it isn't, you have a point.
-Conversion from XML to JSON isn't free, assuming it works at all.
+Conversion from XML to JSON isn't free, assuming it works at all. The runtime might be effectively free, but developing it isn't.
-In this case, the heavy lifting is done by the XSLT component - the Saxon engine invoked by the `p:xslt` step, applying logic defined in an XSLT stylesheet (aka transformation) stored elsewhere. It happens that a converter for OSCAL data is available in XSLT, so rather than having to confront this considerable problem ourselves, we drop in the solution we have at hand.
+Here, the heavy lifting is done by the XSLT component - the Saxon engine invoked by the `p:xslt` step, applying logic defined in an XSLT stylesheet (aka **transformation**) stored elsewhere. It happens that a converter for OSCAL data is available in XSLT, so rather than having to confront this considerable problem ourselves, we drop in the solution we have at hand.
-In later units we will see how using the XProc steps described, rudimentary data manipulations can be done using XProc by itself, without entailing the use of either XSLT or XQuery (another capability invoked with a different step).
+In later units we will see how using the XProc steps described, rudimentary data manipulations can be done using XProc by itself, without entailing the use of either XSLT or XQuery.
At the same time, while pipelines are based on the idea of passing data through a series of processes, there are many cases where logic is sufficiently complex that it becomes essential to maintain – and test – that logic externally from the XProc. At what point it becomes more efficient to encapsulate logic separately (whether by XSLT, XQuery or other means), depends very much on the case.
-The `p:xslt` pipeline step in particular is so important for real-world uses of XProc that it is introduced early, to show such a black-box application.
+The `p:xslt` pipeline step in particular is so important for real-world uses of XProc that it is introduced early, to show such a black-box application. There is also an [XQuery](https://spec.xproc.org/3.0/steps/#c.xquery) step – for many purposes, functionally equivalent.
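+
+For a taste of `p:xquery` (the query is illustrative – and note the doubled braces, which XProc reduces to the single braces XQuery expects):
+
+```
+<p:xquery>
+  <p:with-input port="query">
+    <p:inline content-type="application/xquery"><![CDATA[<result>{{count(//*)}}</result>]]></p:inline>
+  </p:with-input>
+</p:xquery>
+```
+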
-XProc also makes a fine environment for testing XSLT developed or acquired to handle specific tasks, a topic covered in more depth later.
+XProc also makes a fine environment for testing XSLT developed or acquired to handle specific tasks – and it can support automated testing and test-driven development using [XSpec](https://github.com/xspec/xspec/wiki).
Indeed XSLT and XQuery being, like XProc itself, declarative languages, it makes sense to factor them out while maintaining easy access and transparency for analysis and auditing purposes.
## What could possibly go wrong?
-Among the range of possible errors, syntax errors are relatively easy to cope with. But anomalous inputs, especially invalid inputs, can result in lost data. (A common reason data sets fail validation is the presence of foreign unknown contents, or contents out of place - the kinds of things that might fail to be converted.) The most important concern when engineering a pipeline is to see to it that no data quality problems are introduced inadvertantly. While in comparison to syntax or configuration problems, data quality issues can be subtle, there is also good news: the very same tools we use to process inputs into outputs, can also be used to test and validate data to both applicable standards and local rules.
+Three things can happen when we run a pipeline:
+
+* The pipeline can fail to run, typically terminating with an error message (or, unusually, failing to terminate)
+* The pipeline can run successfully, but result in incorrect outputs given the inputs
+* The pipeline can run successfully and correctly
+
+Among the range of possible errors, those that show up in your console with error messages are the easy ones. This will typically be a syntax error or error in usage (providing the wrong kind of input, etc.), remediable by a developer. Sometimes it is an input resource, not the pipeline, that must be corrected, or a different pipeline developed for the different input. If XML is expected but not provided, a conforming processor must emit an error. Correct it or plan on processing plain text.
+
+The second category is much harder. The most important concern when engineering a pipeline is to see to it that no data quality problems are introduced inadvertently. Anomalous inputs might process “correctly” (for the input provided) but result in lost data or disordered results. Often this is obvious in testing, but not always. The key is defining and working within a scope of application (range of inputs) within which “correctness” can be specified, unambiguously and demonstrably, both with respect to the source data, and the processing requirement. Given such a specification, testing is possible. While in comparison to syntax or configuration problems, data quality issues can be subtle, there is also good news: the very same tools we use to process inputs into outputs, can also be used to test and validate data to both applicable standards and local rules.
Generally speaking, OSCAL maintains “validation parity” between its XML and JSON formats with respect to their schemas. That is to say, the XSD (XML schema) covers essentially the same set of rules for OSCAL XML data as the JSON Schema does for OSCAL JSON data, accounting for differences between the two notations, the data models and how information is mapped into them. A consequence of this is that valid OSCAL data, either XML or JSON, can reliably be converted to valid data in the other notation, while invalid data may come through with significant gaps, or not be converted at all.
-For this and related reasons on open systems, the working principle in XML is often to formalize a model (typically by writing and deploying a schema) as early as possible - or adopt a model already built - as a way to institute and enforce schema validation as a **prerequisite** and **primary requirement** for working with any data set. Validation against schemas is also supported by XProc, making it still easier to enforce this dependency.
+For this reason (as it applies to OSCAL) and related reasons on open systems (applying across the board, and not only to data conversions), the working principle in XML is often to define and formalize a model as early as possible – or identify and adopt a model already built – as a way to institute and enforce schema validation as a *prerequisite* and *primary requirement* for working with any data set. We do this by acquiring or writing and deploying schemas. To this end, XProc supports several kinds of schema validation including [XML DTD (Document Type Definition)](https://spec.xproc.org/lastcall-2024-08/head/validation/#c.validate-with-dtd), [XSD (W3C XML Schema Definition Language)](https://spec.xproc.org/lastcall-2024-08/head/validation/#c.validate-with-xml-schema), [RelaxNG (ISO/IEC 19757-2)](https://spec.xproc.org/lastcall-2024-08/head/validation/#c.validate-with-relax-ng), [Schematron](https://spec.xproc.org/lastcall-2024-08/head/validation/#c.validate-with-schematron) and [JSON Schema](https://spec.xproc.org/lastcall-2024-08/head/validation/#c.validate-with-json-schema), making it straightforward to enforce this dependency at any point in a pipeline, whether applied to inputs or to pipeline results including interim results and pipeline outputs. Resource validation is described further in subsequent coverage including the next [Maker lesson unit](oscal-convert_102.md).
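+
+For instance, local rules expressed in Schematron can gate a pipeline at any point. A sketch – the schema path is hypothetical:
+
+```
+<!-- terminate with a report if local rules are violated -->
+<p:validate-with-schematron assert-valid="true">
+  <p:with-input port="schema" href="rules/catalog-rules.sch"/>
+</p:validate-with-schematron>
+```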
+
+### The playing field is the Internet
-### Intercepting errors
+File resources in XProc are designated and distinguished by URIs. Keep in mind that XProc in theory, and your XProc engine in practice, may read its inputs using whatever [URI schemes](https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml) it supports, while the schemes `file` and `http` (or `https`) are required for conformance, and work as they do on the World Wide Web.
-One way to manage the problem of ensuring input quality is to validate on the way in, either as a dependent (prerequisite) process, or built into a pipeline. Whatever you want to do with invalid inputs, including ignoring them and producing warnings or runtime exceptions, can be defined in a pipeline much like anything else.
+Of course, permissions must be in place to read files from system locations, or save files to them. When authentication is configured or resources are openly available, using `http` to reach resources or sources can be a very convenient option.
-In the [publishing demonstration project folder](../../../projects/oscal-publish/publish-oscal-catalog.xpl) is an XProc that valides XML against an OSCAL schema, before formatting it. The same could be done for an XProc that converts the data into JSON - either or both before or after conversion.
+While this is important and powerful, it comes with complications. Internet access is not always a given, making such runtime dependencies fragile. XML systems that rely on URIs frequently also support one or another kind of URI indirection, such as [OASIS XML Catalogs](https://www.oasis-open.org/committees/entity/spec-2001-08-06.html), to enable resource management, redirection and local caching of standard resources. For the XProc developer, this can be a silent source of bugs, hard to find and hard to duplicate and analyze. The [next lesson unit](oscal-convert_102.md) describes some functions that can be used to provide the transparency needed.
-Learn more about recognizing and dealing with errors in [Lesson 102](oscal-convert_102.md), or continue on to the next project, oscal-validate, for more on validation of documents and sets of documents.
+## More catalogs needed!
+
+As we go to press we have only one example OSCAL catalog to use for this exercise.
+
+Other valid OSCAL catalogs are produced from other projects in this repo, specifically [CPRT import](../../../projects/CPRT-import/) and [FM6-22-IMPORT](../../../projects/FM6-22-import/). Run the pipelines in those projects to produce more catalogs (in XML) useful as inputs here.
diff --git a/tutorial/sequence/Lesson03/oscal-convert_102.md b/tutorial/sequence/Lesson03/oscal-convert_102.md
index eb3a4e1..8fd9989 100644
--- a/tutorial/sequence/Lesson03/oscal-convert_102.md
+++ b/tutorial/sequence/Lesson03/oscal-convert_102.md
@@ -10,13 +10,13 @@
Learn how OSCAL data can be converted between JSON and XML formats, using XProc.
-Learn something about potential problems and limitations when doing this, and about how to detect, avoid, prevent or mitigate them.
+Learn about potential problems and limitations when doing this, and about how to detect, avoid, prevent or mitigate them.
-Work with XProc features designed for handling JSON data (XDM **map** objects that can be cast to XML).
+Learn something about XProc features designed for handling JSON data (XDM **map** objects that can be cast to XML).
## Prerequisites
-Run the pipelines described in [the 101 Lesson](https://github.com/usnistgov/oscal-xproc3/discussions/18)
+Run the pipelines described in [the 101 Lesson Unit](oscal-convert_101.md) in this topic.
## Resources
@@ -26,24 +26,24 @@ Same as the [101 lesson](oscal-convert_101.md).
Every project you examine provides an opportunity to alter pipelines and see how they fail when not encoded correctly – when “broken”, any way we can think of breaking them. Then build good habits by repairing the damage. Experiment and observation bring learning.
-After reading this page and [the project readme](../../../projects/oscal-convert/readme.md), run the pipelines while performing some more disassembly / reassembly. Here are a few ideas (including a few you may have already done):
+After reading this page and [the project readme](../../../projects/oscal-convert/readme.md), run the pipelines while performing some more disassembly / reassembly. Here are a few ideas:
* Switch out the value of an `@href` on a `p:document` or `p:load` step. See what happens when the file it points to is not actually there.
* There is a difference between `p:input`, used to configure a pipeline in its prologue, and `p:load`, a step that loads data. Ponder what these differences are. Try changing a pipeline that uses one into a pipeline that uses the other.
-* Similarly, there is a difference between a `p:output` configuration for a pipeline, and a `p:store` step executed by that pipeline. Consider this difference and how we might define a rule for when to prefer one or the other. How is the pipeline used - is it called directly, or intended for use as a step in other pipelines? How is it to be controlled at runtime?
+* Similarly, there is a difference between a `p:output` configuration for a pipeline, and a `p:store` step executed by that pipeline. Consider this difference and how we might define a rule for when to prefer one or the other. How is the pipeline used – is it called directly, or intended for use as a step in other pipelines? How is it to be controlled at runtime?
* Try inserting `p:store` steps into a pipeline to capture intermediate results, that is, the output of any step before they are processed by the next step. Such steps can aid in debugging, among other uses.
* `@message` attributes on steps provide messages for the runtime traceback. They are optional but this repo follows a rule that any `p:load` or `p:store` should be provided with a message. Why?
-* A `p:identity` step passes its input unchanged to the next step. But can also be provided with a `@message`.
+* A `p:identity` step passes its input unchanged to the next step. It can also be provided with a `@message`. The two commonest uses of `p:identity` are probably to provide for a “no-op” option, for example within a conditional or try/catch – and to provide runtime messages to the console.
After breaking anything, restore it to working order. Create modified copies of any pipelines for further analysis and discussion.
* Concept: copy and change one of the pipelines provided to acquire a software library or resource of your choice.
-## Value templates in attributes and text: { expr }
+## Value templates in attributes and text: { XPath-expr }
-Practitioners of XQuery, XSLT and related technologies will recognize the curly-bracket characters (U+007B and U+007D) as indicators of [attribute value templates](https://www.w3.org/TR/xslt-10/#dt-attribute-value-template), [text value templates](https://www.w3.org/TR/xslt-30/#text-value-templates), or [enclosed expressions](https://www.w3.org/TR/xquery-31/#id-enclosed-expr). The expression within the braces is to be evaluated dynamically by the processor. This is one of the most useful convenience features in the language.
+Practitioners of XQuery, XSLT and related technologies will recognize the curly-bracket characters (U+007B and U+007D) as indicators of [attribute value templates](https://www.w3.org/TR/xslt-10/#dt-attribute-value-template), [text value templates](https://www.w3.org/TR/xslt-30/#text-value-templates), or [enclosed expressions](https://www.w3.org/TR/xquery-31/#id-enclosed-expr). The expression within the brackets is to be evaluated dynamically by the processor. This is one of the most useful convenience features in the language.
-These quickly become invisible. Upon seeing
+[This syntax](https://spec.xproc.org/3.0/xproc/#value-templates) is concise, but expressive. Upon seeing:
```
@@ -51,16 +51,18 @@ These quickly become invisible. Upon seeing
the XProc developer understands:
-* The date, in some form (try it and see) should be written into the message
-* The variable reference `$filename` is defined somewhere, and here will expand to a string
+* The date, in some form, should be written into the message. (Try it and see.) The XPath function [format-date](https://www.w3.org/TR/xpath-functions-31/#func-format-date) can also be used if we want a different format: for example, `current-date() => format-date('[D] [MNn] [Y]')`.
+* The variable reference `$filename` is defined somewhere, and here will expand to a string value due to the operation of the (attribute value) template.
-If you need to see actual curly braces, escape by doubling: `{{` for the single open and `}}` for the single close.
+If you need to see actual curly brackets, escape by doubling: `{{` for the single open and `}}` for the single close.
-Extra care must be taken with embedded XSLT and XQuery due to this feature, since their functioning will depend on correctly interpreting these within literal code. Yes, double escaping is sometimes necessary. (This can be tried with [a worksheet XProc](../../worksheets/NAMESPACE_worksheet.xpl).)
+One complication arises: because XSLT and XQuery support similar syntax, clashes can occur, since their functioning will depend on correctly interpreting the syntax within literal code. Yes, this means double escaping is sometimes necessary. (This can be tried with [a worksheet XProc](../../worksheets/NAMESPACE_worksheet.xpl).)
-Setting `expand-text` to `false` on an XProc element turns this behavior off: the braces become regular braces again. [The spec also describes](https://spec.xproc.org/3.0/xproc/#expand-text-attribute) a `p:inline-expand-text` attribute that can be used in places (namely inside literal XML provided in your XProc using `p:inline`) where the regular expand-text has no effect. Either setting can be used inside elements already set, resulting in “toggling” behavior (it can be turned on and off), as any `expand-text` applies to override settings on its ancestors.
+Alternatively, setting `expand-text` to `false` on an XProc element turns this behavior off: the brackets become regular brackets again. [The spec also describes](https://spec.xproc.org/3.0/xproc/#expand-text-attribute) an attribute `p:inline-expand-text` that can be used in places where the regular `expand-text` would interfere with a functional requirement (namely the representation of literal XML provided in your XProc using `p:inline`). Either of these settings can be used inside elements already set, resulting in “toggling” behavior (it can be turned on and off), as any `expand-text`, by applying to descendants, overrides settings on its ancestors.
-## Designating an input at runtime by binding input ports
+For the most part it is enough to know that the `expand-text` setting is “on” (`true`) by default, but it can be turned off (`false`) – and (for handling edge cases) back on, lower down in the hierarchy.
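+
+A sketch of the toggle in action (`p:inline-expand-text` appears on the non-XProc element, as described above):
+
+```
+<p:identity>
+  <p:with-input>
+    <doc>
+      <note p:inline-expand-text="false">These {braces} stay literal.</note>
+      <note>Today is {current-date()}.</note>
+    </doc>
+  </p:with-input>
+</p:identity>
+```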
+
+## Designating inputs
One potential problem with the pipelines we have looked at so far is that their inputs are hard-wired. While this is sometimes helpful, it should also be possible to apply a pipeline to an XML document (or other input) without having to designate the document inside the pipeline itself. The user or calling application should be able to say “run this pipeline, but this time with this input”.
@@ -74,7 +76,7 @@ For example, the [CONVERT-OSCAL-XML-DATA](../../../projects/oscal-convert/CONVER
```
-By default, this pipeline will pick up and process the data it finds at path `data/catalog-model/xml/cat_catalog.xml`, relative to the stylesheet. But any call to this pipeline, whether directly or as a step in another pipeline, can override this.
+By default, this pipeline will pick up and process the data it finds at path `data/catalog-model/xml/cat_catalog.xml`, relative to the pipeline instance (XProc file). But any call to this pipeline, whether directly or as a step in another pipeline, can override this.
The Morgana processor defines [a command syntax for binding inputs to ports](https://www.xml-project.com/manual/ch01.html#R_ch1_s1_2). It looks like this (when used with the script deployed with this repository):
@@ -82,13 +84,19 @@ The Morgana processor defines [a command syntax for binding input
$ ../xp3.sh *PIPELINE.xpl* -input:*portname=path/to/a-document.xml* -input:*portname=path/to/another-document.xml*
```
-Here, two different `-input` arguments are given for the same port. You can have as many as needed if the port, like this one, has `sequence="true"`, meaning any number of documents (from zero to many) can be bound to the port, and the pipeline will accommodate. When more than one port is defined, one (only) can be designated as `primary="true"`, meaning it will be provided implicitly when a port connection is required (by a step) but not given in the pipeline. Notice that the name of the port must also appear, as in `-input:portname`, since pipelines can have ports supporting sequences, but also as many input ports as it needs, named differently, for documents playing different roles in the pipeline. In place of `portname` here, a common name for a port (conventional when it is the pipeline's only or primary input) is `source`.
+Here, two different `-input` arguments are given for the same port. You can have as many as needed if the port, like this one, has `sequence="true"`, meaning any number of documents (zero, one or more) can be bound to the port, and the pipeline will accommodate. When more than one port is defined for a pipeline, one (only) can be designated as `primary="true"`, allowing it to be provided implicitly when a port connection is required (by a step) but not given in the pipeline. Notice that the name of the port must also appear in the command argument, as in `-input:portname`, since a pipeline, besides supporting sequences on a port, may define any number of input ports, named differently, for documents playing different roles in the pipeline.
+
+In place of `portname` here, a common name for a port (conventional when it is the pipeline's only or primary input) is `source`. But you can also expect to see ports (especially secondary ports) with names like `schema`, `stylesheet` and `insertion`: port names that offer hints as to what the step does.
+
+A port designated with `sequence="true"` can be empty (no documents at all) and a process will run. But by default a single document is both expected and required.
+
+Among other things, this means that a pipeline that has an input port declared as `<p:input port="x"/>`, since it is not a sequence but also has no document, cannot be run unless a (single) document for the `x` port (as it is named here) is provided when it is invoked.
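+
+In a pipeline prologue, these declarations might look like this (a sketch):
+
+```
+<!-- a primary port accepting any number of documents, and a secondary single-document port -->
+<p:input port="source" sequence="true" primary="true"/>
+<p:input port="x"/>
+```
+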
-### Binding to input ports vs p:load steps
+### Lightening the `p:load`
-XProc offers two ways to acquire data from outside the pipeline: by using `p:load` or by binding inputs to an input port using `p:input/p:document`. These are somewhat different in operation - errors produced by `p:load` cannot be detected until the pipeline is run, whereas failures with `p:input` should be detected when the pipeline itself is loaded and compiled (i.e. during *static analysis*), and processors may be able to apply different kinds of exception handling, fallbacks or support for redirects. (As always you can try, test and determine for yourself.) Apart from this distinction the two approaches have similar effects – whether to use one or the other depends often on how you expect the pipeline to be used and distributed, not on whether it works.
+As an alternative to binding inputs using `p:input/p:document` (on a pipeline definition) or `p:with-input` (on a step invocation), XProc offers another way to acquire data from outside the pipeline: by using a `p:load` step. This is somewhat different in operation: as it is a step in itself, errors produced by `p:load` cannot be detected until the pipeline is run, whereas failures with `p:input` should be detected when the pipeline itself is loaded and compiled (i.e. during *static analysis*), and processors may be able to apply different kinds of exception handling, fallbacks or support for redirects. (As always you can try, test and determine for yourself.) Apart from this distinction the two approaches have similar effects – whether to use one or the other depends often on how you expect the pipeline to be used, distributed, and maintained, since either can work in operation.
-Although one distinction is that p:document appears on input ports, which can be overridden, this does not mean that p:document can't be essentially “private” to a pipeline or pipeline step. For example, if you wish to acquire more than a single document, without p:load, known in advance (i.e. the file names can be hard-coded), make a step like this:
+Although one distinction is that `p:document` appears on input ports, which can be overridden (or rather, set dynamically), this does not mean that `p:document` cannot be essentially “private” to a pipeline or pipeline step. For example, if you wish to acquire, without `p:load`, more than a single document known in advance (i.e. the file names can be hard-coded), provide your step (`p:identity` in this case) with inputs like so:
```
@@ -100,9 +108,9 @@ Although one distinction is that p:document appears on input ports, which can be
```
-This binds the documents to the input of an **identity** step (which supports a sequence), without exposing an input port in the main pipeline.
+This binds the documents to the input of the step (as `p:identity` supports a sequence, more than one is fine), without exposing an input port in the main pipeline.
-A more dynamic approach is sometimes useful: first, acquire a list of file names, for example:
+Combining the approaches permits another useful capability: first, acquire a list of file names, for example (here using `p:input/p:inline`):
```
@@ -130,37 +138,25 @@ One tradeoff is that the override mechanism will be different. We override the f
This makes the second approach especially appealing if the file list can be derived from some kind of metadata resource or, indeed, `p:directory-list`….
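+
+For example (a sketch – the directory path is hypothetical):
+
+```
+<!-- emits a c:directory document listing the matching files -->
+<p:directory-list path="data/catalog-model/xml" include-filter="\.xml$"/>
+```
+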
-## Identity pipeline testbed
-
-An identity or “near-identity” or modified-identity pipeline has its uses, including diagnostics. Since inputs and outputs are supposed to look the same, any changes they show between inputs and outputs can be revealing.
-
-They are also useful for testing features in your environment or setup, for example features for resource acquisition and disposition, that is, how you get data into your pipeline and then out again.
-
-Additionally, there are actually useful operations supported by a pipeline that presents its input unchanged with respect to its model. For example, it can be used to transcode a file from one encoding to another – changing nothing in the data, but rewriting it into a different character set. This is because with XProc, transcoding does not actually happen within the pipeline, but on its boundaries - when a file is read, or written (aka serialized). So internally, a pipeline set up to do this doesn't have any action to take.
+## Warning: do you know where your source files are?
-### 0.01 - what is a “document”
+As noted in the [101 Lesson Unit](oscal-convert_101.md), one of the advantages of using URIs, over and above the Internet itself, is that systems can support URI redirection when appropriate. This will ordinarily be in order to provide local (cached) copies of standard resources, thereby mitigating the need for copying files over the Internet. While this is a powerful and useful feature – arguably essential for systems at scale – it can present problems for transparency and debugging if the resource obtained by reference to a URI is not the same as the developer (or “contract”) expects.
-Just about any kind of digital input can be an XProc document. Keeping things simple and regular, XProc's concept of document is broad enough to encompass XML, HTML, JSON and other kinds of inputs including plain text and binaries. [Read more here](oscal-convert_402.md).
+A similar problem results from variations in URI syntax, due both to the syntax itself and to the fact that URIs can be given as relative references, so `file.xml` and `../file.xml` could be the same file, or not, depending on the context of evaluation.
-### 0.1 - loading documents known or discovered in advance
+To help avoid or manage problems resulting from this (i.e., from features as bugs), XPath and XProc offer some useful functions:
-The XProc step `p:load` can be used to load the resource indicated into the pipeline.
+* XPath [resolve-uri()](https://www.w3.org/TR/xpath-functions-31/#func-resolve-uri) can be used to expand a relative URI into an absolute URI
+* XProc [p:urify](https://spec.xproc.org/3.0/xproc/#f.urify) will normalize URIs and rewrite file system paths as URIs – very useful (see the sketch after this list).
+* In XProc 3.1, a new function [p:lookup-uri](https://spec.xproc.org/lastcall-2024-08/head/xproc/#f.lookup-uri) can query the processor's URI resolver regarding a URI, without actually retrieving its resource. This makes available to the developer what address is actually to be used when a URI is followed – detecting any redirection – and permits defensive code to be written when appropriate.
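+
+A sketch combining two of these (the path is hypothetical):
+
+```
+<!-- normalize a file system path, then load from the resulting URI -->
+<p:variable name="source-uri" select="p:urify('data\catalog-model\xml\cat_catalog.xml')"/>
+<p:load href="{$source-uri}" message="Loading {$source-uri}"/>
+```
+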
-Watch out, since `p:load` with `href=""` – loading the resource at the location indicated by the empty string, `""` – will load the XProc file itself. This is conformant with rules for URL resolution.
+## Probing error space – data conversions
-### 0.2 - binding a document to an input port
+Broadly speaking, problems encountered running these conversions (or indeed, transformations in general) fall into two categories, the distinction being simple, namely whether a bad outcome is due to an error in the processor and its logic, or in the data inputs provided. The term “error” here hides a great deal. So does “bad outcome”. One type of bad outcome takes the form of failures at runtime – the term “failure” again leaving questions open, while at the same time it seems fair to assume that not being able to conclude successfully is a bad thing. But other bad outcomes are not detectable at runtime. If inputs are bad (inconsistent with stated contracts such as data validation), processes can run *correctly* and deliver incorrect results: correctly representing inputs, in their incorrectness. Again, the term *correct* here is underspecified and underdefined, except with reference to a particular case.
-### 0.3 - loading documents discovered dynamically with `p:directory-list`
+For these and other reasons we sometimes prefer to call them “exceptions”, while at the same time we know many errors are not actually errors in the process but in the inputs. We need reliable ways to tell this difference. A library of reliable source examples – a test suite – is one asset that helps a great deal. Even short of unit tests, however, a great deal can be discovered when working with “bad inputs” interactively. This knowledge is especially valuable once we are dealing with examples that are only “normally bad”.
-### 0.4 - saving results to the file system
-
-### 0.5 - exposing results on an output port
-
-## Probing error space - data conversions
-
-Broadly speaking, problems encountered running these conversions fall into two categories, the distinction being simple, namely whether a bad outcome is due to an error in the processor and its logic, or in the data inputs provided. The term “error” here hides a great deal. So does “bad outcome”. One type of bad outcome takes the form of failures at runtime - the term “failure” again leaving questions open, while at the same time it seems fair to assume that not being able to conclude successfully, is bad. But other bad outcomes are not detectable at runtime. If inputs are bad (inconsistent with stated contracts such as data validation), processes can run *correctly* and deliver incorrect results: correctly representing inputs, in their incorrectness. Again, the term *correct* here is underspecified and underdefined, except in the case.
-
-For these and other reasons we sometimes prefer to call them “exceptions”, while at the same time we know many errors are not actually errors in the process but in the inputs. We need reliable ways to tell this difference. A library of reliable source examples -- a test suite – is one asset that helps a great deal. Even short of unit tests, however, a great deal can be discovered when working with “bad inputs” interactively.
+Some ideas on how to do this appear below.
### Converting broken XML or JSON
@@ -170,30 +166,58 @@ Create a syntactically-invalid (not **well-formed**) XML or JSON document - or r
### Converting not-OSCAL
-XML practitioners understand how XML can be well-formed and therefore legible for processing, without being a valid instance of a specific markup vocabulary. You can have XML, for example, without having OSCAL.
+XML practitioners understand how XML can be well-formed and therefore legible for processing, without being a valid instance of a specific markup vocabulary. You can have XML, for example, without having OSCAL. This was discussed in [the previous lesson unit](oscal-convert_101.md).
+
+But a hands-on appreciation, through experience, of how this actually looks, is better than a merely intellectual understanding of why it must be.
+
+When providing XML that is not OSCAL to a process that expects OSCAL inputs, you should properly see either errors (exceptions), or bad results (outputs missing or wrongly expressed) or both. *A tutorial is the perfect opportunity to experiment and see.*
-When providing XML that is not OSCAL to a process that expects OSCAL inputs, you should properly see either errors (exceptions), or bad results (outputs missing or wrongly expressed) or both. *Experiment and see!*
+For example, try using the OSCAL XML-to-JSON pipeline on an XProc document (which is XML, but not OSCAL).
-Detection of bad results is an important capability - why we have validation against external constraint sets such as schemas. A later unit will cover this – meanwhile, inquiries on the topic are welcome.
+The interesting thing here is how permissive XProc is, unless we code it to be jealous. Detection of bad results is an important capability, which is why we also need to be able to *validate* data against external constraint sets such as schemas, also covered in more detail later.
### Converting broken OSCAL
The same thing applies to attempting to process inputs when OSCAL is expected, yet the data sources fail to meet requirements in some important respect, sometimes even a subtle requirement, depending on the case. The more fundamental problem here is the definition of “correct” versus “broken”.
-We begin generally with the stipulation that by “OSCAL” what we mean is, any XML (or JSON or YAML) instance conformant to an OSCAL schema, and thereby defined in such a manner as to enable their convertibility. The reasoning is thus somewhat circular. If we can convert it successfully, we can claim to know it as OSCAL (by virtue of the knowledge we demonstrate in the conversion). If we know it to be OSCAL by virtue of schema validation, we have assurances also regarding its convertibility.
+We begin generally with the stipulation that by “OSCAL” we mean any XML (or JSON or YAML) instance conformant to an OSCAL schema, and thereby defined in such a manner as to enable its convertibility. The reasoning is thus somewhat circular. If we can convert it successfully, we have a basis to claim it is OSCAL, by virtue of its *evident* conformance to OSCAL models in operation. If we know it to be OSCAL by virtue of schema validation, we have assurances also regarding its convertibility.
-This is because with respect to these model-based conversions, the OSCAL project also offers tools that can convert any schema-valid OSCAL XML into equivalent schema-valid JSON, while doing the same the other way, making OSCAL XML from OSCAL JSON. In either case, schema validation is invaluable for defining the boundaries of the conversion itself. Data that is not schema-valid, it is reasoned, cannot be qualified or described at all, so no straightforward mapping from arbitrary inputs can be specified. But a mapping can be specified for inputs that are known, namely OSCAL inputs. The converter respects the validation rule not by enforcing it directly, but rather by depending on it.
+In contrast, data that is not schema-valid (as can be reasoned) cannot be *confidently* and *completely* qualified or described at all, so only very simple (“global”, generic or “wildcard”) mappings from arbitrary inputs can be specified. But a mapping can be specified for inputs that are known, such as OSCAL inputs. An OSCAL converter respects the validation rules not by enforcing them directly, but rather by depending on the consistency they describe and constrain.
-Fortunately, by means of Schematron and transformations, XProc is an excellent tool not only for altering data sets, but also for detecting variances, either in inputs or its results, from any specifications that can be expressed in XPath. These capabilities – detection and amelioration – can be used together, and separately. When a pipeline cannot guarantee correct outputs, it can at least provide feedback.
+Fortunately, by means of Schematron and transformations, XProc is an excellent tool not only for altering data sets, but also for imposing such validation rules, by detecting variances in either its inputs or its results. XPath, the query language, becomes key. With XPath to identify features (both good and bad), and XProc for modifications, these capabilities – detection and amelioration – can be used together or separately. When a pipeline cannot guarantee correct outputs, it can at least provide feedback.
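+
+For instance – a sketch only, with a hypothetical schema path – Schematron findings can be collected without failing the pipeline:
+
+```
+<!-- With assert-valid="false", the step reports variances (as SVRL on its
+     'report' port) rather than halting the pipeline. -->
+<p:validate-with-schematron name="spot-check" assert-valid="false">
+  <p:with-input port="schema" href="checks/oscal-spot-checks.sch"/>
+</p:validate-with-schematron>
+```
+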
-Altering XML to “break” it in various subtle ways is likely to happen by accident. Get used to the feeling by *making it happen* on purpose.
+Depending on the application and data sources, XML that is “broken” in various subtle ways is more or less inevitable. See what it looks like by making this happen on purpose.
## XProc diagnostic how-to
+These methods are noted above, but they are so important they should not be skipped.
+
### Emitting runtime messages
+Most XProc steps support a `message` attribute for designating a message to be emitted to the console or log. As the example below shows, message values are treated as attribute value templates (AVTs), permitting dynamic evaluation of XPath inline.
+
+For example, again using `p:identity`:
+
+```
+<p:identity message="Base URI: {base-uri()} ({p:document-property(., 'content-type')})"/>
+```
+
+This step does not change the document, but reports its current Base URI and content-type at that point in the pipeline.
+
+This can be useful information since both those properties can (and should) change based on your pipeline's operations.
+
### Saving out interim results
-`p:store`
+Learn to use the `p:store` step, if only because it is so useful for saving interim pipeline results to a place where they can be inspected.
+
+[Produce-FM6-22-chapter4](../../../projects/FM6-22-import/PRODUCE_FM6-22-chapter4.xpl) is a demonstration pipeline in this repo with a switch at the top level, in the form of an option named `writing-all`. When set to `true()`, it has the effect of activating a set of `p:store` steps within the pipeline using the XProc [use-when](https://spec.xproc.org/3.0/xproc/#use-when) feature, to write intermediate results. The resulting set of files is written into a `temp` directory to keep them separate from final results: they show the changes being made over the input data set, at useful points for tracing the pipeline's progress.
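+
+As a minimal sketch of the pattern (the option and output names here are illustrative, not copied from that pipeline):
+
+```
+<!-- A static option can gate steps via use-when, which is evaluated statically. -->
+<p:option name="writing-all" static="true" select="false()"/>
+
+<!-- Later in the pipeline: this step is present only when writing-all is true. -->
+<p:store use-when="$writing-all" href="temp/after-enrichment.xml"/>
+```
+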
## Validate early and often
+
+One way to manage the problem of ensuring quality is to validate the inputs before processing, either as a dependent (prerequisite) process, or built into a pipeline. This enables a useful separation between problems resulting from bad inputs, and problems within the pipeline. Whatever you want to do with invalid inputs, including skipping or ignoring them, producing warnings or runtime exceptions, or even making corrections when possible and practical – all this can be defined in a pipeline much like anything else.
+
+Keep in mind that since XProc brings support for multiple schema languages plus XPath, “validation” could mean almost anything. What it means must be determined case by case.
+
+In the [publishing demonstration project](../../../projects/oscal-publish/publish-oscal-catalog.xpl) is an XProc that validates XML against an OSCAL schema before running steps to convert it to HTML, for display in a browser. The same could be done for an XProc that converts OSCAL data into JSON -- since OSCAL has both XSD for XML, and JSON Schema for JSON, this could be done before the conversion, after, or both.
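+
+As a hedged sketch (the schema reference is hypothetical), such a validation gate can be a single step:
+
+```
+<!-- By default this step throws an error, halting the pipeline,
+     unless the input is schema-valid. -->
+<p:validate-with-xml-schema name="gate">
+  <p:with-input port="schema" href="lib/oscal_catalog_schema.xsd"/>
+</p:validate-with-xml-schema>
+```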
+
+Two projects in this repository (at time of writing) deal extensively with validation: [oscal-validate](../../../projects/oscal-validate/) and [schema-field-tests](../../../projects/schema-field-tests/).
diff --git a/tutorial/sequence/Lesson03/oscal-convert_201.md b/tutorial/sequence/Lesson03/oscal-convert_201.md
index 53192dd..ccb53ac 100644
--- a/tutorial/sequence/Lesson03/oscal-convert_201.md
+++ b/tutorial/sequence/Lesson03/oscal-convert_201.md
@@ -22,7 +22,7 @@ Pipelines throughout the repository serve as examples for the description that f
## XProc as XML
-An XProc pipeline takes the form of an XML “document entity”. Unless you are concerned to write an XML parser (which is not very likely for XProc's natural constituency), this typically means an XML file, that is to say a file encoded in plain text (typically the UTF-8 serialization of Unicode, or alternatively another form of “plain text” supported by your system or environment), and following the rules of XML syntax. These rules include how elements and attributes and other XML features are encoded in **tags** that
+An XProc pipeline takes the form of an XML “document entity”. Unless you are concerned to write an XML parser (which is not very likely for XProc's natural constituency), this typically means an XML file, that is to say a file encoded in plain text (typically the UTF-8 serialization of Unicode, or alternatively another form of “plain text” supported by your system or environment), and following the rules of XML syntax. These rules include how elements and attributes and other XML features are encoded in **tags** that:
* Follow the rules with respect to naming, whitespace, delimiters and reserved characters
* Are correctly balanced, with an end tag for every start tag – for a `<tag>` there must be a `</tag>` (an end to the start).
@@ -74,7 +74,7 @@ On `p:declare-step`, whether at the top or in a step definition within a pipelin
The name makes it possible to reference the step by name. This is often useful and sometimes more or less essential, for example for providing input to one step from another step's output. (We say “more or less essential” because the processor will produce names for itself as a fallback, if it needs them, but these are brittle, somewhat opaque – such as `!1.2.3` – and more difficult to use than the names a developer gives.)
-Understandably, the name of an XProc must be different from the names given to all the steps in the XProc (which must also be distinct).
+Understandably, the name of an XProc must be different from the names given to all the steps in the XProc (which must also be distinct).
This repository follows a rule that a step name should correspond to its file base name (i.e., without a filename suffix), so `identity_` for `identity_.xproc`, etc. But that is a rule for us, not for XProc in general.
@@ -107,7 +107,7 @@ After imports, prologue and (optional) step declarations, the step sequence that
One other complication: among the steps in the subpipeline, `p:variable` (a variable declaration) and `p:documentation` (for out-of-band documentation) are also permitted – these are not properly steps, but can be useful to have with them.
-In summary: any XProc pipeline, viewed as a step declaration, can have the following --
+In summary: any XProc pipeline, viewed as a step declaration, can have the following:
* Pipeline name and type assignment (if needed), given as attributes at the top
* **Imports**: step declarations, step libraries and functions to make available
@@ -190,7 +190,7 @@ But this is an important category, since such extensions may include XProc steps
One answer: The [XSpec smoke test](./../../../smoketest/TEST-XSPEC.xpl) calls an extension step named `ox:execute-xspec`, defined in an imported pipeline. In this document, the prefix `ox` is bound to a utility namespace, `http://csrc.nist.gov/ns/oscal-xproc3`.
-In an XProc pipeline (library or step declaration) one may also see additional namespaces, including
+In an XProc pipeline (library or step declaration) one may also see additional namespaces, including:
* The namespaces needed for XSLT, XSD, or other supported technology
* One or more namespaces deployed by the XProc author to support either steps or internal operations (for example, XSLT functions)
diff --git a/tutorial/sequence/Lesson03/oscal-convert_401.md b/tutorial/sequence/Lesson03/oscal-convert_401.md
index 62890f0..b56dfa9 100644
--- a/tutorial/sequence/Lesson03/oscal-convert_401.md
+++ b/tutorial/sequence/Lesson03/oscal-convert_401.md
@@ -8,7 +8,7 @@
## Goals
-Understand a little more about JSON and other data formats in an XML processing environment
+Understand a little more about JSON and other data formats in an XML processing environment.
## Resources
@@ -16,8 +16,6 @@ A [content-types worksheet XProc](../../worksheets/CONTENT-TYPE_worksheet.xpl) i
The pipeline [READ-JSON-TESTING.xpl](../../worksheets/READ-JSON-TESTING.xpl) provides an experimental surface for working functionality specifically related to JSON and XDM map objects.
-Find more treatment in [the next lesson unit](oscal-convert_402.md). This is a topic you can also learn by through trial and error.
-
## Exercise some options
The worksheets just cited provide an opportunity to try out `content-type` configuration options. Note how you can specify a content type that will serve as a constraint on inputs and outputs, analogous in some ways to the type signature on a function. And the step `p:cast-content-type` serves to cast one content type to another, according to [rules given in the Specification](https://spec.xproc.org/3.0/steps/#c.cast-content-type). Note that for this step to work, both inputs and outputs must conform to certain requirements: not everything can be cast! (A small sketch follows below.)
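+
+A small sketch of the step in action (assuming a JSON document in flight; the step and function come from the XProc 3.0 specification):
+
+```
+<!-- Cast JSON to its XML representation, then report the new content type. -->
+<p:cast-content-type content-type="application/xml"/>
+<p:identity message="content-type is now {p:document-property(., 'content-type')}"/>
+```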
@@ -27,4 +25,4 @@ The worksheets just cited provide an opportunity to try out `content-type` confi
* Use the function `p:document-property(.,'content-type')` to bring back the content type of a document on a port or in a pipeline. In XPath, `.` refers to a document designated as the (dynamic) processing context: so `p:document-property($doc,'content-type')` works for any $doc considered by XProc to be or serve as a *document*.
* Interestingly, this means we can expect to find content-type='application/json' whenever an XProc document is nothing more than an object or map – as can happen, by design.
-[READ-JSON-TESTING.xpl](../../worksheets/READ-JSON-TESTING.xpl) is a sandbox for playing with JSON objects as XDM maps. The [content-types worksheet](../../worksheets/CONTENT-TYPE_worksheet.xpl) is set up for trying content-type options on inputs and outputs.
+[READ-JSON-TESTING.xpl](../../worksheets/READ-JSON-TESTING.xpl) is a sandbox for playing with JSON objects as XDM maps. The [content-types worksheet](../../worksheets/CONTENT-TYPE_worksheet.xpl) is set up for trying content-type options on inputs and outputs.
diff --git a/tutorial/sequence/Lesson04/courseware_101.md b/tutorial/sequence/Lesson04/courseware_101.md
index e82bdfe..5687f3c 100644
--- a/tutorial/sequence/Lesson04/courseware_101.md
+++ b/tutorial/sequence/Lesson04/courseware_101.md
@@ -8,9 +8,9 @@
## Goals
-Understand better how this tutorial is produced
+Understand better how this tutorial is produced.
-See an example of a small but lightweight and scalable publishing system can be implemented in XProc and XSLT
+See an example of how a small but lightweight and scalable publishing system can be implemented in XProc and XSLT.
## Prerequisites
diff --git a/tutorial/sequence/Lesson04/courseware_219.md b/tutorial/sequence/Lesson04/courseware_219.md
index 962fefc..3b41025 100644
--- a/tutorial/sequence/Lesson04/courseware_219.md
+++ b/tutorial/sequence/Lesson04/courseware_219.md
@@ -12,7 +12,7 @@ Help yourself, your team and allies.
Produce a useful spin-off from a task or problem you need to master anyway.
-Learn not only by doing but by writing it down for yourself and others
+Learn not only by doing but by writing it down for yourself and others.
## Prerequisites
@@ -33,9 +33,9 @@ However, any text editor or programmers' coding environment also works (to whate
Astute readers will have observed that a markup-based deployment invites editing. But the authoring or data acquisition model of this tutorial is not Markdown-based - Markdown is paradoxically not used for its intended purpose but as one of several **publication** formats for this data set, which is currently written in an XML-based HTML5 tag set defined for the project. By writing, querying and indexing in XHTML we can use XProc from the start. Extensibility and flexibility in publication is one of the strengths - publishing a new or rearranged tutorial sequence can be done with a few lines and commands. A drag-and-drop interface supporting XProc makes this even easier, and it is already installed and running under CI/CD, meaning both editorial and code quality checks can be done with every commit.
-Improving a page is as simple as editing the copy found in XXX and XXX
+Improving a page is as simple as editing the copy found in XXX and XXX ...
-Making and deploying a new pages is a little harder: XXX
+Making and deploying a new page is a little harder: XXX ...
### Apply Schematron to your edits
diff --git a/tutorial/source/acquire/acquire_101_src.html b/tutorial/source/acquire/acquire_101_src.html
index db56efb..f1a578c 100644
--- a/tutorial/source/acquire/acquire_101_src.html
+++ b/tutorial/source/acquire/acquire_101_src.html
@@ -8,9 +8,10 @@
101: Project setup and installation
Goals
- Set up and run an XProc 3.0 pipeline in an XProc 3.0 engine. See the results.
- With a little practice, become comfortable running XProc pipelines, seeing results on a console (command
- line) window as well as in the file system.
+ Set up and run an XProc 3.0 pipeline in an XProc 3.0 engine.
+ Get some results. See them in the console (message tracebacks), the file system (new files acquired or
+ produced), or both.
+ With a little practice, become comfortable running XProc pipelines.
After the first script to get the XProc engine, we use XProc for subsequent downloads. Finishing the setup
gets you started practicing with the pipelines.
@@ -21,8 +22,9 @@ Prerequisites
If ready to proceed, you have a system with Java installed offering a JVM (Java Virtual Machine)
available on the command line (a JRE or JDK), version 8 or later.
Tip: check your Java version from the console using java --version.
- Also, you have a live Internet connection and the capability to download and save resources (binaries
- and code libraries) for local use.
+ Also, you have an Internet connection available and the capability to download and save resources
+ (binaries and code libraries) for local use. (There are no runtime dependencies on connecting, but some
+ XProc pipelines make requests over http/s.)
You are comfortable entering commands on the command line (i.e. terminal or console window). For
installation, you want a bash shell if available. On Windows, both WSL (Ubuntu) and Git Bash
have been found to work. If you cannot use bash, the setup can be done by hand (downloading and
@@ -33,6 +35,8 @@
Prerequisites
continue to use bash.
If you have already performed the setup as described in README and setup notes, this lesson unit will be a breeze.
+ Prior knowledge of XProc, XSLT or XML is not a prerequisite (for this or any lesson unit). If you are
+ learning as we go – at any level – welcome, and please seek us out for help and feedback.
Resources
@@ -65,11 +69,12 @@ Step One: Setup
distribution.
After running the setup script, or performing the installation by hand, make sure you can run all the smoke
tests successfully.
- As noted in the docs, if you happen already to have Morgana XProc III, you do not need to
- download it again. Try skipping straight to the smoke tests. You can use a runtime script
- xp3.sh
or xp3.bat
as a model for your own, and adjust. Any reasonably recent
- version of Morgana should function if configured correctly, and we are interested if it does not.
+ As noted in the docs, if you happen already to have an XProc 3.0 processor, you do not need to
+ download Morgana XProc III here. At time of writing (December 2024), such processors notably include
+ XML Calabash 3 (newly out in an alpha release). In any case, equipped with any conformant XProc 3.0/3.1
+ implementation, try skipping straight to the smoke tests. You can use a runtime script
+ xp3.sh or xp3.bat as a model for your own, and adjust.
Shortcut
If you want to run through the tutorial exercises but you are unsure of how deeply you will delve, you
@@ -101,51 +106,67 @@
Step Two: Confirm
Comments / review
Within this project as a whole, and within its subprojects, everything is done with XProc 3.0. The aim is to
- make it possible to do anything needed with XProc, and moreover to make any one thing needed with a single
- XProc pipeline, using a single script, which invokes an XProc processor to read and execute. This
- simplicity, with the replicability that goes with it, is at the center of the argument for XProc.
+ make it possible to do anything needed with XProc, regarded as a general-purpose 'scripting' solution for
+ the choreography of arbitrarily complex jobs, tasks and workflows. To support arbitrary complexity and
+ scalability together, it must be very simple. This simplicity, with the composability that goes with it, is
+ at the center of the argument for XProc.
+ You will see this simplicity at the level of top-level invocation: XProc pipelines designed to serve as
+ entry points. If things are done right, these will be fairly simple, well-encapsulated subroutines in
+ potentially elegant arrangements. They in turn may call on libraries of XProc pipelines for well-defined
+ tasks.
Effectively (and much more could be said about the processing stack, dependency management and so forth)
what this means is that XProc provides the user and the developer (in either or both roles) with focused and
concentrated points of control or points of adjustment. In the field – where software is deployed and used –
things almost never just drop in. User interfaces, APIs, dependencies and platform quirks: all these
constrain what users can do, and even developers are rarely as free as they would like to experiment and
explore.
- To the extent this is the case, this project only works if things are actually simple enough to pick up,
- use, learn and adapt.
- xp3.sh
and xp3.bat
represent attempts at this. Each of them (on its execution
- platform) enables a user to run, without further configuration, the Morgana XProcIIIse processor on any XProc
- 3.0 pipeline, assuming the appropriate platform for each (bash
in the case of the shell script,
- Windows batch command syntax for the bat
file). Other platforms supporting Java (and hence
- Morgana with its libraries) could be provided with similar scripts.
- Such a script itself must be vanilla
and generic: it simply invokes the processor with the designated
- pipeline, and stands back. The logic of operations is entirely encapsulated in the XProc pipeline
- designated. XProc 3.0 is both scalable and flexible enough to open a wide range of possibilities for data
- processing, both XML-based and using other formats such as JSON and plain text. It is the intent of this
- project not to explore and map this space – which is vast – but to show off enough XProc and related logic
- (XSLT, XSpec) to show how this exploration can be done. We are an outfitter at the beginning of what we hope
- will be many profitable voyages to places we have never been.
+ What is offered here is therefore both an example of a deployment of a demonstration solution set
+ using an open-source tool (an XProc engine capable of running the pipelines we offer), doing things that are
+ actually or potentially useful (with OSCAL data), and a set of pipelines that should in principle work
+ as well in any other tool or software deployment supporting XProc 3.0.
+ But to the extent this imposes a requirement for both abstract and testable conformance (or at any rate for
+ interoperability as a proxy for that), this project only works if things are actually simple enough to pick
+ up, use, learn and adapt. xp3.sh and xp3.bat represent attempts at making a simple
+ deployment: easy to emulate and, better yet, to improve on.
+ Each of these scripts (on its execution platform) enables a user to run, without further configuration, the
+ Morgana XProcIIIse processor on any XProc 3.0 pipeline, assuming the appropriate platform for each
+ (bash in the case of the shell script, Windows batch command syntax for the bat file). Providing a
+ similar script for XML
+ Calabash remains (with apologies to NDW) a desideratum for this project as we post this version of
+ the tutorial. Stay tuned!
+ In any case such a script itself must be vanilla and generic: it will simply invoke the processor
+ with the designated pipeline, and stand back. (Yes, runtime arguments and settings can be provided.) The
+ logic of operations is entirely encapsulated in the XProc pipeline designated. XProc 3.0 is both scalable
+ and flexible enough to open a wide range of possibilities for data processing, both XML-based and using
+ other formats such as JSON and plain text. It is the intent of this project not to explore and map this
+ space – which is vast – but to show off enough XProc and related logic (XSLT, XSpec) to show how this
+ exploration can be done. We are an outfitter at the beginning of what we hope will be many profitable
+ voyages to places we have never been.
When running from a command line
As simple examples, these scripts show only one way of running XProc. Keep in mind that even simple
- scripts can be used in more than one way.
+ scripts can be used in more than one way.
For example, a pipeline can be executed from the project root:
$ ./xp3.sh smoketest/TEST-XPROC3.xpl
Alternatively, a pipeline can be executed from its home directory, for example if currently in the
- smoketest
directory (note the path to the script):
+ smoketest
directory (note the path to the script):
$ ../xp3.sh TEST-XPROC3.xpl
- This works the same ways on Windows, with adjustments:
+ This works the same way on Windows, with adjustments:
> ..\xp3 TEST-XPROC3.xpl
(On Windows a bat file suffix marks it as executable and does not have to be given
explicitly when called.)
- Windows users (and others to varying degrees) can set up a drag-and-drop based workflow – using your
- mouse or pointer, select an XProc pipeline file and drag it to a shortcut for the executable (Windows
- batch file). A command window opens to show the operation of the pipeline. See the README for more information.
+ Windows users (and others to varying degrees) can set up a drag-and-drop based workflow –
+ using your mouse or pointer, select an XProc pipeline file and drag it to a shortcut for the executable
+ (Windows batch file). A command window opens to show the operation of the pipeline. See the README for more information.
- It is important to try things out since any of these methods can be the basis of a workflow.
+ It is important to try things out since any of these methods can be the basis of a workflow.
For the big picture, keep in mind that while the command line is useful for development and demonstration
- – and however familiar XProc itself may become to the developer – to a great number of people it remains
- obscure, cryptic and intimidating if not forbidding. Make yourself comfortable at the command line!
+ – and however familiar XProc itself may become to the developer – to a great number of people it remains,
+ like XProc, obscure, cryptic and intimidating if not forbidding.
+ This is a pity because (among other reasons) the kind of layered system we will see and build here is not
+ endless or infinitely complex. Begin by making yourself comfortable at the command line. See how the
+ pieces fit together by working them.
Then too, if you have something better, by all means use it. XProc-based systems, when integrated into
tools or developer editors and environments, can look much nicer than tracebacks in a console window. The
elegance and power we are trying to cultivate are at a deeper level. First and last, the focus must be on
diff --git a/tutorial/source/acquire/acquire_102_src.html b/tutorial/source/acquire/acquire_102_src.html
index 8f5b93d..b2680c1 100644
--- a/tutorial/source/acquire/acquire_102_src.html
+++ b/tutorial/source/acquire/acquire_102_src.html
@@ -8,8 +8,7 @@
102: Examining the setup
Goals
- Look at some pipeline organization and syntax on the inside
- - Success and failure invoking XProc pipelines: an early chance to
learn to die
gracefully (to use the
- gamers' idiom).
+ - Success and failure invoking XProc pipelines: making friends with tracebacks
@@ -19,9 +18,12 @@ Prerequisites
similar pipelines.
This discussion assumes basic knowledge of coding, the Internet (including retrieving resources via
file and http protocols), and web-based technologies including HTML.
- XML knowledge is also assumed. In particular, XProc uses XPath
- 3.1, the query language for XML. This latest version of XPath builds on XPath 1.0, so any XPath
- experience will help. In general, any XSLT or XQuery experience will be invaluable.
+ XML knowledge is not assumed. This poses a special challenge since in addition to its XML-based
+ syntax, XProc uses the XML Data Model (XDM) along with
+ XPath 3.1, the query language for XML: together, a deep
+ topic. We make the assumption that if you already know XML, XPath, XSLT or XQuery, much will be familiar,
+ and that you will be tolerant of some restatement for the sake of those who do not. (As we all start
+ somewhere, why not here?)
You will also need a programmer's plain text editor, XML/XSLT editor or IDE (integrated development
environment) for more interactive testing of the code.
@@ -33,15 +35,17 @@ Prerequisites
Step One: Inspect the pipelines
The two groupings of pipelines used in setup and testing can be considered separately.
The key to understanding both groups is to know that once the initial Setup
- script is run, Morgana can be invoked directly, as paths and scripts are already in place. In doing
- so – before extension libraries are in place – it can use only basic XProc steps, but those are enough to
- start with.
+ script is run, your processor or engine (such as Morgana) can be invoked directly, as paths
+ and scripts are already in place. In doing so – before extension libraries are in place – it can use only
+ basic XProc steps, but those are enough to start with.
Specifically, the pipelines can acquire resources from the Internet, save them locally, and perform
unarchiving (unzipping). Having been downloaded, each library provides software that the pipeline engine
(Morgana) can use to do more.
Accordingly, the first group of pipelines (in the lib directory) has
a single purpose, namely (together and separately) to download software to augment Morgana's feature
set.
+ If not using the open-source Morgana distribution, you can skip to smoke tests below, and see how far you
+ get.
- lib/GRAB-SAXON.xpl
- lib/GRAB-SCHXSLT.xpl
@@ -50,11 +54,13 @@ Step One: Inspect the pipelines
Pipelines in a second group work similarly in that each one exercises and tests capabilities provided by
software downloaded by a member of the first group.
Take a look at these files. It may be helpful (for those getting used to it) to envision the XML syntax as a
set of nested frames with labels and connectors.
@@ -93,7 +99,12 @@ For consideration
software developers together, who must define problems to be solved before approaches to them can be
found.
The open questions are: who can use XProc pipelines; and how can they be made more useful? The questions
- come up in an OSCAL context or any context where XML is demonstrably capable.
+ come up in an OSCAL context or any context where XML is demonstrably capable – indeed anywhere the
+ necessity of handling data with digital tools has become inescapable.
+ To help answer these questions, actual experience will be invaluable – part of our motive here.
+ Unless we can make the demonstration pipelines in this repository accessible, they cannot be reasoned about.
+ That accessibility requires not only open publication, but also use cases and user bases ready to take
+ advantage.
Having completed and tested the setup you are ready for work with XProc: proceed to the next lesson.
599: Meeting XProc
+
+ Goals
+ Gain some more sense of context.
+ XProc is not a simple thing with only one way in. The territory is vast, but it has also been well charted.
+ And here we have a pathway marked in front of us.
+
Resources
A Declarative Markup Bibliography is
- available on line for future reference on this theoretical topic.
+ available online for future reference on this interesting topic.
Some observations
- Because it is now centered on pipelines as much as on files and software packages, dependency
- management is different from other technologies including Java and NodeJS – how so?
+ Because it is now centered on pipelines built by combining the capabilities of steps
+ (which may be black boxes), as much as on files and software packages, dependency management when using
+ XProc is different from other technologies including Java and NodeJS – how so?
MorganaXProc-III is implemented in Scala, and Saxon is built in Java, but otherwise distributions including
the SchXSLT and XSpec distributions consist mainly of XSLT. This is either very good (with development and
maintenance requirements in view), or not good at all.
- Which is it, and what are the determining variables that tell you XProc is a good fit? How much of this is
- due to the high-level, abstracted nature of 4GLs including both XSLT
- 3.1 and XProc 3.0? Prior experience with XML-based systems and the problem domains in which they work well
- is probably a factor. How much are the impediments technical, and how much are they due to culture?
+ If not using Morgana but another XProc engine (at time of writing, XML Calabash 3 has been published in
+ alpha), there will presumably be analogous arrangements: contracts between the tool and its dependencies,
+ software or components and capabilities bundled and unbundled.
+ So does this work well, on balance, and what are the determining variables that tell you XProc is a good fit
+ for data processing, whether high-touch or at scale? How much of this is due to the high-level, abstracted
+ nature of 4GLs including
+ both XSLT 3.1 and XProc 3.0? Prior experience with XML-based systems and the problem domains in which they
+ work well is probably a consideration. But maybe the more important blockers have to do with culture, states
+ of knowledge, incorrect assumptions and outdated perceptions.
+ Will it always be that a developer determined to use XSLT will find a way, whereas a developer determined
+ not to, will find a way to refuse it? XProc in 2024 seems slow in adoption – maybe because everyone who
+ would want it already has a functional equivalent in place.
+ In any case, it might also be that such neglect creates a market opportunity. Those who use these
+ technologies without advertising the fact may have the most to gain. But building the commons is also a
+ common responsibility.
+ It's all about the tools. Find ways to support the open-source developers and the software development
+ operations that offer free tools and services.
Declarative markup in action
@@ -29,16 +48,33 @@ Declarative markup in action
depend, notably XProc and XSLT but not limited to these, are both nominally and actually conformant to
externally specified standard technologies, i.e. XProc and XSLT respectively (as well as others), and
reliant to the greatest possible extent on well-documented and accessible runtimes.
- It is a tall order to ask that any code base should be both easy to integrate and use with others, and at
- the same time, functionally complete and self-sufficient. Of these two, we are lucky to get one, even if we
- are thoughtful enough to limit ourselves to building blocks. Because the world is complex, we are always
+
Is it too much to expect that any code base should be both easy to integrate and use with others, and at the
+ same time, functionally complete and self-sufficient? Of these two, we are lucky to get one, even if we are
+ thoughtful enough to limit ourselves to building blocks. Because the world is complex, we are always
throwing in one or another new dependency, along with new rule sets. The approach enabled by XML and
openly-specified supporting specifications is to work by making everything as transparent as possible. We
seek clarity and transparency at all levels (so nothing is downloaded behind the scenes, for example) while
also documenting as thoroughly as we can, including with code comments.
- Can any code base be fully self-explanatory and self-disclosing? Doubtful, even assuming those terms are
- meaningful. But one can try and leave tracks and markers, at least. We call it code
with the hope and
- intent that it should be amenable to and rewarding of interpretation.
+ Can any code base be fully self-explanatory and self-disclosing? This may be doubtful even if we can agree
+ on what those terms mean. At the same time, the attempt can be made: we can try and leave tracks and markers,
+ at least. We call it code with the hope and intent that it should be amenable to and rewarding of
+ interpretation.
+
+ Standards for documents
+ In addition to the web itself (in HTML and CSS), a number of important initiatives in recent decades have
+ capitalized on the core principles of declarative markup:
+
+