InspectorXSLT
See the Discussion topic on this initiative.
PRs are also welcome as a form of feedback, including edits to this page.
The project has both concrete goals (enabling capabilities) and abstract goals (demonstrating principles, testing hypotheses).
"Validity" is usually determined for XML by a schema language such as XSD, RNG, DTD, sometimes in combination with Schematron (a query-based assertion language) or XPath.
However, the same set of properties or logical assertions that constitutes validity can be implemented by a single XSLT transformation, if the input rule set is limited to rules that are easily enforced with XSLT/XPath. Metaschema provides for exactly such a limitation, radically simplifying the requirements for full-stack 'schema emulation' in a simple transformation.*
This amounts to providing functionality equivalent to schema-based validation. Because the system of regularities imposed by the rules is the same, the functional requirements are equivalent: each source document ("valid" or "invalid") maps to a result that identifies it as valid or invalid by virtue of testing for intrinsic properties, not say-so. The outputs are not always the same, but they report the same variances from the expected or defined state.
The equivalence can be demonstrated by comparing the results of running an InspectorXSLT over a document or set of documents with the results of running the equivalent schema validation. Across the same set of instances, the same documents should be reported in both cases as valid or invalid, for equivalent reasons.
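The comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the project: the function name `compare_verdicts`, the file names, and the representation of each validator's results as a dictionary from document name to a valid/invalid verdict are all assumptions made for the example. How the verdicts are obtained from an InspectorXSLT run or from a schema validator is left out.

```python
def compare_verdicts(inspector, schema):
    """Compare valid/invalid verdicts from two validators over the same documents.

    Both arguments map a document name to True (valid) or False (invalid).
    Returns (agreed, discrepancies): 'agreed' maps each document on which the
    validators concur to the shared verdict; 'discrepancies' maps every other
    document to the pair (inspector verdict, schema verdict).
    """
    if inspector.keys() != schema.keys():
        raise ValueError("both validators must be run over the same set of documents")
    agreed = {}
    discrepancies = {}
    for doc in sorted(inspector):
        if inspector[doc] == schema[doc]:
            agreed[doc] = inspector[doc]
        else:
            discrepancies[doc] = (inspector[doc], schema[doc])
    return agreed, discrepancies


# Hypothetical verdicts for three test documents.
inspector_results = {"good.xml": True, "bad-datatype.xml": False, "edge-case.xml": True}
schema_results = {"good.xml": True, "bad-datatype.xml": False, "edge-case.xml": False}

agreed, discrepancies = compare_verdicts(inspector_results, schema_results)
print(agreed)
print(discrepancies)
```

An empty `discrepancies` result supports the equivalence claim for that document set. Any entry points either to a bug in one implementation or to a rule the two do not share. The sketch compares verdicts only; checking that documents are rejected "for equivalent reasons" would also require comparing the reported messages.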
* For markup language designers: Metaschema assemblies in v1.0 do not support the full range of XML element content model constructs for grouping, sequencing, and cardinality. Instead, they are limited to a subset that sits cleanly within the constraints of object/property modeling in JSON or YAML.
Because such an XSLT can be defined as a mapping from a Metaschema to an XSLT implementation of its rules, the mapping can be codified and the XSLT generated by a 'Metaschema transpiler' transformation, which produces the functional InspectorXSLT that checks the rules of that Metaschema module.
At this point the Inspector XSLT becomes useful in several ways:
- As a validator of your Metaschema-based XML format
- As a second validator. Where two implementations agree, the agreement itself is valuable
  - Reporting an error twice from two different systems is more than twice as good as reporting it once
  - Discrepancies in reporting expose variances and bugs in the implementations as well as the data
  - Thus cross-checking is as good as checking and sometimes better
- As such it is also 'full stack' - it replaces both schema validation and constraints validation (as might be provided via Schematron or another rules engine), and cross-checks against both.
- Because InspectorXSLT is XSLT-based, it can be deployed across a range of platforms with consistent results
  - May work in settings where you can't support another technology
  - Supports a wide range of workflows
- Convenience features
  - batch processing
  - customizing reports
  - adjusting log levels and console tracebacks
- For shops that already know XML/XSLT, or even that do not, this is a way into Metaschema
The implementation provides test metaschemas for trying out its features, or developers can use their own metaschemas or public ones such as OSCAL. We know there are bugs so patience is appreciated.
At time of writing, the distribution awaits merging and lives in a fork: https://github.com/wendellpiez/metaschema-xslt/tree/issue72-XSLT-inspectorA/src/schema-gen/InspectorXSLT
It contains readme files with specific directions (e.g., readme1 and readme2).
Please feel free to provide feedback where these directions are not clear, or on whether they should be moved to this wiki.
Once generated, an Inspector XSLT can be applied from the command line using an XSLT 3.0/XPath 3.1 engine (i.e., Saxon). Scripts are provided as demonstrations (requiring bash but free to port), using Apache Maven for Java libraries.
Similarly, either the scripts or direct invocation of the XSLT can produce file (report) outputs and batch process files, using features of either Saxon or XProc 1.0.
Currently documented in readmes, (script and XSLT) interfaces, and code comments, these are all features that could be documented on this wiki, if they are found useful.
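As a rough illustration of direct invocation, the following Python sketch assembles a Saxon command line and shows how it could be run. The options `-s:`, `-xsl:` and `-o:` and the entry point `net.sf.saxon.Transform` belong to Saxon's standard command-line interface. The jar location and all file names are placeholders, and the project's own scripts (which obtain Saxon through Maven) may pass different or additional options, so treat the readmes as the authority.

```python
import subprocess


def saxon_command(saxon_jar, stylesheet, source, report):
    """Build the argument list for one Saxon run over a single source document."""
    return [
        "java", "-cp", saxon_jar, "net.sf.saxon.Transform",
        f"-s:{source}",        # document to be checked
        f"-xsl:{stylesheet}",  # the generated Inspector XSLT
        f"-o:{report}",        # file that receives the report
    ]


def run_inspector(saxon_jar, stylesheet, source, report):
    """Run the transformation; raises CalledProcessError if Saxon exits non-zero."""
    subprocess.run(saxon_command(saxon_jar, stylesheet, source, report), check=True)


# All paths below are hypothetical placeholders.
command = saxon_command("lib/saxon-he.jar", "my-model_inspector.xsl",
                        "document.xml", "document-report.xml")
print(" ".join(command))
```

Building the argument list separately from running it keeps the command easy to inspect or log before anything is executed. Depending on the Saxon version, further jars may be needed on the classpath.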
- Confirm generally that it works, and how well it works
- Provide guidance and any tips for user documentation
- Contribute ideas for use cases
- If you like XSpec,
  - Try the XSpecs starting with functional XSpecs
  - Consider contributing more/better tests around more functional edges
- If you are learning XSLT
  - Examine the generated Inspector XSLT for legibility, traceability and debuggability
  - Does it make sense? How can it be improved for learners, analysts and assessors?
- If you really need it working, and not just experimentally
  - Consider contributing to a public implementation for your Metaschema including realistic functional (test) examples
  - OSCAL InspectorXSLT goes in OSCAL-xslt repository - and speak up in OSCAL channels
Even the most casual feedback is welcome. Indeed casual encouragement may be more welcome than more work to do.
- Email the principal developer w e n d e l l (dot) p i e z (at) n i s t (dot) g o v.
- A Discussion Board for this project hosts Q/A and free-form discussion
- Or bug reports / feature requests are welcome on the Issues board
- Join us in the NIST Metaschema Element channel (chat)
- Clone, copy, fork or reverse engineer this work, and let us know (but consider contributing first)
On deck next might be:
- OSCAL implementation: ready-to-go InspectorXSLT for OSCAL - will go in OSCAL XSLT repository
- Delivery in the browser / SaxonJS - see PoC demo at https://pages.nist.gov/oscal-tools/demos/csx/validator/
- Proper Metapath instead of 'faking it' - see iXML Breadboard
Let us know if these are the wrong priorities.
The tool requires XSLT 3.0/XPath 3.1 and Saxon 10+, hence Java/Maven or an alternative Saxon distribution.
We would be very interested to see the tool running under any other XSLT implementation.
This project aims to demonstrate not only the capability but also the viability of this approach, in combination with other approaches.
This necessitates that testing be intelligible, traceable and comprehensive.
This effort has only started (see the testing directory) but results are promising. XSpec testing is in place both for functional testing of an InspectorXSLT (testing against the Metaschema semantics) and for unit testing the production of that XSLT from Metaschema sources. A testing harness is provided that brings no additional dependencies beyond Saxon/XSLT 3.0 and Maven (supporting a Java runtime).
The elements of the constraints element set, allowed-values, matches, index-has-key (with index declaration), is-unique and expect, have been implemented and lightly tested. This is a promising area of work but needs testing and demonstration over realistic data sets.
The allowed-values functionality is behind the current draft spec - see https://github.com/usnistgov/metaschema/pull/413
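To make the intent of two of these constraints concrete, here is a small Python sketch using only the standard library. It is not how InspectorXSLT works (the real checks are generated as XSLT from a metaschema), and it does not follow the Metaschema specification in detail. It only illustrates the general idea that, in the simplest case, allowed-values restricts a value to an enumerated set and is-unique forbids repeated key values among its targets. The sample document, its element and attribute names, and both function names are invented for the illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data: the third item repeats an id and uses an unlisted status.
SAMPLE = """
<catalog>
  <item id="a"><status>open</status></item>
  <item id="b"><status>closed</status></item>
  <item id="a"><status>pending</status></item>
</catalog>
"""


def disallowed_values(root, target, allowed):
    """Return the text of every element selected by 'target' that is not in 'allowed'."""
    found = ((el.text or "").strip() for el in root.findall(target))
    return [value for value in found if value not in allowed]


def duplicate_keys(root, target, key_attribute):
    """Return key values that occur more than once among the selected elements."""
    keys = (el.get(key_attribute) for el in root.findall(target))
    counts = Counter(key for key in keys if key is not None)
    return sorted(key for key, n in counts.items() if n > 1)


root = ET.fromstring(SAMPLE)
print(disallowed_values(root, "./item/status", {"open", "closed"}))
print(duplicate_keys(root, "./item", "id"))
```

Each function returns the offending values rather than a bare true/false, in the spirit of a report that says what varies from the expected state and not only that something does.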
When used as Metapath, some complex XPaths in metaschema inputs may trigger bugs in the constraints implementation, because this processor supports only the 'pattern subset' of XPath (specifically, the syntax for selection patterns) for defining the targets of constraints. This may be a limitation compared with other prototype implementations of Metapath, although a path that doesn't work can generally be rewritten as one that does. In a next-gen InspectorXSLT, full path parsing will be supported (see iXML, above) and this will become a non-issue, either because we can rewrite the paths or because we refactor the approach entirely.