Merge remote-tracking branch 'github/master'
# Conflicts:
#	pom.xml
#	src/main/java/tv/mediagenix/xslt/transformer/server/Server.java
#	src/test/java/io/github/willemvlh/transformer/saxon/SaxonTransformerTest.java
willemvlh committed Aug 4, 2021
2 parents 4e84f2c + a8c383d commit d188767
Showing 100 changed files with 1,696 additions and 1,229 deletions.
23 changes: 23 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# This workflow will build a Java project with Maven
# For more information see: https://help.github.com/actions/language-and-framework-guides/building-and-testing-java-with-maven

name: Java CI with Maven

on:
  push:
  pull_request:

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - name: Set up JDK 8
      uses: actions/setup-java@v2
      with:
        java-version: '8'
        distribution: 'adopt'
    - name: Build with Maven
      run: mvn -B package --file pom.xml
9 changes: 4 additions & 5 deletions .github/workflows/release.yml
@@ -1,10 +1,9 @@
on:
push:
tags: v*
tags: [ v* ]
workflow_dispatch:


name: Upload Release Asset
name: Upload release

jobs:
build:
@@ -15,7 +14,7 @@ jobs:
uses: actions/checkout@v2
- name: Build project # This would actually build your project, using zip for an example artifact
run: |
mvn -B package
mvn -B clean && mvn -B install
- name: Move jar
run: cp ./target/*.jar ./target/artifact.jar
- name: Create Release
@@ -29,7 +28,7 @@ jobs:
draft: false
prerelease: false
- name: Upload Release Asset
id: upload-release-asset
id: upload-release-asset
uses: actions/upload-release-asset@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
16 changes: 16 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,16 @@
on:
  push:
  workflow_dispatch:

name: Run tests

jobs:
  build:
    name: Run tests
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Run tests
        run: |
          mvn -B clean && mvn -B package
4 changes: 3 additions & 1 deletion .gitignore
@@ -9,7 +9,9 @@ buildNumber.properties
.mvn/timing.properties
.mvn/wrapper/maven-wrapper.jar
*.jar
!lib/*.jar
*.zip
/.idea
keystore/
deploy.bat
deploy.bat
*.log
23 changes: 2 additions & 21 deletions LICENSE
@@ -1,21 +1,2 @@
MIT License

Copyright (c) 2020 Willem V.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
This software is provided under the CC-BY-SA 4.0 license.
See https://creativecommons.org/licenses/by-sa/4.0/ for more information.
3 changes: 0 additions & 3 deletions META-INF/MANIFEST.MF

This file was deleted.

144 changes: 117 additions & 27 deletions README.md
@@ -1,28 +1,123 @@
This application exposes a REST API to perform XSLT and XQuery transformations using the Saxon processor.
This application exposes an HTTP API to perform XSLT and XQuery transformations using the Saxon processor.

Start the server by running `java -jar saxon-1.x.jar`.
## Running

Transformations can then be invoked by sending an HTTP POST call to the server at the `/transform` or `/query` endpoint, depending
on whether you want to use XSLT or XQuery.
The default port is `5000`, but this can be configured (see below).
Start the server by running `java -jar saxon-2.x.jar`.

The following command-line options are available:

* `-c`, `--config`: Location to Saxon configuration XML file
* `-h`, `--help`: Display help
* `-i`, `--insecure`: Run with default (insecure) configuration.*
* `-l`, `--license`: Path to license file
* `-o`, `--output <arg>`: Write console output to the specified file
* `-p`, `--port <arg>`: Port on which the server runs
* `-t`, `--timeout <arg>`: The maximum time a transformation is allowed to run in milliseconds.
* `-v`, `--version`: Display Saxon version info

Transformations can then be invoked by sending an HTTP POST call to the server at the `/transform` or `/query` endpoint,
depending on whether you want to use XSLT or XQuery. The default port is `5000`, but this can be configured (see above).
This call must contain a `multipart/form-data` encoded body with two items:

* `xml` : a file or string containing the input XML.
* `xml` : a file or string containing the XML or JSON document to be transformed.
* `xsl`: a file or string containing the XSLT or XQuery input.

The `xml` parameter is not mandatory: in the case of XSLT the default named template `xsl:initial-template` will be invoked, in XQuery the query will be evaluated without a context item.
The `xml` parameter is not mandatory: in the case of XSLT the default named template `xsl:initial-template` will be
invoked, in XQuery the query will be evaluated without a context item.

An example call (using cURL) may look as follows:

`$ curl http://localhost:5000/transform -F [email protected] -F [email protected]`

The response body contains the result of the transformation. The character set of the response is the one specified in
the output parameters of the stylesheet, which defaults to UTF-8. The value of the `Content-Type` header can be
controlled by setting the `media-type` output parameter.

## JSON

Besides XML, the application also accepts JSON as an input format. No additional parameters need to be set, as JSON
is automatically distinguished from XML. Note that JSON input is automatically transformed to XML using
the `json-to-xml` function and set as the global context item.

    $ curl http://localhost:5000/query -F xsl=. -F xml="[1,2,3]" -F output="indent=yes"
    <?xml version="1.0" encoding="UTF-8"?>
    <array xmlns="http://www.w3.org/2005/xpath-functions">
        <number>1</number>
        <number>2</number>
        <number>3</number>
    </array>

## Security

By default, Saxon assumes all input is untrusted. This means the following functionality is disabled:

* External function calls
* Retrieval of system properties and environment variables
* Accessing the file system or network

If you do want to allow this, you can either pass the `--insecure` command line parameter, or supply a custom Saxon
configuration file using the `--config` parameter. Note that the secure default configuration alone is not enough to
protect against attackers. It is recommended to place a proxy server in front of this application to take care of IP
whitelisting, rate limiting, etc.

The amount of time that a transformation is allowed to run is 10 seconds by default. This can be configured with
the `--timeout` parameter, which takes a number in milliseconds. Use `-1` to disable timeouts.
## Compression

The input can be gzip encoded in order to avoid having to send large files over the network. In this case, you must set
the `Content-Type` of the individual parts to `application/gzip`. In all other cases, no encoding is assumed.

These files can be gzip encoded. In this case, you must set the `Content-Type` of the individual parts as `application/gzip`.
In all other cases, no encoding is assumed.
To receive a compressed response, make sure to pass `Accept-Encoding: gzip` in the headers.
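
As a rough client-side illustration, the following Java sketch gzip-compresses a payload with the standard `java.util.zip` classes before it would be placed in a form part declared as `application/gzip`. The class and method names are hypothetical and not part of this project, and building the actual multipart request is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of this project: prepares a gzip-encoded part body.
class GzipPartSketch {

    // Compresses the raw bytes into gzip format.
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed at this point, so the gzip trailer has been written.
        return buffer.toByteArray();
    }

    // Decompresses gzip bytes; only used here to check the round trip.
    static byte[] gunzip(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] xml = "<root>hello</root>".getBytes(StandardCharsets.UTF_8);
        byte[] compressed = gzip(xml);
        // The compressed bytes would be sent as the part body with Content-Type: application/gzip.
        System.out.println(new String(gunzip(compressed), StandardCharsets.UTF_8));
    }
}
```

For very small documents the gzip header can make the payload larger than the original, so compression mostly pays off for large inputs.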

The response will contain the serialized result in its body.
## Parameters

Serialization parameters can be specified in the request as well. This is done by adding a form item named `output`, which contains
a list of key-value pairs separated by semi-colons. All serialization parameters supported by Saxon-HE can be used.
These parameters take precedence over the ones supplied in the stylesheet or query.
It is possible to pass parameters to a transformation by including a `parameters` form item in the request.
Parameters must be written in the form `key=value` and are separated by semicolons. A literal semicolon can be escaped
with a backslash.

In case of an error, the response will contain a JSON object describing the error.
For example:
Consider an example stylesheet as follows:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
        <xsl:output method="text" encoding="UTF-8"/>
        <xsl:param name="myParam"/>
        <xsl:template match="/">
            <xsl:value-of select="$myParam"/>
        </xsl:template>
    </xsl:stylesheet>

The parameter can then be set like this:

    $ curl http://localhost:5000/transform -F [email protected] -F [email protected] -F parameters="myParam=example"
    example
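
To make the `key=value` format with escaped semicolons concrete, here is a minimal Java sketch of how such a string could be split. It is an illustration only: the class and method names are hypothetical, and the server's actual parser may treat edge cases (whitespace, other backslash sequences, malformed pairs) differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the parameter string format; not the project's own parser.
class ParameterStringSketch {

    // Splits on ';' except where it is preceded by a backslash; "\;" becomes a literal ';'.
    static List<String> splitUnescaped(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length() && input.charAt(i + 1) == ';') {
                current.append(';');
                i++; // skip the escaped semicolon
            } else if (c == ';') {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    // Turns "a=1;b=2" into an ordered map; entries without a key or '=' are skipped (an assumption).
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String part : splitUnescaped(input)) {
            int eq = part.indexOf('=');
            if (eq <= 0) {
                continue;
            }
            result.put(part.substring(0, eq).trim(), part.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {myParam=example, other=a;b}
        System.out.println(parse("myParam=example;other=a\\;b"));
    }
}
```

Only the first `=` in each pair separates key from value in this sketch, so values may themselves contain `=`.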

## Serialization

Serialization parameters can be specified in the request as well. This is done by adding a form item named `output`,
which is formed in the same way as normal parameters as described above. All serialization parameters supported by Saxon
can be used. These parameters take precedence over the ones supplied in the stylesheet or query.

For example, to return the result as JSON:

    $ curl http://localhost:5000/query -F xsl="map{'numbers': array{1,2,3}}" -F output="method=json"
    {"numbers":[1,2,3]}

## Saxon-EE

It is possible to include a path to a Saxon-EE license file using the `-l, --license` startup parameter. This allows
using enterprise features that are not found in the open-source edition such as streaming XML input or using extension
functions.

## Error handling

There are different types of errors that can occur:

* Compilation errors (for example, syntactical errors, or an invalid XSLT document)
* Runtime errors (for example, type errors or inaccessible filepaths)
* User-invoked errors using the `xsl:message` element or the `error` function

These errors are all caught and then returned in the response. Where possible, the line number and column inside the
stylesheet where the error was encountered are passed along. The response code in this case is always `400`.

```
{
@@ -32,19 +127,14 @@ For example:
}
```

Following command-line options are available:

* `-c`, `--config`: Location to Saxon configuration XML file
* `-h,--help`: Display help
* `-i, --insecure`: Run with default (insecure) configuration.*
* `-p, --port <arg>`: Port on which the server runs
* `-r, --rate-limit <arg>: none|light|heavy`: Enable rate limiting. `light` allows 120 requests per 60 seconds per IP address. `heavy` allows 60 requests per 60 seconds per IP address. `none` disables rate limiting.
* `-t, --timeout <arg>`: The maximum time a transformation is allowed to run in milliseconds.
* `-v, --version`: Display Saxon version info
Note that `xsl:message` elements without the `terminate=yes` attribute are ignored.

## Performance

\* This enables external function calls, retrieval of system properties and environment variables and connecting to arbitrary URLs. It also allows the usage of doctype declarations.
This option cannot be set in combination with `--config`. You should not use this when the input is untrusted.
Throughput and latency depend on the size of the payload and the complexity of the stylesheet. With small, relatively
simple stylesheets, the application can easily handle hundreds of requests per second.

## Developing

You can build from source by running the Maven command `mvn install`.
To start developing, you must first clone this repository and then run `mvn clean` (which amongst other things installs
the included Saxon library in your local repository).
18 changes: 18 additions & 0 deletions config/saxon-license.lic
@@ -0,0 +1,18 @@
Licensor=Saxonica
Licensee=Willem Van Lishout
Company=
[email protected]
Edition=EE
SAT=yes
SAQ=yes
SAV=yes
Issued=2021-06-25
Series=V
Serial=V009697
User=P0001
Evaluation=yes
Expiration=2021-07-25
UpgradeDays=30
MaintenanceDays=30

Signature=302C0214291E445C174C20EAB7CA8F8184444673B6895E610214295E677865C7474411432F94878BCA9F98A3327C
Binary file added lib/saxon-ee-10.5.jar
Binary file not shown.