Trying to convert a simple xml file to geodcat #31

Open
ArthurGenet opened this issue May 27, 2021 · 8 comments
Labels
enhancement New feature or request help wanted Extra attention is needed implementation-challenges Discussion of XSLT implementation challenges

Comments

@ArthurGenet

ArthurGenet commented May 27, 2021

Hello,

I am trying to convert an ISO 19115 XML file to a GeoDCAT file with the Python code.
The result I get is:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:adms="http://www.w3.org/ns/adms#" xmlns:cnt="http://www.w3.org/2011/content#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcat="http://www.w3.org/ns/dcat#" xmlns:dct="http://purl.org/dc/terms/" xmlns:dctype="http://purl.org/dc/dcmitype/" xmlns:dqv="http://www.w3.org/ns/dqv#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:geodcatap="http://data.europa.eu/930/" xmlns:gsp="http://www.opengis.net/ont/geosparql#" xmlns:locn="http://www.w3.org/ns/locn#" xmlns:owl="http://www.w3.org/2002/07/owl#" 
xmlns:org="http://www.w3.org/ns/org#" xmlns:prov="http://www.w3.org/ns/prov#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:schema="http://schema.org/" xmlns:sdmx-attribute="http://purl.org/linked-data/sdmx/2009/attribute#" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:vcard="http://www.w3.org/2006/vcard/ns#"/>

Does that mean that there are problems in my XML? Do you have a working XML file that I could download?

I've also tried to run the XSL with Apache NiFi (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache.nifi.processors.standard.TransformXml/) but there was an error:

javax.xml.transform.TransformerConfigurationException: net.sf.saxon.s9api.SaxonApiException: Stylesheet compilation failed: 1 error reported
	at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3075)
	at org.apache.nifi.processors.standard.TransformXml.onTrigger(TransformXml.java:325)
	at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
	at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1173)
	at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
	at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
	at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: java.util.concurrent.ExecutionException: javax.xml.transform.TransformerConfigurationException: net.sf.saxon.s9api.SaxonApiException: Stylesheet compilation failed: 1 error reported
	at org.apache.nifi.processors.standard.TransformXml$2.process(TransformXml.java:352)
	at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:3054)
	... 12 common frames omitted
Caused by: java.util.concurrent.ExecutionException: javax.xml.transform.TransformerConfigurationException: net.sf.saxon.s9api.SaxonApiException: Stylesheet compilation failed: 1 error reported
	at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:513)
	at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:90)
	at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:237)
	at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2313)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2279)
	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
	at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3976)
	at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4960)
	at org.apache.nifi.processors.standard.TransformXml$2.process(TransformXml.java:331)
	... 13 common frames omitted
Caused by: javax.xml.transform.TransformerConfigurationException: net.sf.saxon.s9api.SaxonApiException: Stylesheet compilation failed: 1 error reported
	at net.sf.saxon.jaxp.SaxonTransformerFactory.newTemplates(SaxonTransformerFactory.java:155)
	at org.apache.nifi.processors.standard.TransformXml.newTemplates(TransformXml.java:272)
	at org.apache.nifi.processors.standard.TransformXml.access$000(TransformXml.java:94)
	at org.apache.nifi.processors.standard.TransformXml$1.load(TransformXml.java:300)
	at org.apache.nifi.processors.standard.TransformXml$1.load(TransformXml.java:297)
	at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
	... 19 common frames omitted
Caused by: net.sf.saxon.s9api.SaxonApiException: Stylesheet compilation failed: 1 error reported
	at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:546)
	at net.sf.saxon.jaxp.SaxonTransformerFactory.newTemplates(SaxonTransformerFactory.java:152)
	... 25 common frames omitted
Caused by: net.sf.saxon.trans.XPathException: Stylesheet compilation failed: 1 error reported
	at net.sf.saxon.style.Compilation.compileSingletonPackage(Compilation.java:97)
	at net.sf.saxon.s9api.XsltCompiler.compile(XsltCompiler.java:543)
@andrea-perego
Collaborator

andrea-perego commented May 27, 2021

@ArthurGenet said:

Hello,

I am trying to convert an ISO 19115 XML file to a GeoDCAT file with the Python code.
The result I get is:

[an empty graph]

Does that mean that there are problems in my XML? Do you have a working XML file that I could download?

I think the issue is that the GeoDCAT-AP XSLT was designed to process ISO 19115:2003 / ISO 19139:2007 records, whereas, if I'm not mistaken, your file implements the latest versions of these standards.

You can find some sample records and test the XSLT on them in the demo of the GeoDCAT-AP API:

http://geodcat-ap.semic.eu/api/
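
A quick way to tell whether a record is of the kind the XSLT was designed for is to look at its root element before running the transformation. The sketch below is only an illustration: it assumes that the stylesheet expects a `gmd:MD_Metadata` root in the ISO 19139:2007 namespace, and the function names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the "gmd" prefix in ISO 19139:2007 records.
GMD_NS = "http://www.isotc211.org/2005/gmd"


def describe_root(xml_text):
    """Return (namespace, local_name) of the document's root element."""
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        # ElementTree writes qualified names as "{namespace}local".
        namespace, local_name = root.tag[1:].split("}", 1)
    else:
        namespace, local_name = "", root.tag
    return namespace, local_name


def looks_like_iso19139_2007(xml_text):
    """True if the root is gmd:MD_Metadata (assumed to be what the XSLT matches)."""
    return describe_root(xml_text) == (GMD_NS, "MD_Metadata")
```

If `looks_like_iso19139_2007` returns `False`, printing the result of `describe_root` shows which namespace and element the record actually uses, which may explain an empty `rdf:RDF` result.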

I've also tried to run the XSL with Apache NiFi (https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache.nifi.processors.standard.TransformXml/) but there was an error:

[snip]

Reading the log, I'm not sure what the problem is. Some issues reported with Saxon were related to the fact that the GeoDCAT-AP XSLT uses XSLT 1.0 instead of XSLT 2.0, but I don't know if this is the case here.
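
To rule the XSLT version in or out, one can read the `version` attribute declared on the stylesheet's root element. This is a minimal sketch with an invented function name; it only reports what the stylesheet declares and does not say how a given processor will handle it.

```python
import xml.etree.ElementTree as ET

# Standard XSLT namespace, shared by XSLT 1.0, 2.0 and 3.0.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def stylesheet_version(xslt_text):
    """Return the version attribute of an xsl:stylesheet or xsl:transform root."""
    root = ET.fromstring(xslt_text)
    allowed = (f"{{{XSL_NS}}}stylesheet", f"{{{XSL_NS}}}transform")
    if root.tag not in allowed:
        raise ValueError("root element is not xsl:stylesheet or xsl:transform")
    return root.get("version")
```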

@ArthurGenet
Author

@andrea-perego thanks a lot for your answer.
Yes, I think it is in ISO-19139-3, so that's probably why it doesn't work. I haven't really seen any solution though, so I guess I will have to do the mapping myself. Or maybe I could convert my file to ISO-19139:2007, but I don't know where an XSL for that is available.

@andrea-perego
Collaborator

A possible first step is to create a copy of the GeoDCAT-AP XSLT and update the namespace URIs as per ISO 19139-3. This may help to see whether the XSLT returns anything and, if so, what is left out.
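
The copy-and-update step could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are invented here, the replacement is purely textual, and the URI mapping must be supplied by the reader from the namespace declarations of their own record. A one-to-one swap may not be enough if the newer schema spreads the content over different namespaces.

```python
from pathlib import Path


def rewrite_namespace_uris(stylesheet_text, uri_map):
    """Replace quoted namespace URIs in the stylesheet text according to uri_map."""
    for old_uri, new_uri in uri_map.items():
        # Match only complete quoted values, so a URI that is a prefix of
        # another URI is not rewritten by accident.
        for quote in ('"', "'"):
            stylesheet_text = stylesheet_text.replace(
                f"{quote}{old_uri}{quote}", f"{quote}{new_uri}{quote}"
            )
    return stylesheet_text


def copy_with_new_namespaces(source_path, target_path, uri_map):
    """Write a modified copy of the XSLT; the original file is left untouched."""
    text = Path(source_path).read_text(encoding="utf-8")
    updated = rewrite_namespace_uris(text, uri_map)
    Path(target_path).write_text(updated, encoding="utf-8")


# Hypothetical usage: the target URI is a placeholder, not a real ISO namespace.
# copy_with_new_namespaces(
#     "original.xsl",
#     "copy.xsl",
#     {"http://www.isotc211.org/2005/gmd": "http://example.org/new-namespace"},
# )
```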

For the record, support for the new version of ISO 19115 was considered as possible future work in GeoDCAT-AP. In view of this, @AntoRot contributed a comparison between the old and new versions of ISO 19115, which is included in an appendix to the GeoDCAT-AP specification:

https://semiceu.github.io/GeoDCAT-AP/releases/2.0.0/#comparison-between-inspire-and-iso19115-12014

I wonder whether there is now interest in moving this work forward.

@AntoRot , @pvgenuchten , @uvoges , @sgrellet , WDYT?

@sgrellet

Thanks @andrea-perego

"I wonder whether there is now interest in moving this work forward."

I see more and more national and community-wide projects that bring people from both "communities" around the same table.
Such an exercise is thus crucial, in my view, to build the necessary bridge between the two.

@ArthurGenet
Author

ArthurGenet commented May 31, 2021

Hello @andrea-perego,
Thanks a lot for your help.
I am creating my metadata with the Apache SIS library. I found that the library creates ISO 19139-3 2016 by default but provides an option to create metadata in ISO 19139-2 2007 (https://sis.apache.org/apidocs/org/apache/sis/xml/XML.html#METADATA_VERSION).
So I think I am now downloading the right XML, and it works if I replace "gmi:MI_Metadata" with "gmd:MD_Metadata".
Some metadata are lost, but I can maybe modify the XSLT to make it work for me.
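
The root element replacement described above can be done programmatically rather than by hand. The following is a minimal sketch, not a complete conversion: the function name is invented, it renames only the root element, and any child elements specific to `gmi:MI_Metadata` are left as they are, so they may still be ignored by the XSLT.

```python
import xml.etree.ElementTree as ET

GMD_NS = "http://www.isotc211.org/2005/gmd"
GCO_NS = "http://www.isotc211.org/2005/gco"

# Keep the conventional prefixes when the tree is serialised again;
# namespaces that are not registered get generated prefixes (ns0, ns1, ...).
ET.register_namespace("gmd", GMD_NS)
ET.register_namespace("gco", GCO_NS)


def rename_root_to_md_metadata(xml_text):
    """Rename a root element named MI_Metadata to gmd:MD_Metadata."""
    root = ET.fromstring(xml_text)
    # Compare the local name only, so the check does not depend on which
    # namespace URI the record binds to the "gmi" prefix.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name == "MI_Metadata":
        root.tag = f"{{{GMD_NS}}}MD_Metadata"
    return ET.tostring(root, encoding="unicode")
```

Records whose root is already `gmd:MD_Metadata`, or any other element, pass through with the root unchanged.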

@AntoRot

AntoRot commented Jun 3, 2021

Dear @andrea-perego,

I think this would be useful work in view of the widespread use of the latest version of ISO 19115 and of future versions of the INSPIRE TGs, which will have to take into account the new ISO 19115 family of standards, currently not used.

In any case, I am available to contribute.

@pvgenuchten

GeoNetwork has a basic ISO 19115-3 to DCAT conversion, which could be a starting point:

https://github.com/geonetwork/core-geonetwork/blob/main/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/present/metadata-rdf.xsl

@culkeri

culkeri commented Jun 9, 2021

GeoNetwork has a basic ISO 19115-3 to DCAT conversion, which could be a starting point:

https://github.com/geonetwork/core-geonetwork/blob/main/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/present/metadata-rdf.xsl

Do you know which version of DCAT this is? GeoDCAT-AP version 2.0?

@andrea-perego andrea-perego added help wanted Extra attention is needed enhancement New feature or request labels Aug 25, 2021
@jakubklimek jakubklimek added the implementation-challenges Discussion of XSLT implementation challenges label Oct 8, 2024
7 participants