Ethan Gruber edited this page Jul 12, 2017 · 1 revision

The Harvester transforms Dublin Core embedded in OAI-PMH responses into RDF/XML conforming to the DPLA Metadata Application Profile 4.0. Because the data in the Harvester already conform to the DPLA profile, an export is essentially a SPARQL DESCRIBE query restricted to ore:Aggregations whose doap:audience is "dpla" (to ensure that Primo-only materials are not sent to DPLA).

The export is available in RDF/XML, TTL, and JSON-LD; the serialization is determined by the output parameter passed to Fuseki.
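As a rough illustration of how the serialization might be selected, the sketch below builds a request URL that carries a DESCRIBE query and an `output` parameter. The endpoint URL, the `build_export_url` helper, the simplified query, and the mapping from format names to `output` values are all assumptions made for illustration; they are not taken from the Harvester source, and the accepted `output` values should be checked against the Fuseki version in use.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real Fuseki URL comes from the Harvester configuration.
SPARQL_ENDPOINT = "http://localhost:3030/harvester/query"

# Assumed mapping from export format to Fuseki's `output` value; verify before use.
OUTPUT_VALUES = {"rdfxml": "xml", "ttl": "ttl", "jsonld": "json-ld"}

# Simplified stand-in for the real query: describe every ore:Aggregation
# whose doap:audience is "dpla".
DESCRIBE_QUERY = """\
PREFIX doap: <http://usefulinc.com/ns/doap#>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
DESCRIBE ?agg WHERE {
  ?agg a ore:Aggregation ;
       doap:audience "dpla" .
}"""


def build_export_url(fmt: str, endpoint: str = SPARQL_ENDPOINT) -> str:
    """Return a GET URL requesting the export in the given serialization."""
    if fmt not in OUTPUT_VALUES:
        raise ValueError(f"unsupported export format: {fmt}")
    params = {"query": DESCRIBE_QUERY, "output": OUTPUT_VALUES[fmt]}
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_export_url("ttl"))
```

Keeping the query fixed and varying only the `output` value mirrors the idea that all three serializations come from the same underlying DESCRIBE result.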

  • The VoID RDF describing the data dumps is available in RDF/XML, TTL, and JSON-LD. Links are on the index page, e.g., http://harvester.orbiscascade.org/void.jsonld. The VoID for a given serialization contains multiple void:dataDump links, each pointing to a dump in the same serialization. There are 5000 CHOs per dump, a value set by dpla_limit in the Harvester config.xml.
  • The following query retrieves all dpla:SourceResources, edm:WebResources, ore:Aggregations, edm:Agents, and skos:Concepts, with %LIMIT% taken from the config value described above: https://gist.github.com/ewg118/2b1a6f25cae80dceff896316740155f1
  • The model for the SPARQL query is in xpl/models/sparql/aggregate.xpl
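
To illustrate how the per-dump limit described above splits an export, the following sketch computes the number of dumps for a given CHO count and fills in a query template for each one. Only `%LIMIT%` and the `dpla_limit` value of 5000 come from this page; the `%OFFSET%` placeholder, the simplified template, and the helper names are assumptions, and the real query lives in the gist and in aggregate.xpl.

```python
import math
from typing import Iterator

DPLA_LIMIT = 5000  # dpla_limit from config.xml, as stated on this page

# Simplified stand-in for the real template; %OFFSET% is an assumed placeholder.
QUERY_TEMPLATE = """\
PREFIX doap: <http://usefulinc.com/ns/doap#>
PREFIX ore: <http://www.openarchives.org/ore/terms/>
DESCRIBE ?agg WHERE {
  ?agg a ore:Aggregation ;
       doap:audience "dpla" .
}
ORDER BY ?agg
LIMIT %LIMIT%
OFFSET %OFFSET%"""


def dump_count(total_chos: int, limit: int = DPLA_LIMIT) -> int:
    """Number of dump files needed to hold total_chos records."""
    return math.ceil(total_chos / limit)


def dump_queries(total_chos: int, limit: int = DPLA_LIMIT) -> Iterator[str]:
    """Yield one query per dump, each covering a window of `limit` records."""
    for page in range(dump_count(total_chos, limit)):
        yield (QUERY_TEMPLATE
               .replace("%LIMIT%", str(limit))
               .replace("%OFFSET%", str(page * limit)))


if __name__ == "__main__":
    # 12,001 CHOs need three dumps, at offsets 0, 5000, and 10000.
    for query in dump_queries(12001):
        print(query.splitlines()[-2:])
```

The `ORDER BY` clause is included because paging with LIMIT and OFFSET is only stable when the result order is fixed.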