Ethan Gruber edited this page Jul 12, 2017 · 4 revisions

The functionality of the harvester depends on an RDF triplestore and SPARQL endpoint to link Cultural Heritage Objects with finding aids. Any SPARQL 1.1 compliant endpoint may be used, but Apache Fuseki has been chosen for ease of use, clarity of documentation, and active user base. Any Fuseki version after 1.1 requires Java 7 installed and set as the default JVM on the server.
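Before installing, it may help to confirm which Java version the default JVM reports. The following is a minimal sketch, not part of the harvester; the helper name java_major is hypothetical, and it assumes the usual version string formats such as "1.7.0_80" (Java 7) or "11.0.2" (Java 11).

```shell
#!/bin/sh
# Hypothetical helper: extract the major Java version from a version string.
# Legacy strings look like "1.7.0_80" (Java 7); newer ones like "11.0.2".
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;                       # 1.7.x means Java 7
        *)   echo "$1" | cut -d. -f1 | cut -d- -f1 | cut -d+ -f1 ;;
    esac
}

# Report the version of the default JVM, if one is on the PATH.
# Note that "java -version" writes to stderr, hence the 2>&1.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    echo "Default JVM reports version $version (major $(java_major "$version"))"
else
    echo "No java executable found on the PATH"
fi
```

A reported major version of 7 or higher should satisfy the requirement stated above.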

There are two triple databases under a single Fuseki deployment. The first stores the cultural heritage objects according to the DPLA MAP. It includes dpla:SourceResources, ore:Aggregations, edm:WebResources, and any associated people, places, or concepts. The service is called 'nwda', a name kept for historical reasons; it can be changed in both the Fuseki config.ttl and the Harvester config.xml. The second is called 'vocabs' (the service name used in config.ttl), and it contains the triples for automated enrichment: relationships between text values in the OAI-PMH records and related URIs. The detailed model is found here.

1. Download and Unzip Fuseki

Fuseki's download links and documentation are available at http://jena.apache.org/documentation/serving_data/. Download and unzip Fuseki to the server (e.g., /usr/local/projects).

2. Set up config.ttl

Fuseki can be configured with a Turtle RDF file that contains the service names and TDB folder locations. Navigate to the Fuseki directory, create or edit config.ttl, and insert the following service configuration (overwriting what is already there, if necessary).

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .

[] rdf:type fuseki:Server ;
   fuseki:services (
     <#nwda>
     <#vocabs>
   ) .

# TDB
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## ---------------------------------------------------------------

#NWDA
<#nwda> rdf:type fuseki:Service ;
    # URI of the dataset -- http://host:port/ds
    fuseki:name                        "nwda" ; 
    fuseki:serviceQuery                "sparql" ;
    fuseki:serviceQuery                "query" ;
    fuseki:serviceUpdate               "update" ;
    fuseki:serviceUpload               "upload" ;
    fuseki:serviceReadWriteGraphStore  "data" ;     
    fuseki:serviceReadGraphStore       "get" ;
    fuseki:dataset                     <#nwda_tdb> ;
    .

#TDB
<#nwda_tdb> rdf:type      tdb:DatasetTDB ;
    tdb:location "nwda" .

<#vocabs> rdf:type fuseki:Service ;
    rdfs:label                         "Orbis Cascade Vocabularies" ;
    fuseki:name                        "vocabs" ;
    fuseki:serviceQuery                "query" ;
    fuseki:serviceQuery                "sparql" ;
    fuseki:serviceUpdate               "update" ;
    fuseki:serviceUpload               "upload" ;
    fuseki:serviceReadWriteGraphStore  "data" ;
    fuseki:dataset                     <#vocabs_tdb> ;
    .

<#vocabs_tdb> rdf:type      tdb:DatasetTDB ;
    tdb:location "vocabs" .
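Once Fuseki is running with this configuration, each service name becomes a path on the server, and each fuseki:service* value becomes an endpoint beneath it. The sketch below shows one way to smoke-test the query endpoints with curl. It assumes Fuseki is listening on its default port 3030 on localhost; the helper names endpoint_url and check_dataset are hypothetical, and the trivial ASK query is only an illustration.

```shell
#!/bin/sh
# Assumption: Fuseki listens on localhost at its default port, 3030.
FUSEKI_BASE="${FUSEKI_BASE:-http://localhost:3030}"

# Build the URL of a named endpoint for a dataset,
# e.g. "nwda" + "query" gives http://localhost:3030/nwda/query
endpoint_url() {
    echo "$FUSEKI_BASE/$1/$2"
}

# Send a trivial ASK query to a dataset's "query" endpoint.
# Requires curl and a running Fuseki server; exits non-zero on HTTP errors.
check_dataset() {
    curl --silent --fail --get \
         --data-urlencode 'query=ASK { }' \
         "$(endpoint_url "$1" query)"
}
```

Running check_dataset nwda and check_dataset vocabs on the server should each return a small result document if the corresponding service was loaded from config.ttl.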

3. Edit fuseki startup script

Edit the 'fuseki' shell script and add the following three lines after the commented header and the first line, 'usage()', around line 75. Update $FUSEKI_HOME as needed.

FUSEKI_HOME=/usr/local/projects/jena-fuseki-1.1.1
FUSEKI_CONF=config.ttl
FUSEKI_ARGS="--localhost --config=$FUSEKI_CONF"

This makes the script load the configuration in config.ttl and accept connections only from localhost, which blocks direct traffic from the outside, including writes.
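Because FUSEKI_CONF is a relative file name, a quick check that config.ttl actually sits in the Fuseki directory (where step 2 places it) can save a failed start. This is a minimal sketch; the function name preflight is hypothetical and not part of the fuseki script.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm that the Fuseki directory exists
# and that the configuration file is present inside it.
# Usage: preflight <fuseki_home> <config_file_name>
preflight() {
    [ -d "$1" ] || { echo "missing Fuseki directory: $1" >&2; return 1; }
    [ -f "$1/$2" ] || { echo "missing configuration file: $1/$2" >&2; return 1; }
    echo "ok: $1/$2"
}
```

For the values shown above, this would be called as preflight /usr/local/projects/jena-fuseki-1.1.1 config.ttl.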

4. Establish fuseki as a startup service

First copy the fuseki shell script to /etc/init.d. Then execute sudo update-rc.d fuseki defaults. Fuseki may now be started with sudo service fuseki start and stopped with sudo service fuseki stop.
