
triple_store

Kai Blumberg edited this page Jan 6, 2022 · 25 revisions

Commands to get the PM paper 3 triple store and other technology working.

Tarql

https://tarql.github.io/

https://github.com/tarql/tarql

blog posts about Tarql: https://www.bobdc.com/blog/tarql/, https://thecaglereport.com/2021/05/18/using-tarql-to-convert-excel-spreadsheets-to-rdf/, https://www.bobdc.com/blog/sparqlcsvjoin/

Requires Java 1.8 or above.

git clone https://github.com/cygri/tarql

brew install maven //On my Mac; the install command is different on Linux

mvn clean install -DskipTests //Make sure to be in the tarql/ directory


// It would probably be good to add `target/appassembler/bin/` to PATH so tarql can be run from anywhere
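A minimal sketch of that PATH change, for a `.bash_profile` or similar. The clone location in `TARQL_HOME` is an assumption (it matches the absolute path used further down this page); adjust it to wherever tarql was actually cloned and built.

```shell
# Assumed clone location; override TARQL_HOME if tarql lives elsewhere.
TARQL_HOME="${TARQL_HOME:-$HOME/Desktop/software/tarql}"
TARQL_BIN="$TARQL_HOME/target/appassembler/bin"

# Append the directory only if it is not already on PATH.
case ":$PATH:" in
  *":$TARQL_BIN:"*) ;;
  *) export PATH="$PATH:$TARQL_BIN" ;;
esac
```

After opening a new shell, `command -v tarql` should print the path to the launcher script if the build succeeded.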

### testing

cd target/appassembler //get to tarql executable


sh bin/tarql --ntriples ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv
sh bin/tarql ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv
sh bin/tarql ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv > ../../examples/outputs/test1.ttl
sh bin/tarql --ntriples ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv > ../../examples/outputs/test1.rdf
sh bin/tarql ../../examples/sample-arsenal-table_2.sparql ../../examples/arsenal_table_2.csv > ../../examples/outputs/arsenal.ttl

tarql /Users/kai/Desktop/software/tarql/examples/sample-2.sparql /Users/kai/Desktop/software/tarql/examples/TechCrunchcontinentalUSA.csv

in ~/Desktop/scratch/planet_microbe/planet_microbe_functional_annotation_scripts/triples

in test1

Run tarql --tabs mini_test_go_out.sparql mini_test_go_out.tsv > mini_test_go_out.ttl //original csv version with just go term and count

in test2

tarql --tabs --dedup 100 mini_test_go_out_sample.sparql mini_test_go_out_sample.tsv > mini_test_go_out_sample.ttl

in test3

tarql -H --tabs --dedup 100 test2.sparql test_headerless.tsv > test3.ttl

in test4

tarql -H --tabs --dedup 100 go.sparql go_input.tsv > test4.ttl
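For reference, a rough sketch of what a headerless TSV plus mapping could look like for the `-H --tabs` runs above. The file names, the `ex:count` property, and the two-column layout (GO term, count) are assumptions for illustration, not the actual contents of `go.sparql` or `go_input.tsv`. With `-H`, Tarql names the columns `?a`, `?b`, and so on in file order.

```shell
# Hypothetical scratch directory and file names, for illustration only.
workdir="$(mktemp -d)"

# Two-column, headerless TSV: GO term, then a count.
printf 'GO:0008150\t12\nGO:0003674\t7\n' > "$workdir/sketch.tsv"

# Mapping: ?a is the first column, ?b the second (no header row).
cat > "$workdir/sketch.sparql" <<'EOF'
PREFIX ex:  <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

CONSTRUCT {
  ?term ex:count ?n .
}
WHERE {
  BIND (IRI(CONCAT("http://purl.obolibrary.org/obo/", REPLACE(?a, ":", "_"))) AS ?term)
  BIND (xsd:integer(?b) AS ?n)
}
EOF

# Run the conversion only if tarql is on PATH.
if command -v tarql >/dev/null 2>&1; then
  tarql -H --tabs --dedup 100 "$workdir/sketch.sparql" "$workdir/sketch.tsv" > "$workdir/sketch.ttl"
fi
```

The `REPLACE` turns `GO:0008150` into `GO_0008150` so the resulting IRI follows the usual OBO PURL pattern.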

Tarql is actually built using the Jena toolkit (ARQ), which means that it has many of the same capabilities and limitations that the Jena/Fuseki2 RDF server has, and can be extended in the same way that ARQ can (see https://jena.apache.org/documentation/query/library-function.html for details about the ARQ extension library).

installing apache jena

downloaded tar.gz from https://jena.apache.org/download/index.cgi

gunzip -c apache-jena-4.2.0.tar.gz | tar xopf -

add to path

# Apache Jena
export JENA_HOME=/Users/kai/scripts/apache-jena-4.2.0/
export PATH=$PATH:$JENA_HOME/bin

Unfortunately this didn't quite work: the shell sees the Jena scripts, but the Java version is wrong. Probably the same problem as in this post.

Fixed it by adding the following to my .bash_profile:

export JAVA_8_HOME=$(/usr/libexec/java_home -v1.8)
export JAVA_12_HOME=$(/usr/libexec/java_home -v12)

alias java8='export JAVA_HOME=$JAVA_8_HOME'
alias java12='export JAVA_HOME=$JAVA_12_HOME'

# default to Java 12
java12
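A quick way to check which Java the shell actually picks up before running the Jena scripts. This is only a sketch: the `java_major` helper is made up for these notes, and the threshold of 11 rests on my understanding that Jena 4.x needs Java 11 or newer.

```shell
# Print the major version for strings like "1.8.0_292" (-> 8) or "12.0.2" (-> 12).
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}"; echo "${v%%.*}" ;;   # legacy 1.x numbering
    *)   echo "${v%%[.+-]*}" ;;            # modern numbering
  esac
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr; pull out the quoted version string.
  raw="$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}' || true)"
  major="$(java_major "$raw")"
  if [ "${major:-0}" -lt 11 ]; then
    echo "Java '$raw' found; Jena 4.x probably needs Java 11 or newer" >&2
  fi
fi
```

If the warning appears, running the `java12` alias and re-checking should clear it.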

For RDF queries (I'm pretty sure this is what I used in my masters).

Triple store: Peter said to use TDB2.

https://jena.apache.org/documentation/tdb2/tdb2_admin.html

https://jena.apache.org/documentation/tdb2/tdb2_cmds.html // this is super helpful

https://jena.apache.org/documentation/tdb/faqs.html
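A sketch of loading Tarql output into a TDB2 database and checking the triple count with the command-line tools from the pages above. The names are assumptions: `DB` as the database directory and `test4.ttl` as the input file. The block skips the load if the Jena scripts are not on PATH or the file is missing.

```shell
# Assumed names: DB is the TDB2 directory, test4.ttl is Tarql output.
TDB2_LOC="${TDB2_LOC:-DB}"
TTL_FILE="${TTL_FILE:-test4.ttl}"
COUNT_QUERY='SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }'

if command -v tdb2.tdbloader >/dev/null 2>&1 && [ -f "$TTL_FILE" ]; then
  # Bulk-load the Turtle file, then count the triples as a sanity check.
  tdb2.tdbloader --loc="$TDB2_LOC" "$TTL_FILE"
  tdb2.tdbquery --loc="$TDB2_LOC" "$COUNT_QUERY"
else
  echo "tdb2.tdbloader not on PATH or $TTL_FILE missing; skipping load" >&2
fi
```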

Expose triples as a SPARQL endpoint accessible over HTTP. Peter said to use the most recent Fuseki as a front end for data management (along with TDB).

fuseki-quick-start will need Apache Tomcat for its webapp service.

https://jena.apache.org/documentation/fuseki2/fuseki-layout.html

This post has some scripts to automate these processes which might be useful.

./fuseki-server
http://localhost:3030/

This works to get the localhost version running.

In theory, per the fuseki-webapp docs, the following command should work: fuseki-server --tdb2 --loc=DB --update /test_DB. Instead it gives an error saying the database is TDB2 and that I'm not using the right version of the server, so I should use the --tdb2 flag, which I already am. Maybe a macOS issue? I also tried the following, with the same result.

From `~/scripts/apache-jena-fuseki-4.2.0`
java -jar fuseki-server.jar --tdb2 --loc=/Users/kai/Desktop/scratch/planet_microbe/planet_microbe_functional_annotation_scripts/triples/tbd2/DB /test_DB

Other posts: fuseki-it-really-is-that-easy (with some Java code for queries), JENA-1930, https://stackoverflow.com/questions/63874908/fuseki-configuration, https://github.com/apache/jena/blob/main/jena-fuseki2/jena-fuseki-webapp/src/main/java/org/apache/jena/fuseki/cmd/FusekiCmd.java, http://loopasam.github.io/jena-doc/documentation/serving_data/

More doc pages: fuseki-configuration

https://github.com/apache/jena/tree/main/jena-fuseki2/examples

Starting from the example config-text-tdb2.ttl, with the location of the TDB2 database modified, I can run fuseki-server --config=config2.ttl and querying the database works. See also https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html and https://jena.apache.org/documentation/fuseki2/fuseki-main.html
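A stripped-down sketch of what such a config could look like, written out from the shell. This is not the `config2.ttl` used above: it leaves out the text index from `config-text-tdb2.ttl`, and the file name, service name `test_DB`, and location `DB` are placeholders. The vocabulary follows my reading of the Fuseki configuration docs, so treat it as a starting point to check against the linked examples.

```shell
# Hypothetical file name; tdb2:location should point at the real TDB2 directory.
CONFIG_FILE="${CONFIG_FILE:-config-sketch.ttl}"

cat > "$CONFIG_FILE" <<'EOF'
PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX tdb2:   <http://jena.apache.org/2016/tdb#>

# One read-only service exposing a SPARQL query endpoint.
<#service> rdf:type fuseki:Service ;
    fuseki:name "test_DB" ;
    fuseki:endpoint [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;
    fuseki:dataset <#dataset> .

# The TDB2 database backing the service.
<#dataset> rdf:type tdb2:DatasetTDB2 ;
    tdb2:location "DB" .
EOF

# To start the server with it (not run here, since it blocks the shell):
#   fuseki-server --config="$CONFIG_FILE"
```

With these placeholder names, the query endpoint would be expected at http://localhost:3030/test_DB/sparql once the server is up.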

apache sparql tutorials

downloads

To install on my computer I followed the directions in this post. It worked: the page loads at http://localhost:8080/ after running:

apache-tomcat-8.5.72/bin$ ./startup.sh
