triple_store
Commands to get the PM paper 3 triple store and other technology working.
https://github.com/tarql/tarql
blog posts about Tarql: https://www.bobdc.com/blog/tarql/, https://thecaglereport.com/2021/05/18/using-tarql-to-convert-excel-spreadsheets-to-rdf/, https://www.bobdc.com/blog/sparqlcsvjoin/
Requires Java 1.8 or above.
git clone https://github.com/cygri/tarql
brew install maven # on my Mac; on Linux use the distribution's package manager instead
mvn clean install -DskipTests # make sure to be in the tarql/ directory
It is probably worth adding `target/appassembler/bin/` to PATH so that tarql can be used from anywhere.
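One way to do that, as a minimal sketch: the helper below appends a directory to PATH only if it is not already there. The function name `add_to_path` and the `TARQL_HOME` location are assumptions for illustration; point it at wherever tarql was actually cloned.

```shell
# Append a directory to PATH unless it is already present.
# add_to_path and TARQL_HOME are illustrative names, not part of tarql.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
  export PATH
}

# Assumed clone location; adjust to the real path of the tarql checkout.
TARQL_HOME="${TARQL_HOME:-$HOME/software/tarql}"
add_to_path "$TARQL_HOME/target/appassembler/bin"
```

Placed in `.bash_profile`, this lets `tarql` be called directly instead of through `sh bin/tarql`.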
### testing
cd target/appassembler # get to the tarql executable
sh bin/tarql --ntriples ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv
sh bin/tarql ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv
sh bin/tarql ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv > ../../examples/outputs/test1.ttl
sh bin/tarql --ntriples ../../examples/sample-2.sparql ../../examples/TechCrunchcontinentalUSA.csv > ../../examples/outputs/test1.nt
sh bin/tarql ../../examples/sample-arsenal-table_2.sparql ../../examples/arsenal_table_2.csv > ../../examples/outputs/arsenal.ttl
tarql /Users/kai/Desktop/software/tarql/examples/sample-2.sparql /Users/kai/Desktop/software/tarql/examples/TechCrunchcontinentalUSA.csv
in ~/Desktop/scratch/planet_microbe/planet_microbe_functional_annotation_scripts/triples
in test1
tarql --tabs mini_test_go_out.sparql mini_test_go_out.tsv > mini_test_go_out.ttl
# original CSV version with just the GO term and count
in test2
tarql --tabs --dedup 100 mini_test_go_out_sample.sparql mini_test_go_out_sample.tsv > mini_test_go_out_sample.ttl
in test3
tarql -H --tabs --dedup 100 test2.sparql test_headerless.tsv > test3.ttl
in test4
tarql -H --tabs --dedup 100 go.sparql go_input.tsv > test4.ttl
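For the headerless runs (`-H`), Tarql names the columns `?a`, `?b`, and so on by position. The sketch below writes a two-column TSV and a matching mapping into a scratch directory, then runs the same kind of command as test3 if `tarql` is on PATH. The file names, the `ex:` namespace, the `ex:count` property, and the sample rows are all made up for illustration; the actual `go.sparql` mapping is not shown in these notes.

```shell
# Use a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)

# Two tab-separated columns, no header row: GO term, count (sample values).
printf 'GO:0008150\t12\nGO:0003674\t7\n' > "$workdir/example_headerless.tsv"

# With -H there are no column names, so Tarql binds ?a, ?b, ... by position.
cat > "$workdir/example.sparql" <<'EOF'
PREFIX ex:  <http://example.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

CONSTRUCT {
  ?term ex:count ?count .
}
WHERE {
  BIND (IRI(CONCAT('http://purl.obolibrary.org/obo/', REPLACE(?a, ':', '_'))) AS ?term)
  BIND (xsd:integer(?b) AS ?count)
}
EOF

# Run the conversion only when tarql is actually installed.
if command -v tarql >/dev/null 2>&1; then
  tarql -H --tabs --dedup 100 "$workdir/example.sparql" \
    "$workdir/example_headerless.tsv" > "$workdir/example.ttl"
else
  echo "tarql not found on PATH; skipping conversion" >&2
fi
```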
Tarql is actually built using the Jena toolkit (ARQ), which means that it has many of the same capabilities and limitations that the Jena/Fuseki2 RDF server has, and can be extended in the same way that ARQ can (see https://jena.apache.org/documentation/query/library-function.html for details about the ARQ extension library).
downloaded tar.gz from https://jena.apache.org/download/index.cgi
gunzip -c apache-jena-4.2.0.tar.gz | tar xopf -
add to path
# Apache Jena
export JENA_HOME=/Users/kai/scripts/apache-jena-4.2.0/
export PATH=$PATH:$JENA_HOME/bin
Unfortunately this didn't quite work: it sees the scripts, but the Java version is wrong. Probably like this post.
Fixed it by adding the following to my .bash_profile:
export JAVA_8_HOME=$(/usr/libexec/java_home -v1.8)
export JAVA_12_HOME=$(/usr/libexec/java_home -v12)
alias java8='export JAVA_HOME=$JAVA_8_HOME'
alias java12='export JAVA_HOME=$JAVA_12_HOME'
# default to Java 12
java12
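A quick way to confirm which Java the Jena scripts will pick up is to pull the major version out of `java -version`. Jena 4.x needs Java 11 or later, so a Java 8 default would explain the error. The function name `java_major` is an illustrative helper, not something Jena provides.

```shell
# Reduce a Java version string to its major number:
# "1.8.0_292" -> 8 (old 1.x scheme), "12.0.2" -> 12.
java_major() {
  v=$1
  case "$v" in
    1.*) v=${v#1.} ;;            # drop the legacy "1." prefix
  esac
  printf '%s\n' "$v" | sed 's/[^0-9].*//'
}

# Report the active Java if one is installed.
if command -v java >/dev/null 2>&1; then
  active=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "java on PATH: $active (major $(java_major "$active"))"
fi
```

Running `java8` or `java12` first and then this check shows whether the alias actually switched `JAVA_HOME`.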
For RDF queries (I'm pretty sure this is what I used in my master's).
Triple store: Peter said to use TDB2.
https://jena.apache.org/documentation/tdb2/tdb2_admin.html
https://jena.apache.org/documentation/tdb2/tdb2_cmds.html // this is super helpful
https://jena.apache.org/documentation/tdb/faqs.html
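As a sketch of the TDB2 command-line workflow those pages describe: load a Turtle file into a database directory, then run a count query against it. `tdb2.tdbloader` and `tdb2.tdbquery` ship in Jena's `bin/`. The `DB` directory name matches the Fuseki command further down, while `test4.ttl` as the input and the `count.rq` file name are assumptions.

```shell
# Query that counts every triple in the default graph.
cat > count.rq <<'EOF'
SELECT (COUNT(*) AS ?triples)
WHERE { ?s ?p ?o }
EOF

# Only touch the store when the Jena tools and the input file are present.
if command -v tdb2.tdbloader >/dev/null 2>&1 && [ -f test4.ttl ]; then
  tdb2.tdbloader --loc=DB test4.ttl          # load the Turtle file into DB/
  tdb2.tdbquery --loc=DB --query=count.rq    # print the triple count
fi
```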
Expose triples as a SPARQL endpoint accessible over HTTP. Peter said to use the most recent Fuseki as a front end for data management (along with TDB).
fuseki-quick-start will need Apache Tomcat for its webapp service.
https://jena.apache.org/documentation/fuseki2/fuseki-layout.html
This post has some scripts to automate these processes which might be useful.
In theory the following command should work:
fuseki-server --tdb2 --loc=DB --update /test_DB
But it gives an error saying that the DB is TDB2 and that I'm not using the right version of the server, so I should use the --tdb2 flag, which I already am. Maybe a macOS issue?
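Once the server does start, the endpoint can be checked from the command line. A minimal sketch, assuming the standalone `fuseki-server` default port of 3030 and the `/test_DB` dataset name from the command above; `endpoint_url` and `count_triples` are illustrative helper names. Under Tomcat the base URL would differ (port 8080 plus the webapp's context path).

```shell
# Build the SPARQL query URL for a dataset, tolerating stray slashes.
endpoint_url() {
  printf '%s/%s/sparql\n' "${1%/}" "${2#/}"
}

# POST a count query and print the JSON result.
count_triples() {
  curl -s \
    -H 'Accept: application/sparql-results+json' \
    --data-urlencode 'query=SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }' \
    "$(endpoint_url "$1" "$2")"
}

# Example call (requires a running server):
# count_triples http://localhost:3030 /test_DB
```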
To install Tomcat on my computer I followed the directions in this post. It worked: the server responds when I go to http://localhost:8080/
apache-tomcat-8.5.72/bin$ ./startup.sh