Install qEndpoint

There are multiple ways to install qEndpoint in your project.

Docker Images

You can use one of our preconfigured Docker images.

qacompany/qendpoint

DockerHub: qacompany/qendpoint

This Docker image contains the endpoint; you can upload your dataset and start using it.

Just run the image and it will prepare the environment and set up the repository:

docker run -p 1234:1234 --name qendpoint qacompany/qendpoint

You can also specify the amount of memory allocated by setting the Docker environment variable MEM_SIZE. By default this value is set to 6G. You should not set it below 4G, or you will likely run out of memory with large datasets. Bigger datasets also need a bigger value; for example, Wikidata-all won't run with less than 10GB.

docker run -p 1234:1234 --name qendpoint --env MEM_SIZE=6G qacompany/qendpoint

You can stop the container and restart it at any time; the data inside is kept (qendpoint is the name of the container):

docker stop qendpoint
docker start qendpoint

Note: this container may occupy a large portion of your disk due to the size of the data index, so make sure to delete the container when you don't need it anymore:

docker rm qendpoint
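
If you want to check how much space your containers and images occupy before cleaning up, Docker's built-in reporting commands can help:

docker system df
docker ps -a --size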

Useful tools

You can access http://localhost:1234, where a GUI lets you write and execute SPARQL queries. A RESTful API is also available, which you can use to run queries from any application over HTTP:

curl -H 'Accept: application/sparql-results+json' localhost:1234/api/endpoint/sparql --data-urlencode 'query=select * where{ ?s ?p ?o } limit 10'

Note: the first query will take some time because the index has to be mapped to memory; later queries will be much faster!

Most of the result formats are available; for example, you can use:

  • JSON: application/sparql-results+json
  • XML: application/sparql-results+xml
  • Binary RDF: application/x-binary-rdf-results-table
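
For instance, to get the same results as XML instead of JSON, just change the Accept header:

curl -H 'Accept: application/sparql-results+xml' localhost:1234/api/endpoint/sparql --data-urlencode 'query=select * where{ ?s ?p ?o } limit 10'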

qacompany/qendpoint-wikidata

DockerHub: qacompany/qendpoint-wikidata

This Docker image contains the endpoint together with a script that downloads an index of the Wikidata Truthy statements from our servers, so you only have to wait for the index download before you can start using it.

Just run the image and it will prepare the environment by downloading the index and setting up the repository:

docker run -p 1234:1234 --name qendpoint-wikidata qacompany/qendpoint-wikidata

You can also specify the amount of memory allocated by setting the Docker environment variable MEM_SIZE. By default this value is set to 6G; a bigger value is recommended for bigger datasets. For example, Wikidata-all won't run with less than 10GB.

docker run -p 1234:1234 --name qendpoint-wikidata --env MEM_SIZE=6G qacompany/qendpoint-wikidata

You can specify the dataset to download using the environment variable HDT_BASE. By default the value is wikidata_truthy; the currently available values are:

  • wikidata_truthy - Wikidata truthy statements (needs at least 6G of memory)
  • wikidata_all - all Wikidata statements (needs at least 10G of memory)

docker run -p 1234:1234 --name qendpoint-wikidata --env MEM_SIZE=10G --env HDT_BASE=wikidata_all qacompany/qendpoint-wikidata

You can stop the container and restart it at any time; the data inside is kept (qendpoint-wikidata is the name of the container):

docker stop qendpoint-wikidata
docker start qendpoint-wikidata

Note: this container may occupy a large portion of your disk due to the size of the data index, so make sure to delete the container when you don't need it anymore:

docker rm qendpoint-wikidata

Useful tools

You can access http://localhost:1234, where a GUI lets you write and execute SPARQL queries. A RESTful API is also available, which you can use to run queries from any application over HTTP:

curl -H 'Accept: application/sparql-results+json' localhost:1234/api/endpoint/sparql --data-urlencode 'query=select * where{ ?s ?p ?o } limit 10'

Note: the first query will take some time because the index has to be mapped to memory; later queries will be much faster!

Most of the result formats are available; for example, you can use:

  • JSON: application/sparql-results+json
  • XML: application/sparql-results+xml
  • Binary RDF: application/x-binary-rdf-results-table
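
Since this image serves Wikidata, you can also use Wikidata identifiers directly in your queries. For example, to fetch a few entities that are an instance of (P31) human (Q5):

curl -H 'Accept: application/sparql-results+json' localhost:1234/api/endpoint/sparql --data-urlencode 'query=select * where{ ?s <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q5> } limit 5'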

Standalone

You can run the endpoint with this command:

java -jar endpoint.jar &

You can find a template of the application.properties file in the backend source.
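
As an illustration only (the template in the backend source is authoritative), a minimal application.properties could override the HTTP port, assuming the backend follows standard Spring Boot conventions; any qEndpoint-specific keys should be taken from the template:

# hypothetical minimal application.properties, assuming Spring Boot conventions
server.port=1234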

If you already have the HDT file of your graph, you can put it in the hdt-store directory before starting the endpoint (by default hdt-store/index_dev.hdt).
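
For example, assuming you start the endpoint from the current directory and your file is named mygraph.hdt (a placeholder name), the setup could look like this:

mkdir -p hdt-store
cp mygraph.hdt hdt-store/index_dev.hdt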

If you don't have an HDT file, you can upload your dataset to the endpoint by running this command while the endpoint is running:

curl "http://127.0.0.1:1234/api/endpoint/load" -F "[email protected]"

where mydataset.nt is the RDF file to load; you can use any of the formats supported by RDF4J.

As a dependency

You can create a SPARQL repository using this method; don't forget to init the repository:

// Create a SPARQL repository
SparqlRepository repository = CompiledSail.compiler().compileToSparqlRepository();
// Init the repository
repository.init();

You can execute SPARQL queries using the executeTupleQuery, executeBooleanQuery, executeGraphQuery or execute methods; an ASK example is sketched after the tuple query below.

// execute a tuple query
try (ClosableResult<TupleQueryResult> execute = repository.executeTupleQuery(
        // the sparql query
        "SELECT * WHERE { ?s ?p ?o }",
        // the timeout
        10
)) {
    // get the result, no need to close it, closing execute will close the result
    TupleQueryResult result = execute.getResult();

    // the tuples
    for (BindingSet set : result) {
        System.out.println("Subject:   " + set.getValue("s"));
        System.out.println("Predicate: " + set.getValue("p"));
        System.out.println("Object:    " + set.getValue("o"));
    }
}
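
For ASK queries, here is a minimal sketch, assuming executeBooleanQuery takes the same (query, timeout) arguments as executeTupleQuery and returns the boolean result directly:

// a sketch assuming executeBooleanQuery(query, timeout) returns the boolean directly
boolean exists = repository.executeBooleanQuery(
        // an ASK query: is there at least one triple?
        "ASK { ?s ?p ?o }",
        // the timeout
        10
);
System.out.println("Store contains data: " + exists);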

Don't forget to shut down the repository after usage:

// Shutdown the repository (better to release resources)
repository.shutDown();

You can get the RDF4J repository with the getRepository() method.

// get the rdf4j repository (if required)
SailRepository rdf4jRepo = repository.getRepository();
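
From there you can use the plain RDF4J API, for example opening a connection to evaluate a query yourself (RepositoryConnection, TupleQueryResult and BindingSet come from RDF4J):

// query through a plain RDF4J connection
try (RepositoryConnection connection = rdf4jRepo.getConnection()) {
    try (TupleQueryResult result = connection.prepareTupleQuery("SELECT * WHERE { ?s ?p ?o } LIMIT 10").evaluate()) {
        for (BindingSet set : result) {
            System.out.println(set.getValue("s"));
        }
    }
}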