QLever (pronounced "clever") is an efficient SPARQL engine supporting large datasets including the full Wikidata (7 billion triples). Even on very large datasets QLever uses only about 40 GB RAM, builds indices in less than 12 hours and executes most queries in less than a second.
On top of the standard SPARQL functionality, QLever also supports SPARQL+Text search and SPARQL autocompletion; these are described on the advanced features page.
A demo of QLever on a variety of large datasets, including Wikidata, can be found here.
The basic design behind QLever was described in this CIKM'17 paper. If you use QLever in your work, please cite that paper.
If you want to skip the details and just get a running QLever instance to play around with, follow the quickstart guide.
Alternatively, to get started with a real (and really big) dataset, we have prepared a Wikidata Quickstart Guide. This guide takes you through the entire process of loading the full Wikidata Knowledge Base into QLever, but don't worry, it is pretty simple.
The rest of this page is organized into the following sections, which take you through the steps necessary to get a QLever instance up and running, starting from a simple Turtle dump of a Knowledge Base.
Further documentation is available on the following topics:
- Supported Knowledge Bases and Example Queries
- SPARQL + Text
- Advanced Features
- Native Setup
- Troubleshooting
- Quickstart Guide
- Wikidata Quickstart Guide
We recommend using QLever with docker. If you absolutely want to run QLever directly on your host, see here.
The installation requires a 64-bit system, docker version 18.05 or newer, and git.
git clone --recursive https://github.com/ad-freiburg/QLever.git qlever
cd qlever
docker build -t qlever .
This creates a docker image named "qlever" which contains everything needed to use QLever. If you want to be sure that everything is working as it should before proceeding, you can run the end-to-end tests.
First make sure that you have your input data ready and accessible on your machine. If you have no input data yet, obtain it from one of our recommended sources or create your own knowledge base in standard NTriple or Turtle format and (optionally) add a text corpus.
Note that QLever only accepts UTF-8 encoded input files. Then again, you should be using UTF-8 anyway.
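Since only UTF-8 input is accepted, it can save time to check the encoding before building an index. The following is a minimal sketch, assuming the iconv utility is available on your host; the helper name is_utf8 is hypothetical and not part of QLever.

```shell
# Hypothetical helper (not part of QLever): succeeds only if the file
# given as $1 decodes cleanly as UTF-8. Assumes iconv is installed.
is_utf8() {
  # Converting UTF-8 to UTF-8 fails on the first invalid byte sequence.
  iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage:
#   is_utf8 /path/to/knowledge_base.ttl && echo "looks like valid UTF-8"
```

For very large dumps this reads the whole file once, so it may take a while.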
By default, and when running docker without user namespaces, the container will use the user ID 1000, which on Linux is almost always the first real user. If the default user does not work, add -u "$(id -u):$(id -g)" to docker run so that QLever executes as the current user.
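As a rough sketch of the rule above, the helper below prints the -u flag only when your host user ID differs from 1000. The function name docker_user_flag is hypothetical, and a mismatch in user ID is only one reason the default user might not work.

```shell
# Hypothetical helper (not part of QLever): print the -u flag for
# docker run when the current host user is not UID 1000.
docker_user_flag() {
  if [ "$(id -u)" -ne 1000 ]; then
    printf '%s %s:%s\n' -u "$(id -u)" "$(id -g)"
  fi
}

# Usage (left unquoted on purpose so the flag splits into two words):
#   docker run $(docker_user_flag) -it --rm ... qlever
```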
When running docker with user namespaces, you may need to make the index folder accessible to the user the QLever process is mapped to on the host (e.g. nobody, see /etc/subuid):
chmod -R o+rw ./index
Then proceed with creating an index.
Important: Ensure that you have enough disk space where your ./index folder resides, or see below for using a separate path.
To build a new index, run a bash shell inside the QLever container as follows:
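To see how much space is actually free, a small sketch using the standard df utility is shown below. The helper name free_kib is hypothetical; how much space you need depends on your dataset and is not something this snippet can tell you.

```shell
# Hypothetical helper (not part of QLever): print the free space, in KiB,
# on the filesystem that holds the directory given as $1 (default: .).
free_kib() {
  # -P gives one line per filesystem, -k reports 1024-byte blocks;
  # column 4 of the data line is the available space.
  df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# Usage:
#   free_kib ./index
```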
docker run -it --rm \
-v "<absolute_path_to_input>:/input" \
-v "$(pwd)/index:/index" --entrypoint "bash" qlever
If you want to use a separate path, you MUST change the $(pwd)/index part in all docker … commands and replace it with the absolute path to your index.
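One way to avoid editing every command by hand is to keep the index location in a variable and build the volume argument from it. This is only a sketch: the variable INDEX_DIR, its default value, and the helper index_mount are all hypothetical names introduced for illustration.

```shell
# Hypothetical location; replace with the absolute path to your index.
INDEX_DIR="${INDEX_DIR:-/data/qlever/index}"

# Hypothetical helper: print the value for docker's -v option that mounts
# the directory $1 at /index, refusing paths that are not absolute.
index_mount() {
  case "$1" in
    /*) printf '%s:/index\n' "$1" ;;
    *)  echo "index path must be absolute: $1" >&2; return 1 ;;
  esac
}

# Usage:
#   docker run -it --rm \
#     -v "<absolute_path_to_input>:/input" \
#     -v "$(index_mount "$INDEX_DIR")" --entrypoint "bash" qlever
```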
From now on we are inside the container. Make sure you follow all the coming instructions for creating an index, and only then proceed to the next section.
If your input knowledge base is in the standard NTriple or Turtle format, create the index with the following command:
IndexBuilderMain -l -i /index/<prefix> -f /input/knowledge_base.ttl
Here, <prefix> is the base name for all index files and -l externalizes long literals to disk. If you use index as the prefix, you can later skip the -e INDEX_PREFIX=<prefix> flag.
To include a text collection, the wordsfile and docsfile (see here for the required format) are provided with the -w and -d flags, respectively.
Then the full command will look like this:
IndexBuilderMain -l -i /index/<prefix> -f /input/knowledge_base.ttl \
-w /input/wordsfile.tsv -d /input/docsfile.tsv
You can also add a text index to an existing knowledge base index by adding the -A flag and omitting the -f flag.
To run a QLever server container, use the following command:
docker run -it -p 7001:7001 \
-v "$(pwd)/index:/index" \
-e INDEX_PREFIX=<prefix> \
--name qlever \
qlever
Additional arguments can be added at the end of the command. If you want the container to run in the background and restart automatically, replace -it with -d --restart=unless-stopped.
The quickest way to run queries is to use the minimal web interface, available at the port specified above (7001 in the example). For a more advanced web interface you can use the QLever UI.
Queries can also be executed from the command line using curl:
curl 'http://localhost:7001/?query=SELECT ?x WHERE {?x <rel> ?y}'
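The query in the command above contains spaces and braces. Depending on the curl version, these may not be sent unchanged, because curl does not percent-encode the URL for you and treats braces as part of its URL globbing syntax. A hedged alternative is to let curl do the encoding with its -G and --data-urlencode options. The wrapper name qlever_query is hypothetical, and the default URL assumes the port from the example above.

```shell
# Hypothetical wrapper (not part of QLever): send a SPARQL query to a
# running server, letting curl percent-encode the query string.
#   $1: the SPARQL query
#   $2: base URL of the server (default: port 7001 on localhost)
qlever_query() {
  # -G sends the encoded data as a GET query string instead of a POST body.
  curl -sS -G "${2:-http://localhost:7001/}" --data-urlencode "query=$1"
}

# Usage:
#   qlever_query 'SELECT ?x WHERE {?x <rel> ?y}'
```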