Querying
Once the indexes are built, you can set up the web app or use MG4J's command-line query class, it.unimi.di.mg4j.query.Query, to query them. If you prefer the command line, you may find the query-index.sh script helpful.
See the MG4J Query documentation for an explanation of MG4J syntax.
To avoid storing full Resource URLs and BNodes for each document, all Resource URLs and BNodes are replaced during indexing by their Minimal Perfect Hash value, the ResourceId. These values are prefixed with a user-defined string to distinguish them from regular text terms; the default prefix is @. The web app has a query preprocessing step that converts {resource} or {_:BNodeCode} substrings to their corresponding @ResourceId, but querying from the command line still requires that the user know the ResourceId.
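The preprocessing step described above can be pictured with a minimal sketch. This is not the web app's actual code: the `RESOURCE_IDS` dictionary is a hypothetical stand-in for the Minimal Perfect Hash lookup, the IDs in it are made up, and `replace_resources` is an illustrative name.

```python
import re

# Hypothetical lookup table standing in for the Minimal Perfect Hash.
# The IDs below are invented for illustration only.
RESOURCE_IDS = {
    "http://schema.org/Article": 42,
    "_:node1": 7,
}

PREFIX = "@"  # the default prefix mentioned above


def replace_resources(query, resource_ids=RESOURCE_IDS, prefix=PREFIX):
    """Replace each {resource} or {_:BNodeCode} with prefix + ResourceId."""

    def substitute(match):
        key = match.group(1)
        if key not in resource_ids:
            raise KeyError(f"unknown resource: {key}")
        return f"{prefix}{resource_ids[key]}"

    # Match anything between a pair of braces that contains no nested braces.
    return re.sub(r"\{([^{}]+)\}", substitute, query)


print(replace_resources("type:{http://schema.org/Article} name:job"))
```

From the command line no such substitution happens, so the `@`-prefixed value has to be typed directly.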
### Query Examples
First, let's start with a simple query as it would be entered in the web app:
type:{http://schema.org/Article} name:job
This returns all documents (RDF subjects) with a type (http://www.w3.org/1999/02/22-rdf-syntax-ns#type) of http://schema.org/Article and the text job as part of their http://schema.org/name.
Note that, in addition to the {resource} preprocessing, the web app also lets you use shortened predicate names: type -> http://www.w3.org/1999/02/22-rdf-syntax-ns#type, etc. You may also have noticed that the vertical index names are derived from the predicate URLs, e.g. http://schema.org/Article -> http_schema_org_Article.
So, assuming that the ResourceId of http://schema.org/Article is 42, the same query from the command line would be:
http_www_w3_org_1999_02_22_rdf_syntax_ns_type:@42 http_schema_org_name:job
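As a rough illustration of how the field names in that command-line query relate to the short names used in the web app, here is a small sketch. It rests on assumptions: the rule in `index_name` (collapse every run of non-alphanumeric characters into one underscore) is inferred only from the two examples on this page, the `SHORT_NAMES` table is hypothetical, and splitting the query on whitespace ignores the richer MG4J query syntax.

```python
import re

# Hypothetical short-name table; only these two entries are implied
# by the examples on this page.
SHORT_NAMES = {
    "type": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    "name": "http://schema.org/name",
}


def index_name(predicate_url):
    """Collapse each run of non-alphanumeric characters into one underscore.

    Assumed rule, inferred from the examples above.
    """
    return re.sub(r"[^A-Za-z0-9]+", "_", predicate_url)


def expand_fields(query, short_names=SHORT_NAMES):
    """Rewrite the short field name in each 'field:term' token to its index name."""
    rewritten = []
    for token in query.split():
        field, sep, term = token.partition(":")
        if sep and field in short_names:
            token = index_name(short_names[field]) + ":" + term
        rewritten.append(token)
    return " ".join(rewritten)


print(index_name("http://schema.org/Article"))
print(expand_fields("type:@42 name:job"))
```

Under these assumptions the second call reproduces the command-line query shown above.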
Currently, querying is quite cumbersome (especially from the command line). There are a few things we need to change to make it more usable. Two of the most important are indexing of range types (numbers, dates, currencies, etc.) and the ability to do simple join operations.