Solr Query Parser
This document explains how to register the SIREn query parser within Solr and describes its features.
The SIREn query parser provides a common interface for issuing keyword queries, NTriple queries, filter queries, or a combination of these over a SIREn field. It provides a set of parameters to change its behaviour.
The SIREn query parser must be registered in the solrconfig.xml file like this:
<!-- Example of Registration of the siren query parser. -->
<queryParser name="siren" class="org.sindice.siren.solr.SirenQParserPlugin"/>
In this example, the query parser is named siren and will be accessible by the SIREn request handler that we will define next.
Next, you can define a Solr request handler, which will also be named siren, for the SIREn query parser like this:
<requestHandler name="siren" class="solr.StandardRequestHandler">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="defType">siren</str>
    <str name="echoParams">explicit</str>
    <!-- Disable field query in keyword parser -->
    <str name="disableField">true</str>
    <str name="qf">
      ntriple^1.0 url^1.2
    </str>
    <str name="nqf">
      ntriple^1.0
    </str>
    <!-- the NTriple query multi-field operator:
         - disjunction: the query must match in at least one of the fields
         - scattered: each NTriple query pattern must match in at least one of the fields
    -->
    <str name="nqfo">scattered</str>
    <str name="fl">id</str>
  </lst>
</requestHandler>
The defType parameter refers to the name of the query parser to use. In this example, the name of the query parser we want to use is siren, as defined in the previous section.
A second important parameter is disableField. When this parameter is set to true, Lucene's field query syntax is disabled in order to avoid ambiguity with QNames. This is only necessary if you are using the QNameFilterFactory.
The qf and nqf parameters allow you to define a list of fields and the "boosts" to associate with each of them when building SIREn queries from the user's query. The query is automatically expanded to all of these fields. The supported format is fieldOne^2.3 fieldTwo fieldThree^0.4, which indicates that fieldOne has a boost of 2.3, fieldTwo has the default boost, and fieldThree has a boost of 0.4. Matches in fieldOne are therefore much more significant than matches in fieldTwo, which are in turn more significant than matches in fieldThree.
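As a minimal sketch, the boost format described above could be written in the request handler defaults as follows. The names fieldOne, fieldTwo, and fieldThree are the placeholder names from the format description, not fields of a real schema; replace them with your own SIREn fields.
<!-- Sketch only: placeholder field names, not part of a real schema. -->
<str name="qf">
  fieldOne^2.3 fieldTwo fieldThree^0.4
</str>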
The nqfo parameter allows you to change the behaviour of the NTriple query parser. In certain cases, when you have multiple SIREn fields, you might want to distribute the NTriple query across these fields. With the disjunction option, the whole NTriple query must match in at least one of the fields. With the scattered option, every pattern of the NTriple query must match in at least one field, but each pattern is allowed to match in a different field.
For example, imagine you have a document with two NTriple fields, explicit and implicit:
<doc>
<field name="explicit">
<http://renaud.delbru.fr/rdf/foaf#me> <http://xmlns.com/foaf/0.1/name> "Renaud Delbru" .
</field>
<field name="implicit">
<http://renaud.delbru.fr/rdf/foaf#me> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
</field>
</doc>
The following NTriple query
* <foaf:name> "renaud delbru"
AND
* <rdf:type> <foaf:person>
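will not match the document with the disjunction option, because neither field alone satisfies both patterns. It will match the document with the scattered option, because the first pattern matches in the explicit field and the second in the implicit field.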
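For the example above to apply, the NTriple query has to be expanded over both fields. The following is a minimal sketch of the relevant request handler defaults, assuming that nqf is the parameter listing the fields searched by NTriple queries (as the handler definition earlier suggests) and that explicit and implicit are declared as SIREn fields in your schema. The boost values are illustrative.
<!-- Sketch only: assumes explicit and implicit are SIREn fields in the schema. -->
<lst name="defaults">
  <str name="defType">siren</str>
  <!-- Expand NTriple queries over both fields, with equal boosts -->
  <str name="nqf">
    explicit^1.0 implicit^1.0
  </str>
  <!-- Allow each NTriple pattern to match in a different field -->
  <str name="nqfo">scattered</str>
</lst>
Changing the nqfo value to disjunction in this sketch would require both patterns to match within the same field, so the example document would no longer be returned.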