rdelbru edited this page Dec 2, 2011 · 15 revisions

FAQ

What is SIREn?

SIREn is an extension for Apache Lucene and Solr. It adds new features to Lucene and Solr for processing and searching highly heterogeneous semi-structured data (e.g., RDF). In essence, SIREn adds a new "Field Type" with a set of specific tools such as Analyzers, Query Operators and a Query Parser. If you are looking for a way to:

  • have a real schema-less solution, i.e., you don't have to define all of your fields ahead of time, without penalties on the system performance;
  • search efficiently over millions of fields;
  • create a Lucene Document containing sub-elements (i.e., nested child elements);

then SIREn might be a solution for you.

SIREn extends Lucene and Solr, meaning that you can still use all the features of Solr in conjunction with the features provided by SIREn.

What query types does SIREn currently support?

As SIREn introduces a new "Field Type" with a different data model than what can be found in Lucene, SIREn needs its own implementation of each query type supported by Lucene. Currently, most of the core query types that can be found in Lucene have been implemented for SIREn. The table below summarises the current status.

Query Type               | SIREn | Lucene
Boolean Query            | Yes   | Yes
Phrase Query             | Yes   | Yes
Proximity (Span) Query   | No    | Yes
Wildcard Query           | Yes   | Yes
Prefix Query             | Yes   | Yes
Fuzzy Query              | Yes   | Yes
Range Query              | Yes   | Yes
Numeric Range Query      | Yes   | Yes

In addition, SIREn provides new query types such as Tuple Query and Cell Query. These new query types allow more complex "structured queries" than Lucene offers. For example, with these query types it is possible to search efficiently over an unlimited number of fields, or to query nested child elements.

The SIREn query types are compatible with the Lucene Boolean query type, i.e., you can combine SIREn query types using the Lucene BooleanQuery.

In the future, SIREn will offer new query types that are similar to XPath, such as the Parent Child query types.

Does SIREn support Ranked Search?

Yes, SIREn returns a list of results that are automatically ranked based on their relevance to your query.

Does SIREn support Faceted Search?

Yes, you can create arbitrary facets by using SIREn queries with the Solr Query Faceting feature.
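As a rough illustration, Solr query faceting is driven by the `facet` and `facet.query` request parameters, with one `facet.query` per facet. The sketch below only builds such a request URL. The endpoint and the query strings are placeholders chosen for this example; they are not SIREn query syntax, which this page does not describe.

```python
from urllib.parse import urlencode

def build_facet_url(base_url, main_query, facet_queries):
    """Build a Solr select URL that requests one count per facet query."""
    # rows=0 asks Solr for the facet counts only, without the matching documents.
    params = [("q", main_query), ("facet", "true"), ("rows", "0")]
    # Solr accepts the facet.query parameter multiple times, once per facet.
    params += [("facet.query", fq) for fq in facet_queries]
    return base_url + "?" + urlencode(params)

# Hypothetical endpoint and placeholder query strings (assumptions for illustration).
url = build_facet_url(
    "http://localhost:8983/solr/select",
    "PLACEHOLDER_MAIN_QUERY",
    ["PLACEHOLDER_FACET_A", "PLACEHOLDER_FACET_B"],
)
print(url)
```

Each placeholder would be replaced by a query that the configured query parser understands; Solr then returns one count per `facet.query` in the response.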

Does SIREn support Highlighting?

Yes, SIREn supports highlighting.

Does SIREn support sorting on a particular value?

At the moment, SIREn does not support sorting on a particular value of a SIREn field. This might be supported in a future release. However, you can still sort on a regular Lucene field.

Is SIREn language agnostic?

Yes. As with Lucene, language-agnostic search can be achieved by carefully designing your indexing and querying analysis pipeline, e.g., by using appropriate word stemming filters.

In addition, SIREn provides more flexibility than Lucene/Solr for such a task. In Lucene, a field is restricted to a single analyzer. In SIREn, you can associate one analyzer per field value, i.e., a single field can have more than one analyzer. You can then have one field with multiple values, each in a different language, and configure SIREn to pick the analyzer based on the language of each value.
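The idea of choosing an analyzer per value can be sketched in a few lines. This is a conceptual illustration in plain Python, not SIREn's API or configuration: the analyzer functions, the `ANALYZERS` table, and the `(language, text)` value format are all assumptions made for the example, and the "stemming" is deliberately naive.

```python
import re

def default_analyzer(text):
    """Fallback pipeline: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())

def english_analyzer(text):
    """Toy English pipeline: tokenize, then strip a trailing plural 's'."""
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t
            for t in default_analyzer(text)]

def french_analyzer(text):
    """Toy French pipeline: tokenize, then drop a few articles."""
    stop_words = {"le", "la", "les", "un", "une", "des"}
    return [t for t in default_analyzer(text) if t not in stop_words]

# Hypothetical mapping from language tag to analyzer.
ANALYZERS = {"en": english_analyzer, "fr": french_analyzer}

def analyze_field(values):
    """Analyze each (language_tag, text) value of one field with its own analyzer."""
    return [ANALYZERS.get(lang, default_analyzer)(text) for lang, text in values]

# One field holding three values in three languages.
labels = [
    ("en", "Search engines"),
    ("fr", "Les moteurs de recherche"),
    ("de", "Suchmaschinen"),
]
result = analyze_field(labels)
print(result)
```

The point of the design is that the dispatch happens per value rather than per field, so values in languages without a dedicated analyzer (here, German) still fall back to a sensible default.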