Eliminate duplicate words components for Apache Lucene/Solr

Please use the following field type definitions.

Remove duplicate words

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="org.apache.lucene.EliminateDuplicateFilterFactory" />
      </analyzer>
    </fieldType>

Result

    Input                            Output
    text word word text word word    text word

Custom PositionFilterFactory

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="org.apache.lucene.PositionFilterFactory" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

Result

    Input                            Output
    text word word text word word    text word
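
To make the Result examples above concrete, here is a minimal Java sketch of the input/output behavior they describe: each word is kept the first time it appears and dropped on every later occurrence. This is only an illustration under stated assumptions, not the code of the filters in this project. The class `DedupSketch` and the method `eliminateDuplicates` are hypothetical names, and plain whitespace splitting stands in for `solr.StandardTokenizerFactory`.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class DedupSketch {

    // Hypothetical helper (not part of Lucene/Solr or this project):
    // keeps the first occurrence of each word and preserves input order.
    // Assumption: whitespace splitting approximates the tokenizer.
    static String eliminateDuplicates(String input) {
        Set<String> seen = new LinkedHashSet<>();
        for (String token : input.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                seen.add(token); // a repeated word is ignored by the set
            }
        }
        return String.join(" ", seen);
    }

    public static void main(String[] args) {
        // Same input as the Result examples
        System.out.println(eliminateDuplicates("text word word text word word"));
        // prints "text word"
    }
}
```

A `LinkedHashSet` is used here because it drops repeats while keeping insertion order, which matches the order of the words in the Output column.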