When does ElasticSearch not return all relevant vectors? #2445

a3957273 · 2023-08-08T11:01:40Z


The README mentions:

Qdrant enables JSON payloads to be associated with vectors, providing both storage and filtering based on payload values. It supports various combinations of should, must, and must_not conditions, ensuring retrieval of all relevant vectors unlike ElasticSearch post-filtering.

Can you clarify what this means? Under what circumstance does ElasticSearch post-filtering not return all relevant vectors?

timvisee (Maintainer) · 2023-08-08T12:31:06Z

Here's a simplified explanation:

In Qdrant, payload filtering is integrated into the graph search: when a query includes a filter, the filter is applied while the graph is traversed. Every vector found during the traversal therefore matches the filter, so all retrieved points are relevant.

In ElasticSearch, filtering is not part of the graph traversal. ES tries to find the best results in the whole collection from the graph, without taking filters into account, so many of the retrieved points may not be relevant. The retrieved results are then filtered afterwards to drop all points that do not match the filter. This gives lower-quality results.

Since both use a form of nearest-neighbor search, it is important that matching (relevant) points are found during the graph traversal itself. If the traversal does not take the filter into account, other points that match the filter well may never be reached.
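To make the difference concrete, here is a minimal toy sketch in plain Python. It is not how Qdrant or ElasticSearch are implemented: an exact nearest-neighbor sort stands in for the graph traversal, the vectors are one-dimensional, and the names `nearest`, `post_filter_search`, and `filtered_search` are made up for illustration.

```python
from typing import Callable

# (id, 1-D vector, payload); a toy stand-in for stored points.
Point = tuple[int, float, dict]


def nearest(points: list[Point], query: float, k: int) -> list[Point]:
    """Exact k-nearest search, used here as a stand-in for graph traversal."""
    return sorted(points, key=lambda p: abs(p[1] - query))[:k]


def post_filter_search(
    points: list[Point], query: float, k: int, matches: Callable[[dict], bool]
) -> list[int]:
    # Step 1: pick the k nearest points while ignoring the filter.
    candidates = nearest(points, query, k)
    # Step 2: only afterwards drop candidates that fail the filter.
    return [p[0] for p in candidates if matches(p[2])]


def filtered_search(
    points: list[Point], query: float, k: int, matches: Callable[[dict], bool]
) -> list[int]:
    # Only points that pass the filter are ever considered as candidates.
    eligible = [p for p in points if matches(p[2])]
    return [p[0] for p in nearest(eligible, query, k)]


# Ten points on a line; only the three farthest from the query are "red".
points = [(i, float(i), {"color": "red" if i >= 7 else "blue"}) for i in range(10)]


def is_red(payload: dict) -> bool:
    return payload["color"] == "red"


post = post_filter_search(points, query=0.0, k=3, matches=is_red)
filt = filtered_search(points, query=0.0, k=3, matches=is_red)

print(post)  # [] : the three nearest points are all blue and get dropped
print(filt)  # [7, 8, 9] : the three nearest red points
```

In this toy setup the post-filtering variant returns nothing even though three matching points exist, because none of them were among the unfiltered top 3.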

Please read this article for more details on this: https://qdrant.tech/articles/filtrable-hnsw/

This problem occurs because a (graph based) index is used for searching. This wouldn't happen with full-scan search, but that is very slow on large collections.
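As a rough illustration of that trade-off, the sketch below does an exact filtered full scan in plain Python. The function name `full_scan_search`, the squared Euclidean distance, and the toy data are assumptions made for this example only. The result is always correct, but every stored point is visited, so the cost grows linearly with the collection size.

```python
import heapq
from typing import Callable, Sequence


def full_scan_search(
    points: Sequence[tuple[int, tuple[float, ...], dict]],
    query: tuple[float, ...],
    k: int,
    matches: Callable[[dict], bool],
) -> tuple[list[int], int]:
    """Exact filtered search; returns (ids of the k nearest matches, points visited)."""
    visited = 0
    scored = []
    for point_id, vector, payload in points:
        visited += 1  # every point is touched, with or without a filter match
        if matches(payload):
            # Squared Euclidean distance is enough for ranking.
            dist = sum((a - b) ** 2 for a, b in zip(vector, query))
            scored.append((dist, point_id))
    top = heapq.nsmallest(k, scored)
    return [point_id for _, point_id in top], visited


# 1000 toy points along the x-axis; the filter keeps even ids only.
points = [(i, (float(i), 0.0), {"even": i % 2 == 0}) for i in range(1000)]

ids, visited = full_scan_search(points, (0.0, 0.0), 3, lambda p: p["even"])
print(ids)      # [0, 2, 4]
print(visited)  # 1000
```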

Note that it is called "post-filtering" because filtering is not part of the graph traversal; it is done after the traversal instead.

I hope that makes sense. 😄
