[Latency Improvement Opportunity] Improve filtered search performance when the filter matches all the documents in the segment #1387
Labels
Enhancements
Increases software capabilities beyond original client specifications
Description
Currently, when a filtered vector search runs, we first run the filter to find the matching doc IDs and then decide, for each segment, whether to do an ANN search or an exact search. The resulting filter IDs are passed to the ANN search so that filtering happens during the search.
But in the case where, say, the filter matches all the doc IDs in the segment, we can avoid a few things:
Of these, 2 is very cheap (it is just a handful of if/else checks), but 1 can add latency (the exact amount has not been measured yet) if the segment is large enough, because we need to iterate over the bitmap to collect the doc IDs and pass them all the way down to the C++ layer. Some conversions also happen in the C++ layer; ref: https://github.com/opensearch-project/k-NN/blob/main/jni/src/faiss_wrapper.cpp#L465-L490.
Solution
We can avoid all this computation by adding a simple check that compares the segment's max doc with the number of docs set in the filter bitmap. If the two are equal, we can just do a plain ANN search without passing any filter IDs.
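A minimal sketch of that check, using `java.util.BitSet` as a stand-in for the filter bitmap. The class and method names here (`FilterShortcut`, `matchesAllDocs`, `filterIdsForAnn`) are hypothetical and are not taken from the k-NN code base; returning `null` to mean "no filter needed" is likewise just an assumption for illustration.

```java
import java.util.BitSet;

final class FilterShortcut {

    // Hypothetical helper: true when the filter matched every doc in the segment,
    // i.e. the number of set bits equals the segment's max doc.
    static boolean matchesAllDocs(BitSet filterBits, int maxDoc) {
        return filterBits.cardinality() == maxDoc;
    }

    // Returns null to signal "run a plain ANN search" (assumed convention),
    // otherwise the matching doc IDs that would be passed down to the native layer.
    static int[] filterIdsForAnn(BitSet filterBits, int maxDoc) {
        if (matchesAllDocs(filterBits, maxDoc)) {
            // Skip iterating the bitmap and copying IDs entirely.
            return null;
        }
        // Only pay the iteration cost when the filter actually excludes docs.
        return filterBits.stream().toArray();
    }
}
```

The comparison uses the bitmap's cardinality rather than its length, so it costs one count and avoids materializing the ID array in the match-all case.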