-
Notifications
You must be signed in to change notification settings - Fork 0
[MOD-10184] add some optimizations to simplify some patterns in iterators #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 6371_base
Are you sure you want to change the base?
Conversation
…tors (RediSearch#6371) * add some optimizations to simplify some patterns in iterators * add optimization for HybridKNNIndexIterator * fix test * fix couple of tests * query parser handle double negation * add test for double negation in Query Parser * move implementation to the new Iterators constructors * move work to actual constructor * test the reductions in the iterators * every union in Not subtree can run in quickMode * improve PR coverage * first set of PR review * second iteration reviews. Intersection iterator and hybrid tests * handle sanitizer error * last comment
| } else if (hParams->childIt && hParams->childIt->type == WILDCARD_ITERATOR) { | ||
| //TODO: When new Iterator API (consider READER_ITERATOR with isWildcard flag) | ||
| hParams->childIt->Free(hParams->childIt); | ||
| hParams->qParams.searchMode = VECSIM_STANDARD_KNN; | ||
| hParams->childIt = NULL; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[BestPractice]
This check for WILDCARD_ITERATOR does not handle optimized wildcard iterators, which are of type READ_ITERATOR with a special flag. This can lead to missed optimizations for hybrid queries.
You've introduced IsWildcardIterator() to solve this, and it's used in the other new reducer functions. This reducer should also use IsWildcardIterator(hParams->childIt) to correctly handle all types of wildcard iterators. You'll also need to include iterators/wildcard_iterator.h at the top of the file.
| } else if (hParams->childIt && hParams->childIt->type == WILDCARD_ITERATOR) { | |
| //TODO: When new Iterator API (consider READER_ITERATOR with isWildcard flag) | |
| hParams->childIt->Free(hParams->childIt); | |
| hParams->qParams.searchMode = VECSIM_STANDARD_KNN; | |
| hParams->childIt = NULL; | |
| } | |
| } else if (hParams->childIt && IsWildcardIterator(hParams->childIt)) { | |
| //TODO: When new Iterator API (consider READER_ITERATOR with isWildcard flag) | |
| hParams->childIt->Free(hParams->childIt); | |
| hParams->qParams.searchMode = VECSIM_STANDARD_KNN; | |
| hParams->childIt = NULL; | |
| } |
⚡ Committable suggestion
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
| excepted_res = ['Type', 'INTERSECT', 'Counter', 0, 'Child iterators', [ | ||
| None, | ||
| ['Type', 'TEXT', 'Term', 'world', 'Counter', 0, 'Size', 1], | ||
| ['Type', 'TEXT', 'Term', 'hello', 'Counter', 0, 'Size', 2]]] | ||
| res3 = env.cmd('FT.PROFILE', 'idx', 'SEARCH', 'QUERY', 'hello new world') | ||
| None, | ||
| ['Type', 'TEXT', 'Term', 'world', 'Counter', 0, 'Size', 1], | ||
| ['Type', 'TEXT', 'Term', 'hello', 'Counter', 0, 'Size', 2]]] | ||
|
|
||
| # excepted_res = ['Type', 'EMPTY', 'Counter', 0] (After Optimization and new iterators are used, it should change to this) | ||
|
|
||
| res3 = env.cmd('FT.PROFILE', 'idx', 'SEARCH', 'QUERY', 'hello new world') | ||
| env.assertEqual(res3[1][1][0][3], excepted_res) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TestCoverage]
The comment here indicates that with the new optimizations, the expected result should be an EMPTY iterator profile, because an intersection with a non-existent term (new) should be reduced to an empty set. The new IntersectionIteratorReducer implements this logic.
However, the test still asserts the old, unoptimized behavior. Please update this test (and a similar one in test_profile.py) to assert the new, correct, optimized behavior. This will ensure the optimizations are working as intended and prevent future regressions.
| ['Type', 'INTERSECT', 'Counter', 1, 'Child iterators', | ||
| [['Type', 'TEXT', 'Term', 'foo', 'Counter', 1, 'Size', 1], | ||
| ['Type', 'NOT', 'Counter', 1, 'Child iterator', | ||
| ['Type', 'NOT', 'Counter', 1, 'Child iterator', | ||
| ['Type', 'EMPTY', 'Counter', 0]]]], | ||
| #['Type', 'WILDCARD', 'Counter', 1]]], (After Optimization and new iterators are used, NOT ITERATOR should change to this) | ||
| 'Result processors profile', | ||
| [['Type', 'Index', 'Counter', 1], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[TestCoverage]
The comment here indicates that with the new optimizations, the NOT iterator should be reduced to a WILDCARD iterator because its child is an EMPTY iterator (-@t:baz where baz doesn't exist). The new NotIteratorReducer implements this logic.
However, the test still asserts the old, unoptimized behavior. Please update this test to assert the new, correct, optimized behavior to ensure the optimizations are working as intended.
Iterator Tree Simplification and Query Parser Optimizations
This pull request introduces significant optimizations to the query iterator tree construction and the query parser. The primary goal is to simplify complex iterator combinations into more efficient forms (e.g., replacing an intersection with an empty child with an empty iterator) and to optimize query parsing by eliminating redundant negations. These changes aim to improve query performance and reduce memory overhead by creating more streamlined iterator execution plans.
Key Changes
• Implemented iterator reduction rules for
IntersectionIterator,UnionIterator,NotIterator,OptionalIterator, andHybridVectorIteratorto simplify the iterator tree based on child types (e.g., empty, null, wildcard).• Introduced a
IsWildcardIteratorhelper function and anisWildcardflag withinInvIndIteratorto correctly identify iterators that behave like wildcards, enabling more aggressive optimizations.• Optimized both v1 and v2 query parsers to eliminate double negations (e.g., ```NOT
(NOT``(A))becomesA`), improving query tree efficiency.• Added a ``notSubtree`` flag to ``QueryEvalCtx`` to influence the behavior of ``UnionIterator`` when it is part of a ``NOT`` subtree, allowing for `quick exit` optimizations.
• Updated ``NewWildcardIterator_NonOptimized`` and ``NewInvIndIterator_GenericQuery`` to accept a `weight` parameter, ensuring consistent weight propagation.
• Extensively updated C++ unit tests across multiple iterator types to validate the new reduction logic and ensure correct memory management and behavior in various edge cases.
Affected Areas
• src/iterators/ (
intersection_iterator.c,union_iterator.c,not_iterator.c,optional_iterator.c,wildcard_iterator.c,inverted_index_iterator.c,inverted_index_iterator.h,wildcard_iterator.h)• src/
query_parser/ (v1/parser.c, v1/parser.y, v2/parser.c, v2/parser.y)• src/query.c
• src/
hybrid_reader.c• src/
query_ctx.h• tests/cpptests/ (
test_cpp_iterator_intersection.cpp,test_cpp_iterator_union.cpp,test_cpp_iterator_not.cpp,test_cpp_iterator_optional.cpp,test_cpp_index.cpp,test_cpp_query.cpp,test_cpp_iterator_index.cpp, micro-benchmarks/benchmark_wildcard_non_optimized_iterator.cpp, micro-benchmarks/benchmark_index_iterator.cpp)• tests/pytests/ (
test_issues.py,test_profile.py)This summary was automatically generated by @propel-code-bot