Skip to content

Conversation

@tonyd3
Copy link

@tonyd3 tonyd3 commented Aug 5, 2025

Iterator Tree Simplification and Query Parser Optimizations

This pull request introduces significant optimizations to the query iterator tree construction and the query parser. The primary goal is to simplify complex iterator combinations into more efficient forms (e.g., replacing an intersection with an empty child with an empty iterator) and to optimize query parsing by eliminating redundant negations. These changes aim to improve query performance and reduce memory overhead by creating more streamlined iterator execution plans.

Key Changes

• Implemented iterator reduction rules for IntersectionIterator, UnionIterator, NotIterator, OptionalIterator, and HybridVectorIterator to simplify the iterator tree based on child types (e.g., empty, null, wildcard).
• Introduced a IsWildcardIterator helper function and an isWildcard flag within InvIndIterator to correctly identify iterators that behave like wildcards, enabling more aggressive optimizations.
• Optimized both v1 and v2 query parsers to eliminate double negations (e.g., ```NOT(NOT``(A))becomesA`), improving query tree efficiency.
• Added a ``notSubtree`` flag to ``QueryEvalCtx`` to influence the behavior of ``UnionIterator`` when it is part of a ``NOT`` subtree, allowing for `quick exit` optimizations.
• Updated ``NewWildcardIterator_NonOptimized`` and ``NewInvIndIterator_GenericQuery`` to accept a `weight` parameter, ensuring consistent weight propagation.
• Extensively updated C++ unit tests across multiple iterator types to validate the new reduction logic and ensure correct memory management and behavior in various edge cases.

Affected Areas

• src/iterators/ (intersection_iterator.c, union_iterator.c, not_iterator.c, optional_iterator.c, wildcard_iterator.c, inverted_index_iterator.c, inverted_index_iterator.h, wildcard_iterator.h)
• src/query_parser/ (v1/parser.c, v1/parser.y, v2/parser.c, v2/parser.y)
• src/query.c
• src/hybrid_reader.c
• src/query_ctx.h
• tests/cpptests/ (test_cpp_iterator_intersection.cpp, test_cpp_iterator_union.cpp, test_cpp_iterator_not.cpp, test_cpp_iterator_optional.cpp, test_cpp_index.cpp, test_cpp_query.cpp, test_cpp_iterator_index.cpp, micro-benchmarks/benchmark_wildcard_non_optimized_iterator.cpp, micro-benchmarks/benchmark_index_iterator.cpp)
• tests/pytests/ (test_issues.py, test_profile.py)


This summary was automatically generated by @propel-code-bot

…tors (RediSearch#6371)

* add some optimizations to simplify some patterns in iterators

* add optimization for HybridKNNIndexIterator

* fix test

* fix couple of tests

* query parser handle double negation

* add test for double negation in Query Parser

* move implementation to the new Iterators constructors

* move work to actual constructor

* test the reductions in the iterators

* every union in Not subtree can run in quickMode

* improve PR coverage

* first set of PR review

* second iteration reviews. Intersection iterator and hybrid tests

* handle sanitizer error

* last comment
@tonyd3 tonyd3 changed the title [MOD-10184] [MOD-10184] add some optimizations to simplify some patterns in iterators Aug 5, 2025
Comment on lines +450 to +455
} else if (hParams->childIt && hParams->childIt->type == WILDCARD_ITERATOR) {
//TODO: When new Iterator API (consider READER_ITERATOR with isWildcard flag)
hParams->childIt->Free(hParams->childIt);
hParams->qParams.searchMode = VECSIM_STANDARD_KNN;
hParams->childIt = NULL;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[BestPractice]

This check for WILDCARD_ITERATOR does not handle optimized wildcard iterators, which are of type READ_ITERATOR with a special flag. This can lead to missed optimizations for hybrid queries.

You've introduced IsWildcardIterator() to solve this, and it's used in the other new reducer functions. This reducer should also use IsWildcardIterator(hParams->childIt) to correctly handle all types of wildcard iterators. You'll also need to include iterators/wildcard_iterator.h at the top of the file.

Suggested change
} else if (hParams->childIt && hParams->childIt->type == WILDCARD_ITERATOR) {
//TODO: When new Iterator API (consider READER_ITERATOR with isWildcard flag)
hParams->childIt->Free(hParams->childIt);
hParams->qParams.searchMode = VECSIM_STANDARD_KNN;
hParams->childIt = NULL;
}
} else if (hParams->childIt && IsWildcardIterator(hParams->childIt)) {
//TODO: When new Iterator API (consider READER_ITERATOR with isWildcard flag)
hParams->childIt->Free(hParams->childIt);
hParams->qParams.searchMode = VECSIM_STANDARD_KNN;
hParams->childIt = NULL;
}

Committable suggestion

Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Copy Context

Comment on lines 182 to 190
excepted_res = ['Type', 'INTERSECT', 'Counter', 0, 'Child iterators', [
None,
['Type', 'TEXT', 'Term', 'world', 'Counter', 0, 'Size', 1],
['Type', 'TEXT', 'Term', 'hello', 'Counter', 0, 'Size', 2]]]
res3 = env.cmd('FT.PROFILE', 'idx', 'SEARCH', 'QUERY', 'hello new world')
None,
['Type', 'TEXT', 'Term', 'world', 'Counter', 0, 'Size', 1],
['Type', 'TEXT', 'Term', 'hello', 'Counter', 0, 'Size', 2]]]

# excepted_res = ['Type', 'EMPTY', 'Counter', 0] (After Optimization and new iterators are used, it should change to this)

res3 = env.cmd('FT.PROFILE', 'idx', 'SEARCH', 'QUERY', 'hello new world')
env.assertEqual(res3[1][1][0][3], excepted_res)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[TestCoverage]

The comment here indicates that with the new optimizations, the expected result should be an EMPTY iterator profile, because an intersection with a non-existent term (new) should be reduced to an empty set. The new IntersectionIteratorReducer implements this logic.

However, the test still asserts the old, unoptimized behavior. Please update this test (and a similar one in test_profile.py) to assert the new, correct, optimized behavior. This will ensure the optimizations are working as intended and prevent future regressions.

Copy Context

Comment on lines 431 to 437
['Type', 'INTERSECT', 'Counter', 1, 'Child iterators',
[['Type', 'TEXT', 'Term', 'foo', 'Counter', 1, 'Size', 1],
['Type', 'NOT', 'Counter', 1, 'Child iterator',
['Type', 'NOT', 'Counter', 1, 'Child iterator',
['Type', 'EMPTY', 'Counter', 0]]]],
#['Type', 'WILDCARD', 'Counter', 1]]], (After Optimization and new iterators are used, NOT ITERATOR should change to this)
'Result processors profile',
[['Type', 'Index', 'Counter', 1],

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[TestCoverage]

The comment here indicates that with the new optimizations, the NOT iterator should be reduced to a WILDCARD iterator because its child is an EMPTY iterator (-@t:baz where baz doesn't exist). The new NotIteratorReducer implements this logic.

However, the test still asserts the old, unoptimized behavior. Please update this test to assert the new, correct, optimized behavior to ensure the optimizations are working as intended.

Copy Context

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants