Refactor: Replace IndexIterator with unified QueryIterator API #1
base: master
static void insertResultToHeap_Aggregate(HybridIterator *hr, RSIndexResult *child_res,
                                         RSIndexResult *vec_res, double *upper_bound) {

  AggregateResult_AddChild(res, vec_res);
  AggregateResult_AddChild(res, child_res);
  RSIndexResult *hit = IndexResult_DeepCopy(res);
  IndexResult_ResetAggregate(res); // Reset the current result.
  ResultMetrics_Add(hit, hr->base.ownKey, RS_NumVal(vec_res->data.num.value));
  RSIndexResult *res = NewHybridResult();
[CriticalError]
Potential memory leak in insertResultToHeap_Aggregate
The function creates copies with IndexResult_DeepCopy() and marks the result as a copy, but if mmh_insert() or mmh_exchange_max() fails, the copied results could leak. Consider adding error handling, for example:
static void insertResultToHeap_Aggregate(HybridIterator *hr, RSIndexResult *child_res,
                                         RSIndexResult *vec_res, double *upper_bound) {
  RSIndexResult *res = NewHybridResult();
  if (!res) return; // Handle allocation failure
  RSIndexResult *vec_copy = IndexResult_DeepCopy(vec_res);
  RSIndexResult *child_copy = IndexResult_DeepCopy(child_res);
  if (!vec_copy || !child_copy) {
    // Free only what was actually allocated; this avoids relying on
    // IndexResult_Free() accepting NULL.
    IndexResult_Free(res);
    if (vec_copy) IndexResult_Free(vec_copy);
    if (child_copy) IndexResult_Free(child_copy);
    return;
  }
  // From here on, res owns both copies.
  AggregateResult_AddChild(res, vec_copy);
  AggregateResult_AddChild(res, child_copy);
  // ... rest of function
}
IteratorStatus status = iter->SkipTo(iter, startId);

// Continue reading the rest
while (status == ITERATOR_OK) {
[CriticalError]
There's a potential logic error here. If iter->SkipTo(iter, startId) returns ITERATOR_NOTFOUND, the while loop will not be executed, even though the iterator is positioned at the next valid element. This will cause documents added since the fork to be missed by the HLL update.
The loop should execute if status is ITERATOR_OK or ITERATOR_NOTFOUND after the SkipTo call. You can normalize ITERATOR_NOTFOUND to ITERATOR_OK before the loop, as the iterator is already positioned at the correct element to start reading from.
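The normalization suggested above can be sketched against a small stand-in iterator. Only the status names and the `SkipTo` call shape come from the diff; `MockIterator`, `mock_new`, `collect_from`, the `ITERATOR_EOF` value, and the assumption that `SkipTo` leaves the iterator on the first ID at or after the target are hypothetical, chosen to make the behavior visible in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the project's real definitions. */
typedef uint64_t t_docId;
typedef enum { ITERATOR_OK, ITERATOR_NOTFOUND, ITERATOR_EOF } IteratorStatus;

typedef struct MockIterator {
  const t_docId *ids;  /* sorted document IDs */
  size_t len;
  size_t next;         /* index of the next unread element */
  t_docId lastDocId;   /* ID the iterator is currently positioned on */
  IteratorStatus (*Read)(struct MockIterator *it);
  IteratorStatus (*SkipTo)(struct MockIterator *it, t_docId docId);
} MockIterator;

static IteratorStatus mock_read(MockIterator *it) {
  if (it->next >= it->len) return ITERATOR_EOF;
  it->lastDocId = it->ids[it->next++];
  return ITERATOR_OK;
}

/* Assumed semantics: land on the first ID >= docId; report NOTFOUND when
 * that ID is not an exact match, EOF when no such ID exists. */
static IteratorStatus mock_skip_to(MockIterator *it, t_docId docId) {
  while (it->next < it->len && it->ids[it->next] < docId) it->next++;
  if (it->next >= it->len) return ITERATOR_EOF;
  it->lastDocId = it->ids[it->next++];
  return it->lastDocId == docId ? ITERATOR_OK : ITERATOR_NOTFOUND;
}

static MockIterator mock_new(const t_docId *ids, size_t len) {
  assert(ids != NULL || len == 0);
  MockIterator it = {ids, len, 0, 0, mock_read, mock_skip_to};
  return it;
}

/* Collects every ID at or after startId. With normalize == 0 the loop
 * mirrors the reviewed code and is skipped entirely on NOTFOUND. */
static size_t collect_from(MockIterator *it, t_docId startId, int normalize,
                           t_docId *out, size_t cap) {
  size_t n = 0;
  IteratorStatus status = it->SkipTo(it, startId);
  if (normalize && status == ITERATOR_NOTFOUND) {
    status = ITERATOR_OK;  /* already positioned on a valid element */
  }
  while (status == ITERATOR_OK && n < cap) {
    out[n++] = it->lastDocId;
    status = it->Read(it);
  }
  return n;
}
```

With IDs `{2, 5, 9}` and `startId = 3`, the normalized call collects 5 and 9, while the unnormalized call collects nothing, which is the missed-documents case the comment describes.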
IndexResult_ConcatMetrics(*vec_res, child_res); // Pass child metrics, if there are any
ResultMetrics_Add(*vec_res, hr->base.ownKey, RS_NumVal((*vec_res)->data.num.value));
ResultMetrics_Add(*vec_res, hr->ownKey, RS_NumVal((*vec_res)->data.num.value));
[CriticalError]
The HybridIterator struct has its own ownKey field, which shadows the ownKey field in the embedded QueryIterator base. The ownKey is set on base.ownKey during query parsing, but here you are accessing hr->ownKey, which is uninitialized. This will likely lead to a crash or incorrect behavior when adding metrics.
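A reduced sketch of the duplicate-field problem follows. The struct layouts are simplified, and `MetricKey`, `set_own_key`, and `metric_key` are hypothetical names; the only detail taken from the review is that both the embedded base and the outer struct carry a field called `ownKey`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the key type used when adding metrics. */
typedef struct MetricKey { const char *name; } MetricKey;

typedef struct QueryIterator {
  MetricKey *ownKey;   /* the field the parser is said to populate */
} QueryIterator;

typedef struct HybridIterator {
  QueryIterator base;  /* embedded base struct */
  MetricKey *ownKey;   /* same name, separate storage */
} HybridIterator;

/* Models the parser, which only sees the base interface. */
static void set_own_key(QueryIterator *it, MetricKey *key) {
  it->ownKey = key;
}

/* Reads the key through the base, the field that was actually set. */
static const MetricKey *metric_key(const HybridIterator *hr) {
  return hr->base.ownKey;
}
```

After `set_own_key(&hr.base, &key)`, `hr.base.ownKey` points at `key` while `hr.ownKey` keeps whatever it held before (NULL if the struct was zero-initialized, indeterminate otherwise). Dropping the outer field, or always reading through `base`, removes the ambiguity.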
AggregateResult_AddChild(res, IndexResult_DeepCopy(vec_res));
AggregateResult_AddChild(res, IndexResult_DeepCopy(child_res));
res->isCopy = true; // Mark as copy, so when we free it, it will also free its children.
ResultMetrics_Add(res, hr->ownKey, RS_NumVal(vec_res->data.num.value));
Refactor Iterator API and Remove Old Iterators
This pull request introduces a significant refactoring of the core iterator API, replacing the legacy `IndexIterator` with a new, unified `QueryIterator` interface. This change aims to streamline iteration patterns and remove deprecated code, resulting in substantial deletions across various modules. The refactoring also includes the removal of the `ConcurrentSearchCtx` key management, simplifying concurrency handling within the system.

Key Changes
• Replaced `IndexIterator` with `QueryIterator` across core modules, including `inverted_index`, `hybrid_reader`, `geo_index`, and `aggregate`.
• Removed `IndexReader` and its associated functions (`IR_Read`, `IR_SkipTo`, `IR_Free`, `IR_Rewind`) from `inverted_index.c` and `inverted_index.h`.
• Removed `IdListIterator` and its related functions, leading to the deletion of `src/id_list.c`.
• Removed the old `index_iterator.h` header, replaced by `src/iterators/iterator_api.h`.
• Removed `ConcurrentSearchCtx` and `ConcurrentKeyCtx` structures and their management functions, simplifying concurrent key handling.
• Updated `HybridIterator` to fully utilize the new `QueryIterator` interface, including changes to its `Read` and `SkipTo` logic and internal state management.
• Renamed and adapted `geometry/QueryIterator` to `CPPQueryIterator` to align with the new C-style `QueryIterator` base struct.
• Adjusted `DocTable`'s field expiration predicate checks, removing `DocTable_HasExpiration` and `DocTable_VerifyFieldExpirationPredicate` in favor of `DocTable_CheckFieldExpirationPredicate`.
• Modified debug commands (`DumpInvertedIndex`, `DumpNumericIndex`, `DumpTagIndex`) to use the new `QueryIterator` for displaying index contents.
• Updated `AREQ` (Aggregate Request) structure and pipeline building logic to integrate with the new `QueryIterator` as the root iterator.

Affected Areas
• `src/inverted_index/` (`inverted_index.c`, `inverted_index.h`)
• `src/iterators/` (`hybrid_reader.c`, `empty_iterator.c`, `empty_iterator.h`, `iterator_api.h`)
• `src/id_list.c` (deleted)
• `src/index_iterator.h` (deleted)
• `src/index.h` (deleted)
• `src/geometry/` (`query_iterator.cpp`, `query_iterator.hpp`, `geometry_api.cpp`, `geometry_api.h`)
• `src/concurrent_ctx.c`, `src/concurrent_ctx.h`
• `src/debug_commands.c`
• `src/geo_index.c`, `src/geo_index.h`
• `src/doc_table.c`, `src/doc_table.h`
• `src/aggregate/` (`aggregate.h`, `aggregate_request.c`, `aggregate_exec.c`)
• `src/indexer.c`
This summary was automatically generated by @propel-code-bot