-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor scoring to map leaf reader context with results #2271
base: main
Are you sure you want to change the base?
Conversation
We map order of results to order of segments, and finally rely on that order to build top docs. Refactor method to use map.Entry to map leafreader context with results from those leaves. This is required when we want to split segments based on approx search or exact search to reduce rescoring twice by exact search Signed-off-by: Vijayan Balasubramanian <[email protected]>
3504312
to
a8b3c1d
Compare
Do you have a follow up code showing why this is needed? Can't we always get the relation between leaf reader and the result based on their index in the list? |
Sure. #2273 . This is draft PR. I want to test recall before making it ready. |
Thanks. I think this refactoring should be accompanied with the final PR instead of pushing it separately. |
Having multiple independent PR which doesn’t break existing feature helps to get reviewed faster. If I add refactoring and new implementation, it might be hard to find out bugs |
The challenge is that it's difficult to justify this refactoring without the actual PR that requires it. There's always a possibility that the implementation in the subsequent PR may change, in which case this refactoring might end up providing no real benefit. |
Fair enough. How about this? I will backport to 2.x only if both PRs are merged. Will that work? |
I am against merging it in main. Doesn't matter we backport it to 2.x or not. The change is not big so I think it should be okay to have this with following PR together. |
I don't see a reason why it is absolute necessary that this feature cannot be broken into two PR with predefined scope, where each does two different thing. Can you give reason why you are against breaking into two PRs by pointing out how it breaks either build or existing feature? |
Thank you for your effort in keeping the PR small to make the review process easier—I really appreciate that. My concern is that the value of this PR depends heavily on the approval of the subsequent PR, as it primarily supports those changes. If the following PR undergoes modifications, this one may also need to be adjusted accordingly. By the way, I didn't say it will breaks either build or existing feature. |
for (LeafReaderContext leafReaderContext : leafReaderContexts) { | ||
tasks.add(() -> searchLeaf(leafReaderContext, knnWeight, k)); | ||
} | ||
return indexSearcher.getTaskExecutor().invokeAll(tasks); | ||
} | ||
|
||
private List<Map<Integer, Float>> doRescore( | ||
private List<Map.Entry<LeafReaderContext, Map<Integer, Float>>> doRescore( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This data structure is getting complex, can we look into simplifying it in favor of readability?
Description
We map order of results to order of segments, and finally rely on that order to build top docs. Refactor method to use map.Entry to map leaf reader context with results from those leaves.
This is required when we want to split segments based on approx search or exact search to reduce rescoring twice by exact search
Related Issues
Prerequisite for #2215
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.