Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] Use the Lucene Distance Calculation Function in Script Scoring for doing exact search #1287

Closed
wants to merge 4 commits into from

Conversation

TrungBui59
Copy link

Description

Change our implementation to use Lucene for calculating distance

Issues Resolved

#1032

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Nov 3, 2023

Codecov Report

Merging #1287 (dd405eb) into main (4d6cc16) will decrease coverage by 0.06%.
Report is 1 commits behind head on main.
The diff coverage is 87.50%.

@@             Coverage Diff              @@
##               main    #1287      +/-   ##
============================================
- Coverage     85.07%   85.02%   -0.06%     
  Complexity     1210     1210              
============================================
  Files           160      160              
  Lines          4930     4920      -10     
  Branches        449      449              
============================================
- Hits           4194     4183      -11     
- Misses          537      540       +3     
+ Partials        199      197       -2     
Files Coverage Δ
...g/opensearch/knn/plugin/script/KNNScoringUtil.java 92.64% <87.50%> (-6.08%) ⬇️

... and 2 files with indirect coverage changes

@navneet1v
Copy link
Collaborator

@TrungBui59 Can you add the entry for this in Changelog.md file

@@ -294,11 +295,7 @@ public static float lInfNorm(List<Number> queryVector, KNNVectorScriptDocValues
*/
public static float innerProduct(float[] queryVector, float[] inputVector) {
requireEqualDimension(queryVector, inputVector);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check is present in Lucene code, we can remove it just like we removed from l2Squared

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@navneet1v Yes, I would love to start with something small.

Copy link
Collaborator

@navneet1v navneet1v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall code looks fine to me. @jmazanec15 as you are aware more around the distances on both lucene and k-NN side. Can you please review and see if things are intact.

}
try {
cosine = VectorUtil.cosine(queryVector, inputVector);
} catch (Exception e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid catching all exceptions here or catch a more specific exception?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmazanec15 Sure I will change it

@navneet1v
Copy link
Collaborator

@TrungBui59 can you resolve the comments so that we can merge the changes

@navneet1v
Copy link
Collaborator

@TrungBui59 are you still working on this PR?

@navneet1v
Copy link
Collaborator

@TrungBui59 are you still working on this? if not can we close this PR or someone else can pick up the github issue?

@aalbahem
Copy link

@navneet1v I would love to pick up this.

@navneet1v
Copy link
Collaborator

@aalbahem thanks for showing the interest. I can see a PR is already raised for the same: #1699.

Let me know if you would be interested in picking up something else.

@navneet1v
Copy link
Collaborator

Closing this PR due to no further activity from author. Also there is a PR raised for similar feature. Ref: #1699

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants