Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add binary index support for Lucene engine #2292

Merged
merged 1 commit into from
Dec 14, 2024

Conversation

jed326
Copy link
Contributor

@jed326 jed326 commented Nov 27, 2024

Description

Add binary vector support for Lucene engine. For testing I parameterized BinaryIndexIT.java to run with both FAISS and LUCENE engines.

Related Issues

Resolves #1857

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@jed326
Copy link
Contributor Author

jed326 commented Nov 27, 2024

@heemin32 @navneet1v @jmazanec15 could you please help review? Also it looks like the linux build is broken.

@jed326 jed326 force-pushed the binary-vector-lucene branch from 882a672 to 0ab5494 Compare November 27, 2024 23:48
@navneet1v
Copy link
Collaborator

@peterzhuamazon are there changes in the CI image? we are seeing failures due to GlibC

@peterzhuamazon
Copy link
Member

@peterzhuamazon are there changes in the CI image? we are seeing failures due to GlibC

Hi @navneet1v you can follow the guide here to mitigate the issue:

Please let me know if you have more questions, thanks!

@navneet1v
Copy link
Collaborator

@jed326 I raised a PR for the fix: #2303

@navneet1v
Copy link
Collaborator

@jed326 please pull in latest changes from main branch. CI setup fix is done.

@jed326 jed326 force-pushed the binary-vector-lucene branch 3 times, most recently from 0f768ed to 313a1f2 Compare December 12, 2024 04:29
@jed326
Copy link
Contributor Author

jed326 commented Dec 12, 2024

@navneet1v thanks for the feedback, mind taking another look after the latest revisions?

@jed326 jed326 force-pushed the binary-vector-lucene branch from 313a1f2 to a4131fc Compare December 12, 2024 19:56
@jed326
Copy link
Contributor Author

jed326 commented Dec 12, 2024

@navneet1v added KNN9120PerFieldKnnVectorsFormat, etc in the latest revision. Thanks!

navneet1v
navneet1v previously approved these changes Dec 12, 2024
Copy link
Collaborator

@navneet1v navneet1v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me. Just 1 minor comments

@jed326
Copy link
Contributor Author

jed326 commented Dec 13, 2024

@heemin32 @jmazanec15 could one of you help with a second review here? Thanks!

@heemin32
Copy link
Collaborator

LGTM. Could you also add integ test with lucene engine in FilteredSearchBinaryIT and NestedSearchBinaryIT?

@jed326 jed326 force-pushed the binary-vector-lucene branch from 978161b to bd0719f Compare December 14, 2024 00:10
@heemin32 heemin32 requested a review from navneet1v December 14, 2024 00:11
@jed326
Copy link
Contributor Author

jed326 commented Dec 14, 2024

LGTM. Could you also add integ test with lucene engine in FilteredSearchBinaryIT and NestedSearchBinaryIT?

Thanks @heemin32 -- added the same engine parameterization to these two classes as well in the latest revision.

@navneet1v
Copy link
Collaborator

Lets wait for all CIs to complete before we can merge this change.

@jed326 lets complete the benchmarks for Lucene binary vectors, @heemin32 can you please share details on how benchmarks was done for this blog: https://opensearch.org/blog/lower-your-cost-on-opensearch-using-binary-vectors/

and lets kick off the documentation update too for this feature.

@jed326
Copy link
Contributor Author

jed326 commented Dec 14, 2024

Created the docs issue here: opensearch-project/documentation-website#8955

@navneet1v navneet1v merged commit 8005bbf into opensearch-project:main Dec 14, 2024
35 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Dec 14, 2024
Signed-off-by: Jay Deng <[email protected]>
(cherry picked from commit 8005bbf)
@navneet1v navneet1v added the Features Introduces a new unit of functionality that satisfies a requirement label Dec 14, 2024
navneet1v pushed a commit that referenced this pull request Dec 14, 2024
Signed-off-by: Jay Deng <[email protected]>
(cherry picked from commit 8005bbf)

Co-authored-by: Jay Deng <[email protected]>
owenhalpert pushed a commit to owenhalpert/k-NN that referenced this pull request Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Features Introduces a new unit of functionality that satisfies a requirement
Projects
Status: 2.19.0
Development

Successfully merging this pull request may close these issues.

[FEATURE]Binary vector support with Lucene
4 participants