1 parent 49c6097, commit 38f945c
Showing 31 changed files with 619 additions and 259 deletions.
@@ -0,0 +1,49 @@
import embed_anything
from embed_anything import EmbedData, EmbeddingModel, WhichModel, embed_query
from embed_anything.vectordb import Adapter
import os
from time import time
import numpy as np
import heapq


# Load the SPLADE sparse embedding model
model = EmbeddingModel.from_pretrained_hf(
    WhichModel.SparseBert, "prithivida/Splade_PP_en_v1"
)

sentences = [
    "The cat sits outside",
    "A man is playing guitar",
    "I love pasta",
    "The new movie is awesome",
    "The cat plays in the garden",
    "A woman watches TV",
    "The new movie is so great",
    "Do you like pizza?",
]

embeddings = embed_query(sentences, embeder=model)

# Stack the per-sentence embeddings into a (num_sentences, dim) matrix
embed_vector = np.array([e.embedding for e in embeddings])

# Pairwise dot products between all sentence embeddings
similarities = np.matmul(embed_vector, embed_vector.T)

# Get the top 5 similarities and show the two sentences and their similarity scores
# Flatten the upper triangle of the similarity matrix, excluding the diagonal
similarity_scores = [
    (similarities[i, j], i, j)
    for i in range(len(sentences))
    for j in range(i + 1, len(sentences))
]

# Get the top 5 similarity scores
top_5_similarities = heapq.nlargest(5, similarity_scores, key=lambda x: x[0])

# Print the top 5 similarities with sentences
for score, i, j in top_5_similarities:
    print(f"Score: {score:.2f} | {sentences[i]} | {sentences[j]}")
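The pair-ranking step in the example above can also be written without a Python-level double loop. The following is a minimal sketch, not part of this commit: `top_k_pairs` and the `toy` matrix are hypothetical names and data that stand in for the SPLADE embeddings so the snippet runs without downloading a model. It uses `np.triu_indices` to pick out the upper triangle (excluding the diagonal) and `np.argsort` to rank the scores.

```python
import numpy as np


def top_k_pairs(vectors: np.ndarray, k: int = 5):
    """Return the k highest-scoring (score, i, j) pairs with i < j."""
    # Pairwise dot products, same as np.matmul(v, v.T) in the example
    sims = vectors @ vectors.T
    # Index pairs of the strict upper triangle (k=1 skips the diagonal)
    rows, cols = np.triu_indices(len(vectors), k=1)
    scores = sims[rows, cols]
    k = min(k, scores.size)
    # Sort descending; a stable sort keeps the original pair order on ties
    order = np.argsort(-scores, kind="stable")[:k]
    return [(float(scores[n]), int(rows[n]), int(cols[n])) for n in order]


# Hypothetical toy vectors standing in for the sentence embeddings
toy = np.array(
    [
        [1.0, 0.0, 2.0, 0.0],
        [0.0, 2.0, 0.0, 0.0],
        [1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 1.0],
    ]
)

pairs = top_k_pairs(toy, k=2)
for score, i, j in pairs:
    print(f"Score: {score:.2f} | row {i} | row {j}")
```

For eight sentences the list comprehension with `heapq.nlargest` is perfectly adequate; the vectorized form mainly pays off when the number of sentences grows and the quadratic number of pairs makes the Python loop noticeable.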
@@ -471,3 +471,4 @@ class WhichModel(Enum):
     Jina = ("Jina",)
     Clip = ("Clip",)
     Colpali = ("Colpali",)
+    SparseBert = ("SparseBert",)
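Each member in the hunk above is assigned a one-element tuple (note the trailing comma), not a bare string. The sketch below shows what that means for a plain Python `Enum`. `WhichModelSketch` is a hypothetical local stand-in defined only for illustration; the actual `WhichModel` exposed by `embed_anything` may behave differently.

```python
from enum import Enum


class WhichModelSketch(Enum):
    # Hypothetical stand-in mirroring two members from the diff
    Colpali = ("Colpali",)
    SparseBert = ("SparseBert",)


member = WhichModelSketch.SparseBert
print(member.name)      # the member name as a string
print(member.value)     # the one-element tuple assigned above
print(member.value[0])  # the string inside the tuple

# Lookup by value must use the same tuple, not the bare string
assert WhichModelSketch(("SparseBert",)) is member
```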
@@ -1 +1 @@
-pub mod colpali;
+pub mod colpali;