Enhance (Qdrant) hybrid/sparse search support #15483
base: main
Conversation
Force-pushed from f205b0f to 5f87d89
Force-pushed from 5f87d89 to 3e83297
I rebased onto main and noticed that a previous commit (78f9a11) introduced somewhat breaking changes to the qdrant vector_store, which I resolved in 7a7f684. Why I consider the changes breaking:
Is there anything I can do to help integrate this PR within a foreseeable timespan? Or are there specific reasons that keep you from doing so? Please let me know if that is the case. I'm happy to continue working on it if there is something missing!
Hi @logan-markewich, I think you reviewed my last PR on llama-index. Could you maybe take a look at this one and let me know what you think? I believe that, aside from the improvement in code quality, the possibility to avoid duplicate computation really adds some value! If you have other concerns, please do share them. Thanks and kind regards
Force-pushed from 150e219 to 6355a32
Force-pushed from 6355a32 to d038e05
I'm not sure why the tests here fail after rebasing. It seems that some dependencies are not available, but the failing tests are not relevant to the changes in this PR.
Force-pushed from 59dbadb to 895139f
Force-pushed from 23c6f87 to 008732b
Force-pushed from 008732b to b0173c9
Hi @logan-markewich, this PR keeps growing :D I finally found some time to write unit tests. In the process I also had to fix the unit tests for the docling reader and node_parser, which failed due to the newly introduced `sparse_embedding` field. If you, for any reason, don't plan to merge this PR, then please let me know so that I can stop rebasing it over and over again :)
Force-pushed from cd2bdb2 to f687386
Force-pushed from 3a7eaca to 45943f2
…ry_fn will be set)
…e embeddings directly to the retriever
The `_sparse_doc_fn` and `_sparse_query_fn` are set in `__init__` IF `enable_hybrid==True`.
Given that the `_sparse_doc_fn` does not return empty lists, we can safely remove the check. It might in fact be safer to let it crash here if the output of the `_sparse_doc_fn` is of an unexpected format.
…uery} # Conflicts: # llama-index-integrations/vector_stores/llama-index-vector-stores-qdrant/tests/test_vector_stores_qdrant.py
Force-pushed from 45943f2 to 2162769
Description
This PR gives the developer of RAG applications more explicit control over sparse embeddings. The motivation is provided in this issue, and the PR directly adapts the `QdrantVectorStore`; other `VectorStore`s that support hybrid retrieval will need to be adapted separately.
This PR aims at enhancing the usability of the Qdrant hybrid search functionality by adding sparse embedding fields to both the `VectorStoreQuery` and the `QueryBundle` and using the provided sparse embedding rather than recomputing it.
Additionally, the `sparse_embedding` field is added to the `BaseNode` class and, if set, is used by the `QdrantVectorStore.add` method rather than recomputing it.
Along with the described changes, a larger refactoring reduces code duplication for `QdrantVectorStore.{query, aquery}` and enhances readability. The `add` method got a minor rework, too.
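To make the proposal concrete, here is a minimal sketch, assuming the fields introduced by this PR (`sparse_embedding` on `TextNode`/`BaseNode` and on `VectorStoreQuery`); the `(indices, values)` representation of the sparse embedding and the toy encoder are illustrative assumptions, not part of the PR text:

```python
# Minimal sketch, not the PR's own code. It assumes the fields added by this
# PR (`sparse_embedding` on TextNode/BaseNode and on VectorStoreQuery); the
# (indices, values) pair used below is an assumed representation based on the
# existing sparse encoder callables.
from qdrant_client import QdrantClient

from llama_index.core.schema import TextNode
from llama_index.core.vector_stores.types import VectorStoreQuery, VectorStoreQueryMode
from llama_index.vector_stores.qdrant import QdrantVectorStore


def toy_sparse_encoder(texts):
    # Stand-in for a real sparse encoder: batch of texts -> (indices, values).
    batch_indices, batch_values = [], []
    for text in texts:
        vocab = sorted({hash(tok) % 1000 for tok in text.lower().split()})
        batch_indices.append(vocab)
        batch_values.append([1.0] * len(vocab))
    return batch_indices, batch_values


vector_store = QdrantVectorStore(
    collection_name="demo",
    client=QdrantClient(location=":memory:"),
    enable_hybrid=True,
    sparse_doc_fn=toy_sparse_encoder,
    sparse_query_fn=toy_sparse_encoder,
)

# The precomputed embeddings travel with the node, so `add` can reuse them
# instead of calling the sparse encoder again.
node = TextNode(
    text="hybrid search combines dense and sparse retrieval",
    embedding=[0.1, 0.2, 0.3, 0.4],
    sparse_embedding=([12, 345], [0.8, 0.4]),  # field proposed by this PR
)
vector_store.add([node])

# Same idea on the query side: both embeddings are computed once by the caller
# and handed to the store instead of being recomputed internally.
indices, values = toy_sparse_encoder(["what is hybrid search?"])
query = VectorStoreQuery(
    query_str="what is hybrid search?",
    query_embedding=[0.1, 0.2, 0.3, 0.4],
    sparse_embedding=(indices[0], values[0]),  # field proposed by this PR
    mode=VectorStoreQueryMode.HYBRID,
    similarity_top_k=1,
    hybrid_top_k=1,
)
print(vector_store.query(query))
```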
Use Case
When working with (a subclass of) the `QueryFusionRetriever` and multiple retrievers that use the same embedding models, the query embedding would be calculated more than once per query string. Caching won't help, as the embeddings are generally calculated asynchronously. To avoid the resource overhead of duplicate computation, passing precomputed embeddings was already possible for dense embeddings.
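A rough sketch of that scenario, assuming the `sparse_embedding` field on `QueryBundle` proposed here; `dense_embed`, `sparse_encode`, and `retrievers` are hypothetical placeholders for the application's actual embedding models and retriever setup:

```python
# Sketch only: the query embeddings are computed once and reused by every
# retriever instead of each retriever recomputing them.
# `dense_embed`, `sparse_encode`, and `retrievers` are hypothetical placeholders.
from llama_index.core import QueryBundle

query_str = "how do I enable hybrid search?"
bundle = QueryBundle(
    query_str=query_str,
    embedding=dense_embed(query_str),           # dense: already supported today
    sparse_embedding=sparse_encode(query_str),  # sparse: added by this PR
)

results_per_retriever = [retriever.retrieve(bundle) for retriever in retrievers]
```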
Implementation
The commits within this PR should be self-explanatory and well-structured. The main changes are:
- Use `self._{a}client.search_batch` also in the default case, to enable building the requests to be sent step by step and thus reduce the number of cases
- Avoid `await self.asparse_vector_name()` by precomputing it in `__init__` (see the sketch after this list)
- Pass `hybrid_top_k` from the `VectorIndexRetriever` kwargs to the `QueryBundle` (as done with `_sparse_top_k`)
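The second point can be illustrated schematically (this is not the actual `QdrantVectorStore` code; class and attribute names are placeholders): resolve the value once in `__init__` and let async code paths read the cached attribute instead of awaiting a lookup on every call.

```python
# Schematic illustration of "precompute in __init__ instead of awaiting
# repeatedly"; this is NOT the actual QdrantVectorStore implementation.
import asyncio


class HybridStoreSketch:
    def __init__(self, legacy_collection: bool = False):
        # Resolved once at construction time instead of via an async lookup
        # on every query (the concrete names here are placeholders).
        self.sparse_vector_name = "sparse-old" if legacy_collection else "sparse-new"

    async def aquery(self, query_str: str) -> str:
        # Async paths simply read the cached attribute.
        return f"searching '{query_str}' against vector '{self.sparse_vector_name}'"


print(asyncio.run(HybridStoreSketch().aquery("hybrid?")))
```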
New Package?
Did I fill in the `tool.llamahub` section in the `pyproject.toml` and provide a detailed README.md for my new integration or package?
Version Bump?
Did I bump the version in the `pyproject.toml` file of the package I am updating? (Except for the `llama-index-core` package)
Bumping the versions of
- llama-index-core
- llama-index-vector-stores-qdrant
- llama-index-node-parser-docling
- llama-index-readers-docling
remains a TODO for now (see below).
Type of Change
In principle, this is a change to the `QdrantVectorStore`, but as the additional fields in `QueryBundle` and `VectorStoreQuery` are required, a parallel version bump of llama-index-core (and its specification in the requirements) will be necessary.
Backwards compatibility could be achieved easily by checking whether the provided `QueryBundle` has the necessary field, but I am unsure if this is the best approach.
How Has This Been Tested?
I added a test that directly ensures that the getter function `_get_sparse_embedding_query` reuses the provided sparse embedding.
I furthermore implemented unit tests for all 6 variants of `{query, aquery} x {dense, sparse, hybrid}` by setting the sparse and dense embeddings in a `QdrantVectorStore` with `enable_hybrid=True` and running queries with identically set `(sparse_)embeddings`.
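For illustration, a pytest-style sketch of how such a matrix might be exercised; the `hybrid_vector_store` fixture, the `make_query` helper, and the sparse-embedding representation are assumptions, not the PR's actual test code:

```python
# Sketch of the "{query, aquery} x {dense, sparse, hybrid}" matrix; the
# `hybrid_vector_store` fixture (a QdrantVectorStore with enable_hybrid=True
# and one indexed node) and the sparse-embedding format are assumptions.
import pytest

from llama_index.core.vector_stores.types import VectorStoreQuery, VectorStoreQueryMode

MODES = [
    VectorStoreQueryMode.DEFAULT,  # dense
    VectorStoreQueryMode.SPARSE,
    VectorStoreQueryMode.HYBRID,
]


def make_query(mode):
    # Both embeddings are set up front so the store never has to recompute them.
    return VectorStoreQuery(
        query_str="hello world",
        query_embedding=[0.1, 0.2, 0.3, 0.4],
        sparse_embedding=([1, 2], [0.5, 0.5]),  # field proposed by this PR
        mode=mode,
        similarity_top_k=1,
    )


@pytest.mark.parametrize("mode", MODES)
def test_query_uses_provided_embeddings(hybrid_vector_store, mode):
    assert len(hybrid_vector_store.query(make_query(mode)).nodes) == 1


@pytest.mark.asyncio
@pytest.mark.parametrize("mode", MODES)
async def test_aquery_uses_provided_embeddings(hybrid_vector_store, mode):
    result = await hybrid_vector_store.aquery(make_query(mode))
    assert len(result.nodes) == 1
```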
Suggested Checklist:
- I ran `make format; make lint` to appease the lint gods