Support specifying a similarity threshold #81

Merged: 10 commits into wagtail:main from add-similarity-threshold on Aug 12, 2024

@brylie brylie commented Jul 30, 2024

Related to #21

Add support for specifying a similarity threshold in vector search methods.

  • VectorIndex Class: Add similarity_threshold parameter to query, find_similar, and search methods in src/wagtail_vector_index/storage/base.py.
  • PgvectorIndexMixin Class: Update get_similar_documents and aget_similar_documents methods to accept similarity_threshold parameter and filter results based on it in src/wagtail_vector_index/storage/pgvector/provider.py.
  • QdrantIndexMixin Class: Update get_similar_documents method to accept similarity_threshold parameter and filter results based on it in src/wagtail_vector_index/storage/qdrant/provider.py.
  • WeaviateIndexMixin Class: Update get_similar_documents method to accept similarity_threshold parameter and filter results based on it in src/wagtail_vector_index/storage/weaviate/provider.py.
  • NumpyIndexMixin Class: Update get_similar_documents method to accept similarity_threshold parameter and filter results based on it in src/wagtail_vector_index/storage/numpy/provider.py.
  • Documentation: Update docs/vector-indexes.md to include information on the new similarity_threshold parameter and provide examples of its usage.
  • Tests: Add tests for the new similarity_threshold parameter in query, find_similar, and search methods in tests/test_index.py.
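As a rough illustration of the filtering behaviour described above, here is a minimal sketch of threshold filtering over cosine similarity using plain NumPy. It is not the code from this PR: the free function `get_similar_documents`, its `limit` parameter, the default threshold of `0.0`, and the `(index, similarity)` return shape are all assumptions made for the example.

```python
import numpy as np


def get_similar_documents(query_vector, doc_vectors, *, limit=5, similarity_threshold=0.0):
    """Return (index, similarity) pairs, most similar first.

    Hypothetical stand-in for a provider method; not the package's code.
    Results whose cosine similarity is below `similarity_threshold` are dropped.
    """
    query = np.asarray(query_vector, dtype=float)
    docs = np.asarray(doc_vectors, dtype=float)

    # Cosine similarity between the query and every stored vector.
    # Zero-length vectors get a divisor of 1.0 so they score 0 instead of NaN.
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(query)
    similarities = docs @ query / np.where(norms == 0, 1.0, norms)

    # Rank by descending similarity, then keep only results at or above the threshold.
    order = np.argsort(-similarities)
    kept = [
        (int(i), float(similarities[i]))
        for i in order
        if similarities[i] >= similarity_threshold
    ]
    return kept[:limit]


if __name__ == "__main__":
    docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    # Only documents at least 0.5 similar to the query are returned.
    print(get_similar_documents([1.0, 0.0], docs, similarity_threshold=0.5))
```

Applying the threshold before the `limit` cut means a query can return fewer than `limit` results, or none at all, when nothing is similar enough.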

For more details, open the Copilot Workspace session.

brylie added 2 commits July 30, 2024 20:49
@brylie brylie marked this pull request as draft July 30, 2024 18:05

brylie commented Jul 30, 2024

I've marked this as a draft to indicate that it is work-in-progress, mainly due to failing tests. Early feedback is more than welcome :-)

@brylie brylie marked this pull request as ready for review July 30, 2024 23:23

tomusher commented Aug 5, 2024

This is looking great, thanks @brylie! I'll do a more thorough review this week and we can get this in.


@tomusher tomusher left a comment

Looks great to me, and I appreciate all the effort that went into the documentation. I've left a couple of minor comments.

Two review comments on tests/test_index.py (outdated, resolved)
@brylie brylie requested a review from tomusher August 9, 2024 19:23
@tomusher tomusher merged commit cd015ec into wagtail:main Aug 12, 2024
7 checks passed
@tomusher

Thanks @brylie!

@brylie brylie deleted the add-similarity-threshold branch August 12, 2024 13:17