Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for image embeddings #1753

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

collindutter
Copy link
Member

@collindutter collindutter commented Feb 20, 2025

Describe your changes

Sorry for big-ish PR, lots of overlapping changes here:

  1. Add support for embedding images in Embeddings Drivers. Also refactors Embedding Drivers to have a single embed method rather than type-specific methods (embed_string, embed_image, etc).
    a. Add support for embedding images in Amazon Bedrock Titan Embedding Driver.
    b. Add support for embedding images in VoyageAi Embedding Driver.
  2. Add support for querying/upserting with Vector Store Drivers. Vector Store Driver must be configured with a multi-modal Embedding Driver. Also refactors Vector Store Drivers to add single upsert method rather than type-specific methods (upsert_text, upsert_text_artifact, etc).
    a. Add support to Local Vector Store Driver to save Entrys with Image Artifacts into the persist_file.

Issue ticket number and link

Closes #1729

@collindutter collindutter force-pushed the feature/image-embeddings-vector branch from 08ccde7 to 79e7dcf Compare February 20, 2025 00:28
Copy link

codecov bot commented Feb 20, 2025

Codecov Report

Attention: Patch coverage is 96.15385% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...riptape/drivers/vector/base_vector_store_driver.py 91.30% 0 Missing and 2 partials ⚠️
...riptape/drivers/embedding/base_embedding_driver.py 92.85% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

@collindutter collindutter force-pushed the feature/image-embeddings-vector branch 6 times, most recently from 4459f3d to 159a25c Compare February 24, 2025 23:02
@collindutter collindutter self-assigned this Feb 25, 2025
@collindutter collindutter force-pushed the feature/image-embeddings-vector branch 6 times, most recently from 7d925d3 to bc85088 Compare February 28, 2025 22:36
@collindutter collindutter changed the title Feature/image embeddings vector feat: add support for image embeddings Feb 28, 2025
@collindutter collindutter force-pushed the feature/image-embeddings-vector branch from bc85088 to bc5d697 Compare February 28, 2025 22:46
@collindutter collindutter requested review from emjay07 and a team February 28, 2025 22:51
@collindutter collindutter marked this pull request as ready for review February 28, 2025 22:51
- [embed_string()](../../reference/griptape/drivers/embedding/base_embedding_driver.md#griptape.drivers.embedding.base_embedding_driver.BaseEmbeddingDriver.embed_string) for any string.

You can optionally provide a [Tokenizer](../misc/tokenizers.md) via the [tokenizer](../../reference/griptape/drivers/embedding/base_embedding_driver.md#griptape.drivers.embedding.base_embedding_driver.BaseEmbeddingDriver.tokenizer) field to have the Driver automatically chunk the input text to fit into the token limit.
Embeddings in Griptape are multidimensional representations of text and image data. Embeddings carry semantic information, which makes them useful for extracting relevant chunks from large bodies of text for search and querying.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would "text or image data" be better as they are embedding separately from each other?

In the last sentence, we should include "images" somehow so:

which makes them useful for extracting relevant information from large bodies of text or image data for search and querying.

@@ -40,18 +36,80 @@ def to_artifact(self) -> BaseArtifact:

def upsert_text_artifacts(
self,
artifacts: list[TextArtifact] | dict[str, list[TextArtifact]],
artifacts: list[TextArtifact] | dict[str, list[TextArtifact] | list[ImageArtifact]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we want to support ImageArtifact in a method called upsert_text_artifacts? if we do keep it, should the first type also be list[TextArtifact] | list[ImageArtifact]

@overload
def upsert_collection(
self,
artifacts: list[TextArtifact],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we support both text and image here?


def upsert_collection(
self,
artifacts: list[TextArtifact] | dict[str, list[TextArtifact] | list[ImageArtifact]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same commont on first type, or list[ImageArtifact]

self,
artifact: TextArtifact,
value: str | TextArtifact | ImageArtifact,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we are allowing users to input str types, should we also allow bytes if a user has a loaded image already in memory they can upsert that may not be wrapped as an ImageArtifact?

@@ -97,6 +98,8 @@ def query(

Performs a query on the Knowledge Base and returns Artifacts with close vector proximity to the query, optionally filtering to only those that match the provided filter(s).
"""
if isinstance(query, BaseArtifact):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why allow the type, just to raise an error later? what if someone tries to pass a TextArtifact? should the type be ImageArtifact instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Image Embedding Drivers
2 participants