[FEATURE] Add a Node/Document Loader for Image Embedding Using Multimodal LLMs #3549

Open
luminaes opened this issue Nov 21, 2024 · 0 comments
Labels: enhancement (New feature or request)

Describe the feature you'd like:
I would like to request the addition of a node or document loader in Flowise that enables the embedding of images using a multimodal large language model (LLM). This feature would allow users to process and embed images alongside textual data, expanding the capabilities of the platform for handling multimodal datasets.

Use Cases:

- Enhancing search and retrieval by indexing and querying images with semantic embeddings.
- Enabling richer applications that combine image and text data, such as multimedia question answering or document analysis.
- Supporting workflows where visual information is a key component, such as analyzing diagrams, infographics, or scanned documents.

Additional context:
It would be helpful to have support for commonly used image embedding models such as CLIP, BLIP, amazon.titan-embed-image-v1, or similar state-of-the-art multimodal models. Integration with existing Flowise workflows, such as embedding chaining or retrieval augmentation, would be a key aspect of this feature.
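
To make the request concrete, here is a minimal sketch of what such a loader could call under the hood, assuming the amazon.titan-embed-image-v1 request/response shape used by Amazon Bedrock (base64 `inputImage` in, `embedding` array out). The region, the 1024-dimension output, and the standalone function shape are placeholder assumptions, not a proposed implementation, and the wiring into an actual Flowise node is intentionally left out:

```typescript
// Sketch: embed an image with amazon.titan-embed-image-v1 via AWS Bedrock.
// Assumes the Titan Multimodal Embeddings request/response shape
// ({ inputImage, inputText?, embeddingConfig } -> { embedding: number[] });
// please verify against the current Bedrock docs before relying on it.
import { readFile } from "node:fs/promises";
import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

async function embedImage(path: string, caption?: string): Promise<number[]> {
  // Titan expects the image as a base64-encoded string.
  const inputImage = (await readFile(path)).toString("base64");
  const response = await client.send(
    new InvokeModelCommand({
      modelId: "amazon.titan-embed-image-v1",
      contentType: "application/json",
      accept: "application/json",
      body: JSON.stringify({
        inputImage,
        ...(caption ? { inputText: caption } : {}),
        embeddingConfig: { outputEmbeddingLength: 1024 }, // assumed supported size
      }),
    })
  );
  const payload = JSON.parse(new TextDecoder().decode(response.body));
  return payload.embedding as number[];
}

// Cosine similarity, so image vectors could be ranked against a query vector
// the same way text embeddings are ranked in an existing retrieval chain.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A CLIP- or BLIP-backed variant would presumably keep the same `embedImage` interface and only swap out the backend call.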

I’m happy to provide further details or examples if needed!

Thanks!

HenryHengZJ added the enhancement label on Nov 23, 2024