[FEATURE] Add a Node/Document Loader for Image Embedding Using Multimodal LLMs #3549

Open
luminaes opened this issue Nov 21, 2024 · 0 comments
Labels: enhancement (New feature or request)

Describe the feature you'd like:
I would like to request the addition of a node or document loader in Flowise that enables the embedding of images using a multimodal large language model (LLM). This feature would allow users to process and embed images alongside textual data, expanding the capabilities of the platform for handling multimodal datasets.

Use Cases:

- Enhancing search and retrieval by indexing and querying images with semantic embeddings.
- Enabling richer applications that combine image and text data, such as multimedia question answering or document analysis.
- Supporting workflows where visual information is a key component, such as analyzing diagrams, infographics, or scanned documents.

Additional context:
It would be helpful to have support for commonly used image embedding models such as CLIP, BLIP, amazon.titan-embed-image-v1, or similar state-of-the-art multimodal models. Integration with existing Flowise workflows, such as embedding chaining or retrieval augmentation, would be a key aspect of this feature.
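
To make the request concrete, here is a minimal sketch of what such a loader could call under the hood, assuming the amazon.titan-embed-image-v1 request/response shape used by Amazon Bedrock (base64 `inputImage` in, `embedding` array out). The region, the 1024-dimension output, and the standalone function shape are placeholder assumptions, not a proposed implementation, and the wiring into an actual Flowise node is intentionally left out:

```typescript
// Sketch: embed an image with amazon.titan-embed-image-v1 via AWS Bedrock.
// Assumes the Titan Multimodal Embeddings request/response shape
// ({ inputImage, inputText?, embeddingConfig } -> { embedding: number[] });
// please verify against the current Bedrock docs before relying on it.
import { readFile } from "node:fs/promises";
import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

async function embedImage(path: string, caption?: string): Promise<number[]> {
  // Titan expects the image as a base64-encoded string.
  const inputImage = (await readFile(path)).toString("base64");
  const response = await client.send(
    new InvokeModelCommand({
      modelId: "amazon.titan-embed-image-v1",
      contentType: "application/json",
      accept: "application/json",
      body: JSON.stringify({
        inputImage,
        ...(caption ? { inputText: caption } : {}),
        embeddingConfig: { outputEmbeddingLength: 1024 }, // assumed supported size
      }),
    })
  );
  const payload = JSON.parse(new TextDecoder().decode(response.body));
  return payload.embedding as number[];
}

// Cosine similarity, so image vectors could be ranked against a query vector
// the same way text embeddings are ranked in an existing retrieval chain.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A CLIP- or BLIP-backed variant would presumably keep the same `embedImage` interface and only swap out the backend call.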

I’m happy to provide further details or examples if needed!

Thanks!

HenryHengZJ added the enhancement label on Nov 23, 2024