Describe the feature you'd like:
I would like to request the addition of a node or document loader in Flowise that enables the embedding of images using a multimodal large language model (LLM). This feature would allow users to process and embed images alongside textual data, expanding the capabilities of the platform for handling multimodal datasets.
Use Cases:
- Enhancing search and retrieval functionality by indexing and querying images with semantic embeddings.
- Enabling richer applications that combine image and text data, such as multimedia question answering or document analysis.
- Supporting workflows where visual information is a key component, such as analyzing diagrams, infographics, or scanned documents.
Additional context:
It would be helpful to have support for commonly used image embedding models such as CLIP, BLIP, or amazon.titan-embed-image-v1, as well as similar state-of-the-art multimodal models. Integration with existing Flowise workflows, such as embedding chaining or retrieval augmentation, would be a key aspect of this feature.
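For concreteness, here is a rough sketch of the kind of call such a node might make under the hood, using the AWS SDK for JavaScript and the Titan multimodal embedding model. This is illustrative only: the `embedImage` helper name, the region, and the output length are placeholders, not a proposal for the actual Flowise node API.

```typescript
import { readFileSync } from 'node:fs';
import {
  BedrockRuntimeClient,
  InvokeModelCommand,
} from '@aws-sdk/client-bedrock-runtime';

// Sketch only: embed a local image with amazon.titan-embed-image-v1.
const client = new BedrockRuntimeClient({ region: 'us-east-1' });

async function embedImage(path: string): Promise<number[]> {
  // Titan's multimodal embedding model accepts the image as base64.
  const inputImage = readFileSync(path).toString('base64');

  const response = await client.send(
    new InvokeModelCommand({
      modelId: 'amazon.titan-embed-image-v1',
      contentType: 'application/json',
      accept: 'application/json',
      body: JSON.stringify({
        inputImage,
        embeddingConfig: { outputEmbeddingLength: 1024 },
      }),
    })
  );

  // The response body is a Uint8Array of JSON containing an `embedding` array.
  const parsed = JSON.parse(new TextDecoder().decode(response.body));
  return parsed.embedding as number[];
}
```

The resulting vector could then be stored alongside text embeddings in any of the vector stores Flowise already supports, which is what would enable the cross-modal retrieval use cases listed above.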
I’m happy to provide further details or examples if needed!
Thanks!