Hello! Thanks for your work! Since LLaVA has generation ability, I would like to know whether this model can produce fused image-text feature embeddings from an image alone.
Thanks for your time and help!
Best regards!
Did you mean embedding text from an image, such as OCR?
We have explored similar approaches in our paper, such as rendering captions onto images in Section 4.2. However, we have not attempted to represent both the image and text using a single image.