question about the image understanding #25
Comments
Hi, thanks for your interest! Multiple images can be given in the input like this:
[
{
"type": "image",
"content": "image1.png"
},
{
"type": "image",
"content": "image2.png"
},
{
"type": "text",
"content": "your instruction"
}
]
Note that Anole's performance on multi-image input depends on the task, so results may differ from one task to another.
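For anyone building this list programmatically, here is a minimal sketch. It assumes each entry needs only the `type` and `content` keys shown above; the helper name `build_input` is hypothetical and not part of Anole.

```python
import json


def build_input(image_paths, instruction):
    """Return image entries (in order) followed by a single text entry.

    Hypothetical helper: it only mirrors the list format shown above.
    """
    entries = [{"type": "image", "content": path} for path in image_paths]
    entries.append({"type": "text", "content": instruction})
    return entries


if __name__ == "__main__":
    payload = build_input(["image1.png", "image2.png"], "your instruction")
    # Print the list as JSON so it can be pasted into an input file.
    print(json.dumps(payload, indent=2))
```

How the resulting JSON is passed to the model depends on the inference script you use, which this sketch does not cover.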
Thank you for your reply! But I have a problem when inputting multiple images: when the number of input images is four or more, the following error occurs: Traceback (most recent call last): I printed the shape of tokens and found that it is torch.Size([0]). What is the reason for this?
Probably because the default Anole context length is 4096 and each image takes 1026 tokens (1024 + boi + eoi). Four images already need 4 × 1026 = 4104 tokens, which exceeds 4096 before any text is added, so the model does not work properly with four or more input images.
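The arithmetic can be checked before running the model. This is a rough sketch that uses only the two numbers quoted in this thread (context length 4096, 1026 tokens per image). The function names are hypothetical, and it ignores any other special tokens and the room needed for generated output, so treat the result as an upper bound.

```python
CONTEXT_LENGTH = 4096        # default Anole context length, as stated in this thread
TOKENS_PER_IMAGE = 1024 + 2  # 1024 image tokens + boi + eoi


def prompt_tokens(num_images, num_text_tokens=0):
    """Estimate the prompt length for a given number of images and text tokens."""
    return num_images * TOKENS_PER_IMAGE + num_text_tokens


def fits_in_context(num_images, num_text_tokens=0):
    """True if the estimated prompt does not exceed the context length."""
    return prompt_tokens(num_images, num_text_tokens) <= CONTEXT_LENGTH


def max_images(num_text_tokens=0):
    """Largest number of images that still fits next to the given text tokens."""
    return max(0, (CONTEXT_LENGTH - num_text_tokens) // TOKENS_PER_IMAGE)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, prompt_tokens(n), fits_in_context(n))
    print("max images with no text:", max_images())
```

With these numbers three images need 3078 tokens and fit, while four need 4104 and do not.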
Is the number of tokens per image a parameter that the user can set, or is it fixed?
I'm sorry, it's fixed.
I have another question. When I use the model for batch image understanding, the output is empty.
Does this model support multiple image inputs?