Poor Image Recognition Capabilities #3
Comments
Hello, thanks for your feedback. We have tested some similar cases. In the second case, the background is entirely black, which does not appear in our training data (our training data contains natural scenes), so the model thinks the image has a brightness issue. The confusing part is that the model describes the problem as brightening rather than darkening. Can you share this image with me through [email protected]? For the first and third cases, we think the distortion judgement is reasonable (i.e., the saturation in the first image is not so good, and the color in the third image is great). However, our model is not trained on cartoon images, so high-level recognition may be a little hard. We will also consider adding cartoon images in our next release.
Sure, I'll be mailing these to you. The issue can be explained by the fact that similar images were not part of the training data. Is this also the reason for poor text recognition? For example, in both the first and second cases, the model doesn't seem to recognize the text elements and consistently mistakes them for something else. If so, then including such images in the training data will also enhance the model's performance on images with text elements.
Yes, you are right. Our training data currently focuses on natural scenes and does not contain text content. If we find with this release that text is important to many users, we will construct corresponding datasets to address this problem in the next release, which is scheduled for around Sept / Oct.
We have tested the second case with a screenshot of your image. The input question is the same. However, the response is different, as follows. The response does not contain
I have mailed you the images. I didn't tweak any parameters, and directly used the gradio app and the prompt shown in the above image for testing.
Hello Team,
Thanks for releasing the weights. I tested the model with some of these examples but the quality seems to be very bad and nowhere near the quoted examples in the paper. Is this level of performance expected from the latest weights release or am I doing something wrong here?