Can't get the colors of three traffic lights in one image using your EvoVLM #10

Open
yiyiwwang opened this issue May 30, 2024 · 0 comments

Hello, I tested the image from Table 6, Example 3 of your paper with your model EvoVLM-7b. The answer I got is not the same as the good results for (A), (B), and (C) shown in the paper.
image

image

I have two questions:

  1. The answer does not seem to recognize the three sub-images (A), (B), and (C), and does not give three colors. Why?
  2. Why does the answer repeat the question and the response many times?

Thank you very much.
