output multiple coordinate points #961

Open
sms-s opened this issue Mar 17, 2025 · 4 comments

@sms-s
sms-s commented Mar 17, 2025

How should I prompt Qwen2.5-VL to output multiple coordinate points?
With this prompt: prompt = "point to the rolling pin on the far side of the table, output its coordinates in XML format object", it outputs only a single coordinate point.

@xin-li-67

Try the sample code in the spatial understanding notebook in the cookbooks folder.

@sms-s
Author

sms-s commented Mar 19, 2025

I'm following the format in the cookbooks, but the model outputs only a single point. I want multiple coordinate points that point to the object. Do you have any other suggestions?

@xin-li-67

Well, I've never tried the points example myself. For bounding-box coordinates, though, I do get multiple results.

@HumanZhong

@sms-s
Hi, you may try this prompt:

Image

If the model still outputs only one point, you can also try:

  1. Add words such as "every" or "one by one" to your prompt.
  2. Try asking for the output in JSON format.
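
The two suggestions above could be combined roughly as follows. This is only a sketch: `build_multi_point_prompt` and `parse_points` are hypothetical helpers, and the `point_2d` / `label` keys are an assumed output schema that the model is asked to follow, not something guaranteed by the model. The wording of the prompt is likewise just one possibility.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence that may wrap the model's JSON reply


def build_multi_point_prompt(target: str) -> str:
    """Build a prompt that asks for every instance, one by one, as JSON."""
    return (
        f"Point to every {target} in the image one by one. "
        "Output a JSON list in which each item has the form "
        '{"point_2d": [x, y], "label": "<name>"}.'
    )


def parse_points(response: str) -> list[tuple[float, float, str]]:
    """Extract (x, y, label) tuples from a JSON reply (assumed schema)."""
    # Unwrap an optional Markdown code fence around the JSON payload.
    pattern = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(response)
    payload = match.group(1) if match else response
    items = json.loads(payload)
    if isinstance(items, dict):  # tolerate a single object instead of a list
        items = [items]
    points = []
    for item in items:
        x, y = item["point_2d"]
        points.append((float(x), float(y), item.get("label", "")))
    return points


# Example with a made-up reply; real model output may differ.
sample_reply = (
    "Here are the points:\n"
    + FENCE
    + 'json\n[{"point_2d": [10, 20], "label": "rolling pin"},'
    + ' {"point_2d": [30, 40], "label": "rolling pin"}]\n'
    + FENCE
)
print(build_multi_point_prompt("rolling pin"))
print(parse_points(sample_reply))
```

If the parsed list still contains a single point, that suggests the prompt wording rather than the parsing is the limiting factor.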
