Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The results of the model evaluation reproduction are incorrect. 模型评测复现的结果不对 #982

Open
Aslan-yulong opened this issue Mar 21, 2025 · 0 comments

Comments

@Aslan-yulong
Copy link

The results of the model evaluation reproduction are incorrect. I used the evaluation set OCRBenchV2 to evaluate the qwen2.5vl series and found that the scores are different from those presented in your paper and on the homepage. The scores I obtained are not as high as yours, and the difference is quite significant.
I conducted the evaluation using the default parameters after deploying the model on the vllm server. May I ask what your evaluation environment is? Why is there such a large discrepancy in the scores?
模型评测复现的结果不对,我使用评测集OCRBenchV2对qwen2.5vl系列进行了评测,发现结果与您在paper和主页展示的分数不同,没有您这里得到的分数高,差别相当大。
我是使用vllm sever部署模型之后按默认参数进行评测的,请问您这边的评测环境是什么?为什么分数差异这么大呢?

Image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant