[Feature request] Support Qwen-VL #4375

Closed
marko1616 opened this issue Jun 19, 2024 · 0 comments · May be fixed by #5343 or #4377
Labels
wontfix This will not be worked on

Comments

@marko1616
Contributor

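[Feature request] Support Qwen-VL

Related issues

Why Qwen-VL support is needed

  • On most image tasks, Qwen-VL outperforms LLaVA and Yi-VL (Qwen's datasets have consistently been better).
  • More general multimodal support? (GLM4 support is similar to this.)
  • Multiple images can be embedded.

Implementation

  • I have already written a small test (for the webui chat).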

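Image embedding approaches

  • Add a separate Role to the template, or modify the message format (this is how glm4v does it).
  • Give the template the ability to parse images embedded inline in the prompt (a general-purpose parser class).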
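
To make the second option concrete, here is a minimal sketch of what such a parser might look like. Everything in it is an assumption for illustration: the `InlineImageParser` and `Segment` names are hypothetical, and the `<img>…</img>` tag pair is only an assumed delimiter, not a confirmed part of any template in this project.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    kind: str   # "text" or "image"
    value: str  # the text itself, or the image path/URL


class InlineImageParser:
    """Hypothetical parser that splits a prompt into text and image segments."""

    def __init__(self, open_tag: str = "<img>", close_tag: str = "</img>") -> None:
        # The tag pair is an assumption; it is configurable so that other
        # models could use their own delimiters.
        self._pattern = re.compile(
            re.escape(open_tag) + r"(.*?)" + re.escape(close_tag), re.DOTALL
        )

    def parse(self, prompt: str) -> List[Segment]:
        segments: List[Segment] = []
        cursor = 0
        for match in self._pattern.finditer(prompt):
            # Keep any plain text that precedes this image tag.
            if match.start() > cursor:
                segments.append(Segment("text", prompt[cursor:match.start()]))
            segments.append(Segment("image", match.group(1).strip()))
            cursor = match.end()
        # Keep trailing text after the last image tag.
        if cursor < len(prompt):
            segments.append(Segment("text", prompt[cursor:]))
        return segments


parser = InlineImageParser()
segments = parser.parse("Compare <img>a.png</img> with <img>b.png</img>.")
```

Because the parser returns an ordered list of segments, a template could interleave any number of images with text, which is what embedding multiple images requires.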

Choosing the multimodal method

  • Change the VISION_MODELS entries from bool to str so that they record the exact modality embedding method.
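
One possible shape for this change is sketched below. Only the `VISION_MODELS` name comes from this issue; the dict layout, the helper functions, and the method labels (`"placeholder_token"`, `"inline_tag"`) are hypothetical and do not reflect the project's current code.

```python
from typing import Dict, Optional

# Hypothetical registry: model name -> name of its image-embedding method.
# A model that is absent from the dict is treated as text-only, which
# preserves the information the old boolean flag carried.
VISION_MODELS: Dict[str, str] = {}


def register_vision_model(name: str, method: str) -> None:
    """Record which embedding method a vision model uses (illustrative helper)."""
    VISION_MODELS[name] = method


def get_vision_method(name: str) -> Optional[str]:
    """Return the embedding method, or None for a text-only model."""
    return VISION_MODELS.get(name)


def is_vision_model(name: str) -> bool:
    """Equivalent of the old boolean check."""
    return name in VISION_MODELS


# The method labels below are made up for this example.
register_vision_model("LLaVA1.5", "placeholder_token")
register_vision_model("Qwen-VL", "inline_tag")
```

Callers that only need the old yes/no answer can keep using `is_vision_model`, while the template code can branch on the string returned by `get_vision_method`.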

An example

  • Original image
    Screenshot 2023-11-10 191555
  • Qwen-VL
    Screenshot 2024-06-19 133741
  • Yi-VL-6B
    Screenshot 2024-06-19 133612
  • LLaVA1.5
    image
@github-actions github-actions bot added the pending This problem is yet to be addressed label Jun 19, 2024
@hiyouga hiyouga added wontfix This will not be worked on and removed pending This problem is yet to be addressed labels Nov 2, 2024