[Request] Images do not seem to be supported for Ollama's llama3.2-vision #4642
Comments
Thank you for raising an issue. We will look into the matter and get back to you as soon as possible.
Is there any setting for |
{
"messages": [
{
"content": "图片中有什么?\n\n\n<files_info>\n<images>\n<images_docstring>here are user upload images you can refer to</images_docstring>\n<image name=\"截屏2024-05-22 17.52.16.png\" url=\"https://s3-lobechat.tabun.pro/files/480846/875b55ec-923a-48ce-b1e3-0730c4a92794.png\"></image>\n</images>\n\n</files_info>",
"role": "user"
}
],
"model": "llama3.2-vision",
"options": {
"frequency_penalty": 0,
"presence_penalty": 0,
"temperature": 0.35,
"top_p": 1
},
"stream": true
}
Setting the environment variable LLM_VISION_IMAGE_USE_BASE64=1 makes no difference; the request body is still the same.
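Judging from #3888, converting image URLs to base64 for Ollama does not seem to be implemented yet, although some Ollama models can still recognize images.
This appears to be an XML (Extensible Markup Language) file that contains information about a single image. Here's a breakdown of the contents: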
In summary, this XML snippet describes a single image file with its filename and URL. The context appears to be a web application or API that handles file uploads, as hinted by the |
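🥰 Feature description
Ollama 0.4.0 added support for the llama3.2-vision model, which can recognize images. https://ollama.com/blog/llama3.2-vision
I tried calling the llama3.2-vision model from LobeChat v1.28.4 and found that images are not handled correctly.
The request body can be seen in the logs:
It looks like the image is embedded in content as text, whereas Ollama's llama3.2-vision model may expect images to be passed differently.
I hope this can be supported, thanks 🙏!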
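To illustrate the observation above, here is a minimal Python sketch that pulls the image URLs back out of a `content` string shaped like the one in the logged request. It only shows that the image reaches the model as plain text markup. The function name `split_content` and both regular expressions are hypothetical and written against this one logged sample; they are not taken from LobeChat's code.

```python
import re
from typing import List, Tuple

# Hypothetical patterns, matched against the markup seen in the logged request.
IMAGE_TAG = re.compile(r'<image name="(?P<name>[^"]*)" url="(?P<url>[^"]*)"></image>')
FILES_INFO = re.compile(r"\s*<files_info>.*?</files_info>\s*", re.DOTALL)


def split_content(content: str) -> Tuple[str, List[str]]:
    """Return the user's text and the image URLs embedded in the content string."""
    urls = [match.group("url") for match in IMAGE_TAG.finditer(content)]
    # Drop the whole <files_info> block so only the user's own text remains.
    text = FILES_INFO.sub("", content).strip()
    return text, urls


# The content string from the logged request body.
logged_content = (
    "图片中有什么?\n\n\n<files_info>\n<images>\n"
    "<images_docstring>here are user upload images you can refer to</images_docstring>\n"
    '<image name="截屏2024-05-22 17.52.16.png" '
    'url="https://s3-lobechat.tabun.pro/files/480846/875b55ec-923a-48ce-b1e3-0730c4a92794.png">'
    "</image>\n</images>\n\n</files_info>"
)

text, urls = split_content(logged_content)
print(text)
print(urls)
```

The model only ever sees the URL as part of the prompt text, which would explain a reply that describes an XML snippet instead of the picture.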
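🧐 Proposed solution
According to the Ollama documentation, the request should look something like the following:
That is, the image needs to be base64-encoded and placed in images, which is an array.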
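As a rough sketch of the shape described here, the Python snippet below builds a chat request body in which each image is base64-encoded and listed under `images` on the user message. The helper name `build_ollama_chat_payload` is hypothetical, the placeholder bytes are not a real image, and the field layout is assumed from Ollama's chat API documentation rather than from LobeChat's code.

```python
import base64
import json
from typing import Any, Dict, List


def build_ollama_chat_payload(
    model: str, prompt: str, image_bytes_list: List[bytes], stream: bool = True
) -> Dict[str, Any]:
    """Build a chat request body with base64-encoded images on the user message."""
    # Each image becomes one base64 string; "images" is always an array.
    images = [base64.b64encode(data).decode("ascii") for data in image_bytes_list]
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": prompt, "images": images},
        ],
        "stream": stream,
    }


# Placeholder bytes standing in for a downloaded image file (assumption).
fake_image = b"\x89PNG placeholder"
payload = build_ollama_chat_payload("llama3.2-vision", "图片中有什么?", [fake_image])
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

In this layout the prompt text stays in `content` and no image markup is mixed into it; a client would need to fetch the uploaded file and encode it before sending.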
📝 Additional information
No response