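First of all, thanks to the CogVLM authors for their contribution. This is really great work!
I found a bug while running inference with CogAgent.
My input image is shown below:
The query was "Describe the image in detail".
CogAgent performs very well: the model can recognize and describe even very small text in the image. However, when text in the image is repeated (in my example, there is a repeated run of NATURE at the bottom of the image), the model keeps outputting NATURE over and over until the output is cut off at max_length. The output is as follows: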
The image showcases a close-up view of a leaf, possibly from a plant, with a prominent vein structure. The leaf is illuminated from the top, casting a soft glow on its surface. The background is dark, emphasizing the leaf's vibrant green color. There's also a white line dividing the image, and below it, there's a white leaf-like icon. At the bottom, there's text that reads 'NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE 
NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NATURE NAT
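The model runs to completion and returns output, so this is probably not an error in the program itself but rather some unknown bug in how the image is understood.
This is not an isolated case; I ran into it several times during testing. Have the authors or other users encountered a similar bug, and what causes it?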
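For anyone who wants to check their own outputs for the same symptom, here is a minimal sketch that flags a response containing a long run of one repeated word. It only detects the loop and says nothing about its cause. The function names and the threshold of 20 are hypothetical choices for illustration and are not part of CogAgent or CogVLM.

```python
from __future__ import annotations

from itertools import groupby


def longest_word_run(text: str) -> tuple[str, int]:
    """Return the word with the longest run of consecutive repeats, and the run length."""
    best_word, best_len = "", 0
    # groupby groups adjacent equal words, so each group is one uninterrupted run.
    for word, group in groupby(text.split()):
        run_len = sum(1 for _ in group)
        if run_len > best_len:
            best_word, best_len = word, run_len
    return best_word, best_len


def looks_degenerate(text: str, threshold: int = 20) -> bool:
    """Heuristic: treat a run of `threshold` or more identical words as a repetition loop.

    The threshold is an arbitrary assumption; tune it for your own data.
    """
    return longest_word_run(text)[1] >= threshold


# Shortened stand-in for the output above, including the truncated last word.
sample = "At the bottom, there's text that reads 'NATURE " + "NATURE " * 30 + "NAT"
print(longest_word_run(sample))
print(looks_degenerate(sample))
```

If inference goes through Hugging Face `transformers`, the `generate` options `repetition_penalty` and `no_repeat_ngram_size` are commonly used against loops like this. Whether they help here is untested, and they could also stop the model from transcribing text that really is repeated in the image.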