When the knowledge base performs segmentation analysis, pictures in the document are lost #11134

Closed
5 tasks done
zlm0001 opened this issue Nov 26, 2024 · 3 comments

Comments

@zlm0001

zlm0001 commented Nov 26, 2024

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.12.1

Cloud or Self Hosted

Cloud

Steps to reproduce

When the knowledge base performs segmentation analysis, pictures in the document are lost
(two screenshots attached)

✔️ Expected Behavior

When the knowledge base segments a document, the pictures in the document should stay attached to the segment context, at the same position as in the original text. That way, the pictures can also be retrieved as knowledge during knowledge retrieval.

❌ Actual Behavior

When the knowledge base segmented the document, the pictures in it were lost and were not stored in the knowledge base.


dosubot bot commented Nov 26, 2024

I found a similar unsolved discussion titled "When processing documents, you can save the images to segments" [1]. This might be related to the issue you're experiencing with images being lost during segmentation analysis.

In the Dify codebase, images in DOCX documents are extracted and processed using the WordExtractor class. This class extracts images, saves them separately, and embeds references in the processed document content [2][3]. However, it seems there might be an issue with how these images are being handled or stored in the knowledge base during segmentation.
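As a rough illustration of the "extract images, save them separately, embed references" idea described above, here is a minimal standard-library sketch. It is not Dify's WordExtractor; the function name `extract_docx_images` and the `![image](...)` reference format are assumptions made for this example. It relies only on the fact that a DOCX file is a ZIP archive that stores embedded pictures under `word/media/`.

```python
import io
import zipfile
from pathlib import Path


def extract_docx_images(docx_bytes: bytes, out_dir: str) -> dict[str, str]:
    """Save every file under word/media/ and return {archive name: markdown reference}.

    Hypothetical helper for illustration only, not part of Dify.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    refs: dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        for name in archive.namelist():
            # Embedded pictures live under word/media/ inside the archive.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                # Assumed reference format: a markdown image pointing at the saved file.
                refs[name] = f"![image]({target.as_posix()})"
    return refs
```

The returned references would still need to be placed at the right position in the extracted text, which this sketch does not attempt.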

Additionally, there is a test named test_invoke_chat_model_with_vision that verifies the handling of images during segmentation analysis, which might be useful for debugging this issue [4].


@dosubot added the "cloud" (When the version is cloud and it is a bug report) and "🐞 bug" (Something isn't working) labels on Nov 26, 2024
@crazywoola removed the "🐞 bug" and "cloud" labels on Nov 26, 2024
@crazywoola
Member

Linked to #11063. Currently, we do not parse images in our system.

@zlm0001
Author

zlm0001 commented Nov 27, 2024

#11063 is about analyzing the content inside a picture. I do not want to parse the content of the picture; I want the picture to be kept intact at its position in the context of the original text.

We are building knowledge-base Q&A. When users ask a question, they want the large model to return not only the text but also the pictures related to the answer.

Example: I created a knowledge base from a user manual that contains many descriptions of operation steps, with screenshots that illustrate them. When users query this knowledge base, the large model should return not only the text of the operation steps but also the relevant screenshots.
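A minimal sketch of the requested behavior, assuming the extractor has already replaced each picture with a markdown image reference at its original position. The names `split_keeping_images` and `images_in` are hypothetical and this is not how Dify segments documents; it only shows that a splitter can keep a screenshot attached to the step it illustrates, so the reference comes back with the retrieved segment.

```python
import re

# Matches a markdown image reference such as ![step 1](https://example.com/a.png)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\([^)\s]+\)")


def split_keeping_images(text: str, max_chars: int = 200) -> list[str]:
    """Greedy paragraph-based splitter that never separates an image from the text before it."""
    segments: list[str] = []
    current = ""
    for block in (b.strip() for b in text.split("\n\n")):
        if not block:
            continue
        candidate = f"{current}\n\n{block}" if current else block
        is_image = IMAGE_REF.fullmatch(block) is not None
        # Start a new segment only for text blocks; an image-only block stays
        # with the preceding text even if the segment then exceeds max_chars.
        if current and len(candidate) > max_chars and not is_image:
            segments.append(current)
            current = block
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


def images_in(segment: str) -> list[str]:
    """Return the image references contained in a retrieved segment."""
    return IMAGE_REF.findall(segment)
```

With a manual laid out as "step text, screenshot, step text, screenshot", each resulting segment carries its own screenshot reference, which the application could then render next to the answer.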
