When uploading .doc or .docx files, the following warnings are displayed:
No acceptable contours found
Contour is not a quadrilateral
lib/python3.10/site-packages/langchain_core/_api/deprecation.py:139: LangChainDeprecationWarning: Since Chroma 0.4.x the manual persistence method is no longer supported as docs are automatically persisted. warn_deprecated(
After uploading, a single .doc file splits into around 59 documents in .png or .jpeg formats. These are shown in the Doc Counts sidebar and also form the metadata as illustrated in the attached screenshot.
The main issue is that a .doc file is not parsed as a single document the way a .pdf file is. Instead, it is split into multiple .png or .jpeg entries, which leads to hallucinated answers during querying.
Please address the parsing issue and provide a solution to handle .doc or .docx files correctly without splitting into multiple image files.
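For reference, here is a minimal sketch of the behaviour being requested: reading the text of a .docx file as one document instead of a set of page images. It is not the project's loader, only an illustration using the Python standard library. The function name `docx_to_text` is hypothetical, and it assumes the file is a standard Office Open XML .docx whose body lives in `word/document.xml`. Legacy binary .doc files are not covered and would need converting to .docx first.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(path_or_file):
    """Return the body text of a .docx as one string, one line per paragraph.

    Hypothetical helper for illustration: accepts a filesystem path or a
    binary file-like object, and ignores headers, footers, and images.
    """
    # A .docx file is a zip archive; the main body is word/document.xml.
    with zipfile.ZipFile(path_or_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Each <w:p> is a paragraph; its text is spread across <w:t> runs.
    for para in root.iter(f"{{{W_NS}}}p"):
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        if text.strip():  # skip empty paragraphs
            paragraphs.append(text)
    return "\n".join(paragraphs)
```

The returned string could then be passed to the same text splitter and embedding step used for .pdf uploads, so the file would appear as a single entry in the document count rather than as many image entries.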