Text being incorrectly parsed in table #1149
enrac5 started this conversation in Ask for help with specific PDFs
Hi @enrac5, it looks like the PDF has two …
print(page.extract_text(layout=True)) produces:
If you use print(page.dedupe_chars().extract_text(layout=True))
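To illustrate why deduplicating characters can turn 113+13 back into 13+13, here is a minimal sketch in plain Python. It is not pdfplumber's implementation: the helper name drop_duplicate_chars, the tolerance value, and the sample coordinates are all made up for illustration, and it assumes the stray "1" comes from the same glyph being drawn twice at almost the same position. With pdfplumber itself, the call from the reply above, page.dedupe_chars(), is the one to use.

```python
def drop_duplicate_chars(chars, tolerance=1.0):
    """Keep the first copy of any glyph repeated at (nearly) the same position.

    Hypothetical helper for illustration only; each char is a dict with
    "text", "x0" and "top" keys, loosely modelled on pdfplumber char objects.
    """
    kept = []
    for ch in chars:
        is_duplicate = any(
            ch["text"] == k["text"]
            and abs(ch["x0"] - k["x0"]) <= tolerance
            and abs(ch["top"] - k["top"]) <= tolerance
            for k in kept
        )
        if not is_duplicate:
            kept.append(ch)
    return kept


# Made-up sample data: the leading "1" is drawn twice, slightly offset.
chars = [
    {"text": "1", "x0": 100.0, "top": 50.0},
    {"text": "1", "x0": 100.3, "top": 50.0},  # near-identical copy
    {"text": "3", "x0": 106.0, "top": 50.0},
    {"text": "+", "x0": 112.0, "top": 50.0},
    {"text": "1", "x0": 118.0, "top": 50.0},
    {"text": "3", "x0": 124.0, "top": 50.0},
]

raw = "".join(c["text"] for c in chars)
clean = "".join(c["text"] for c in drop_duplicate_chars(chars))

print(raw)    # 113+13
print(clean)  # 13+13
```

The second "1" and "3" survive because they sit far from the first ones; only glyphs that match in both text and position are treated as duplicates.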
I have a very odd issue with the attached file:
test.pdf
Basically, the text in the second column is being read incorrectly and I'm not sure why. This is what I'm doing:
The output should be 13+13 but I get 113+13. This is just a small part of a 78-page document (a PDF printed from MS Word).
(Puts on Leia's clothes, "Help me @jsvine, you're my only hope")