Replies: 1 comment
-
Hi @dmlls, and thanks for your interest in this library. I'm not very familiar with PyMuPDF's get-blocks method, but adding something like that to blocks = re.split(r"\n\n+", my_page.extract_text(layout=True, ...)) ... but multi-column layouts would require a more sophisticated approach.
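To make the blank-line idea above concrete, here is a minimal sketch. It is not part of pdfplumber: `split_into_blocks` and `page_blocks` are hypothetical helper names, and the pattern is a slightly more tolerant variant of `r"\n\n+"` that also treats whitespace-only lines as blank, on the assumption that layout-preserving output may pad lines with spaces. As noted above, it will not separate multi-column layouts.

```python
import re

# Tolerant variant of the r"\n\n+" pattern from the reply: a newline
# followed by one or more lines that are empty or contain only spaces/tabs.
BLANK_LINES = re.compile(r"\n(?:[ \t]*\n)+")


def split_into_blocks(layout_text):
    """Split text on runs of blank lines and drop empty pieces."""
    pieces = BLANK_LINES.split(layout_text)
    # Keep internal line breaks; trim only leading/trailing newlines.
    return [piece.strip("\n") for piece in pieces if piece.strip()]


def page_blocks(page):
    """Hypothetical helper: `page` is assumed to be a pdfplumber Page."""
    # extract_text may return an empty result on pages without text.
    text = page.extract_text(layout=True) or ""
    return split_into_blocks(text)
```

With a real document, `page` would typically come from something like `pdfplumber.open(path).pages[0]`; each returned string is one vertically separated chunk of text, not a block with coordinates as in PyMuPDF.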
-
Hi everyone 👋🏻
I have been going through the documentation, issues and discussions but I haven't found an answer to the following question:
Does pdfplumber provide any way of extracting text as blocks? Something along the lines of
Page.get_text("blocks")
in PyMuPDF. (Sorry in advance if this question has already been answered elsewhere.)