This repository has been archived by the owner on Apr 15, 2024. It is now read-only.
I have a book in PDF format with about three chapters. I want to use pdfminer (other tools are fine as long as they can do this) to parse the book, extract each chapter, and save the chapters as chapter one.txt, chapter two.txt, and chapter three.txt.
How can I do that?
Thanks.
I need it too...
So far I have found how to extract the titles, which is easy with the get_outlines() function.
What I am now trying to work out is how to extract the text contained between two titles. Maybe by looking into the code of the get_outlines() function?
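One possible way to approach the "text between two titles" problem is to work at page granularity: collect the text of every page, then cut the page list at the page where each chapter starts. The sketch below is only an illustration under several assumptions. The helper names (split_into_chapters, save_chapters, load_page_texts) are made up for this example, load_page_texts assumes the pdfminer.six fork is installed, and the 0-based start page of each chapter is assumed to be already known, since resolving an outline entry's destination to a page index is not shown here.

```python
from pathlib import Path


def split_into_chapters(page_texts, chapter_starts):
    """Group per-page text into chapters.

    page_texts: list of strings, one per page, in page order.
    chapter_starts: list of (title, first_page_index) pairs, 0-based and
        sorted by page index (assumed to be derived from the outline).
    Returns a list of (title, text) pairs.
    """
    chapters = []
    for i, (title, start) in enumerate(chapter_starts):
        # A chapter runs until the next chapter starts, or to the last page.
        if i + 1 < len(chapter_starts):
            end = chapter_starts[i + 1][1]
        else:
            end = len(page_texts)
        chapters.append((title, "\n".join(page_texts[start:end])))
    return chapters


def save_chapters(chapters, out_dir):
    """Write each (title, text) pair to '<title>.txt' inside out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for title, text in chapters:
        path = out / f"{title}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


def load_page_texts(pdf_path):
    """Return one text string per page (assumes pdfminer.six is installed)."""
    # Imported lazily so the helpers above work without pdfminer installed.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    pages = []
    for layout in extract_pages(pdf_path):
        pages.append("".join(
            element.get_text()
            for element in layout
            if isinstance(element, LTTextContainer)
        ))
    return pages
```

With a hypothetical book.pdf, usage might look like save_chapters(split_into_chapters(load_page_texts("book.pdf"), [("chapter one", 0), ("chapter two", 12), ("chapter three", 30)]), "out"), where the page indexes are placeholders. Because the split is per page, a chapter that begins in the middle of a page would carry the tail of the previous chapter with it, and titles containing characters that are not valid in file names would need to be cleaned up first.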
ghmo2789 added a commit to kakann/pdfminer that referenced this issue on Sep 19, 2022