Extracted tables are output as text and as table #3971
Unanswered
jicastillow asked this question in Looking for help
Hello there,
I'm trying PyMuPDF for extracting text and tables from a document. However, I've noticed that calling the to_markdown method outputs the table in this section of the document twice.
First, the table's contents appear as raw extracted text:
DEL 28 SEP AL 16 DEL 21 AL 30
DIC2024 DIC2024
HOTELES Y MOTONAVES
TPL TPL SGL SGL
DBL DBL
LUJO MOTONAVE: KAHILA/ PLUS JAMILA/ NILE MARQUISE/
ZEINA/ BLUE SHADOW /
(L2+) IBEROTEL CROWN EMPRESS
ROYAL RUBY/CONCERTO I O SIMILAR
HOTEL CAIRO: 1044$ 1491$ 1240$ 1927$ SEMIRAMIS INTERCONTINENTAL / CAIRO MARRIOTT HOTEL/ INTERCONTINENTAL CITY STARS/ HOLIDAY INN MAADI / DUSIT THANI RESORT
And then the correctly formatted table contents:
Is there a way to suppress the raw text part? I only need the formatted table in Markdown.
Thanks in advance.
Regards.
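
One possible workaround, sketched here as an assumption rather than a documented PyMuPDF option, is to post-process the Markdown string and keep only the pipe-table blocks. The helper name keep_only_tables is hypothetical, and the sketch assumes each table is emitted as consecutive lines starting with "|" whose second line is a separator row such as "|---|---|".

```python
import re

# Matches a Markdown table separator row such as "|---|---|" or "| :--- | ---: |".
SEPARATOR = re.compile(r"^\|(\s*:?-+:?\s*\|)+\s*$")


def keep_only_tables(markdown: str) -> str:
    """Return only the pipe-table blocks found in a Markdown string.

    Hypothetical helper: it assumes a table is a run of consecutive lines
    starting with "|" whose second line is a separator row.
    """
    blocks = []   # finished table blocks
    current = []  # lines of the run currently being collected

    def flush():
        # Keep the run only if it looks like a real table.
        if len(current) >= 2 and SEPARATOR.match(current[1]):
            blocks.append("\n".join(current))
        current.clear()

    for line in markdown.splitlines():
        if line.lstrip().startswith("|"):
            current.append(line.strip())
        else:
            flush()
    flush()  # handle a table that ends the document

    return "\n\n".join(blocks)
```

The string returned by to_markdown can be passed straight to this function. Another route that may be worth trying is to skip to_markdown for such pages and build the output from PyMuPDF's table finder directly (Page.find_tables() and the to_markdown() method of each found table), which only ever emits the tables.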