PdfAlto #13

lfoppiano · 2024-07-01T04:04:55Z

I was wondering if you could add pdfalto to the benchmark: https://github.com/kermitt2/pdfalto

lfoppiano · 2025-01-24T03:59:59Z

Hi all,
I've added pdfalto to my fork: https://github.com/lfoppiano/pdf-extraction-benchmarks

I also removed the post-processing that fixed ligatures and removed footers, because plain text extraction should aim to extract all the text, without any structure.
