Missing text after redaction #867
Answered by JorjMcKie
NoraishaYusuf asked this question in Looking for help
Answered by JorjMcKie on Jan 25, 2021
Answer selected by NoraishaYusuf
Problem 1:
Hard to tell without looking at the file (which is probably confidential anyway). But there are things like damaged PDFs ...
You could try cleaning the file (with mutool) or the page (with PyMuPDF) before processing, which may reveal or remove any errors:
mutool clean -gggsc file.pdf
page.clean_contents(sanitize=True)
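As a rough sketch of how the page-level cleaning could be applied to a whole document: the helper names `clean_all_pages` and `clean_file` are made up for illustration, and the `garbage=3, deflate=True` save options are only an assumed approximation of what the `mutool` flags above do. Method names follow the snake_case spelling used in this answer; older PyMuPDF releases may use camelCase equivalents.

```python
def clean_all_pages(doc):
    """Call clean_contents(sanitize=True) on every page and return the page count.

    `doc` is expected to be an open PyMuPDF document (any iterable of pages
    that offer a clean_contents method will do).
    """
    count = 0
    for page in doc:
        # Rewrites the page's content stream; syntax errors may surface here.
        page.clean_contents(sanitize=True)
        count += 1
    return count


def clean_file(src, dst):
    """Hypothetical wrapper: clean all pages of `src` and save the result as `dst`."""
    import fitz  # PyMuPDF, imported lazily so the helper above has no dependency

    doc = fitz.open(src)
    try:
        n = clean_all_pages(doc)
        # Assumed rough counterpart of the garbage-collection flags of mutool clean.
        doc.save(dst, garbage=3, deflate=True)
    finally:
        doc.close()
    return n
```

Running the search and redaction on the cleaned copy rather than the original makes it easier to tell whether the missing text comes from a damaged file or from the redaction step itself.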
Problem 2:
MuPDF normally uses the full font-defined line height when computing the rectangles of search hits. If the PDF was made with smaller distances between lines, the hit rectangles of adjacent lines may overlap somewhat. The redaction logic of MuPDF in turn removes every character overlapping the redaction rectangle, and the result is what you saw.
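One possible workaround, not stated in the answer itself, is to shrink each hit rectangle vertically before turning it into a redaction, so that it no longer reaches into the neighbouring lines. The sketch below assumes PyMuPDF's `search_for`, `add_redact_annot` and `apply_redactions` page methods; the helper names and the default `fraction` of 0.2 are arbitrary choices for illustration and would need tuning per document.

```python
def shrink_vertically(rect, fraction=0.2):
    """Move the top and bottom edges inward by `fraction` of the rectangle height.

    `rect` is any 4-item sequence (x0, y0, x1, y1); a plain tuple is returned.
    """
    x0, y0, x1, y1 = rect
    dy = (y1 - y0) * fraction
    return (x0, y0 + dy, x1, y1 - dy)


def redact_text(page, needle, fraction=0.2):
    """Hypothetical helper: redact all occurrences of `needle` on a PyMuPDF page.

    Returns the number of hits that were redacted.
    """
    import fitz  # PyMuPDF, imported lazily so shrink_vertically stays dependency-free

    hits = page.search_for(needle)
    for hit in hits:
        # Use the reduced rectangle so characters on adjacent lines are not touched.
        page.add_redact_annot(fitz.Rect(shrink_vertically(hit, fraction)))
    page.apply_redactions()
    return len(hits)
```

If the fraction is too large, the rectangle may no longer cover the target characters sufficiently, so it is worth checking the output on a sample page first.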