Yeah, my rewrite of the document stream parsing code dropped this config variable off the table. The unit tests just test that it returns the value properly instead of actually testing it against document text, so my changes sailed through without errors.
One place where this config value definitely could be inserted back is in Font.php near the bottom of the decodeText() function:
// Cut down on the number of unnecessary internal spaces by
// imploding the string on the null byte, and checking if the
// text includes extra spaces on either side. If so, merge
// where appropriate.
$words = implode("\x00\x00", $words);
$hOffset = $this->config->getHorizontalOffset();
$words = str_replace(
[" \x00\x00 ", "\x00\x00 ", " \x00\x00", "\x00\x00"],
[' '.$hOffset.' ', $hOffset.' ', ' '.$hOffset, $hOffset],
$words
);
... but this is probably not going to affect as many places in the generated text as the previous algorithm did. If you can check whether inserting this code solves your particular issue @luigif, we could add this back in as at least a partial fix.
Note: I'm not sure the above is the final fix; I'll have to run it on more test documents.
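To make the intent of that merge step easier to check in isolation, here is a minimal standalone sketch. It is not library code: `mergeWords()` is a hypothetical helper, the offset is passed in as a plain string instead of being read from `$this->config`, and the four search patterns are assumed to differ only in whether a space sits on either side of the null-byte separator, as the comment in the snippet describes.

```php
<?php

/**
 * Hypothetical helper (not part of pdfparser): joins word fragments on a
 * double null byte, then swaps each separator for the horizontal offset,
 * keeping any space that already sat on either side of the separator.
 */
function mergeWords(array $words, string $hOffset): string
{
    $joined = implode("\x00\x00", $words);

    // Patterns are tried in order, so the most specific one (a space on
    // both sides) has to come first. The spacing is an assumption.
    return str_replace(
        [" \x00\x00 ", "\x00\x00 ", " \x00\x00", "\x00\x00"],
        [' '.$hOffset.' ', $hOffset.' ', ' '.$hOffset, $hOffset],
        $joined
    );
}

// A tab as the offset separates fragments that had no space between them.
var_dump(mergeWords(['foo', 'bar'], "\t"));

// An empty offset leaves only the space that was already in the text.
var_dump(mergeWords(['foo ', 'bar'], ''));
```

If the sketch matches what the real code does, an empty offset should glue fragments together and a non-empty one should appear only where fragments were joined, which is the behaviour worth asserting against actual document text.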
The patch in Font.php does not solve my problem.
With previous library versions I was able to fix issues in tables with some HorizontalOffset tweaking.
The config property HorizontalOffset, which was useful for dealing with formatting issues (https://github.com/smalot/pdfparser/blob/v2.11.0/doc/CustomConfig.md), is no longer checked.
It can still be set as described in the docs, but setting it has no effect.
The last version that checked and used its value was 2.7.0; every later version ignores the setting.