How to extract page as image and preserve the original resolution? #1305

Answered by JorjMcKie
void285 asked this question in Q&A
First of all, let me move this issue to Discussions, which seems the more appropriate place.

Zoom values need not be integers: floats are allowed, so you can definitely find a value that suits your needs.
If you look at page.get_images() (the list of images defined for the page), you should see 1 item (maybe 2, depending on the scanner) representing the scanned page.
You will also see the image width and height there, plus the colorspace, which helps determine suitable values for pixmap creation.
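
To make the zoom idea concrete, here is a minimal sketch of how a float zoom could be derived from the image size reported by page.get_images(). It assumes the scanned image covers the whole, unrotated page; the helper names `zoom_for_native_resolution` and `render_at_native_resolution` are made up for illustration and are not part of PyMuPDF.

```python
def zoom_for_native_resolution(page_width_pt, page_height_pt,
                               img_width_px, img_height_px):
    """Return (zoom_x, zoom_y) mapping page points to the image's pixels.

    Hypothetical helper: assumes the image fills the entire page.
    """
    if page_width_pt <= 0 or page_height_pt <= 0:
        raise ValueError("page dimensions must be positive")
    return img_width_px / page_width_pt, img_height_px / page_height_pt


def render_at_native_resolution(pdf_path, page_number=0):
    """Render a page so the pixmap roughly matches the embedded scan's size."""
    import fitz  # PyMuPDF, imported lazily so the helper above has no dependency

    doc = fitz.open(pdf_path)
    page = doc[page_number]
    images = page.get_images()
    if not images:
        raise ValueError("no images defined for this page")
    # Each item starts with (xref, smask, width, height, ...).
    _xref, _smask, width, height = images[0][:4]
    zoom_x, zoom_y = zoom_for_native_resolution(
        page.rect.width, page.rect.height, width, height
    )
    return page.get_pixmap(matrix=fitz.Matrix(zoom_x, zoom_y))
```

For a US Letter page (612 x 792 points) holding a 2550 x 3300 pixel scan, this gives a zoom of about 4.17 in both directions, i.e. 300 dpi instead of the default 72.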

If you see only one image in said list, you do not need to make an extra pixmap of the page for your OCR engine.
Instead, just extract that image and hand over its binary representation. E.g.

>>> from pprint im…
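
Independently of the truncated session above, the direct-extraction approach might look roughly like the following sketch. Picking the largest image by pixel area is an assumption of mine for the case where the list has 2 items; `pick_scan_image` and `extract_page_scan` are hypothetical names, not PyMuPDF functions.

```python
def pick_scan_image(image_items):
    """Return the xref of the largest image by pixel area, or None if empty.

    Hypothetical helper: items are tuples as returned by page.get_images(),
    i.e. (xref, smask, width, height, ...).
    """
    if not image_items:
        return None
    largest = max(image_items, key=lambda item: item[2] * item[3])
    return largest[0]


def extract_page_scan(pdf_path, page_number=0):
    """Return (image_bytes, extension) of the page's scan, or None."""
    import fitz  # PyMuPDF, imported lazily

    doc = fitz.open(pdf_path)
    xref = pick_scan_image(doc[page_number].get_images())
    if xref is None:
        return None
    info = doc.extract_image(xref)  # dict with "image", "ext", "width", ...
    return info["image"], info["ext"]
```

The returned bytes are the image as stored in the PDF, so they can be written to a file with the given extension or handed to the OCR engine without rendering the page at all.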

Answer selected by void285

This discussion was converted from issue #1304 on October 01, 2021 12:40.