
Full page image in digitized PDF #994

Scanned pages usually contain one or more images that cover the complete page, although such an image is not necessarily exactly the page size.
There is a Page method, get_image_bbox, that computes the rectangle an image covers on the page, its "bbox" (bounding box). You can iterate through a page's images and check whether page.rect in bbox, i.e. whether the page rectangle lies entirely inside the image's bbox. Snippet:

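# only consider images directly called by the page (last list item is 0)
img_list = [img for img in page.get_images(True) if img[-1] == 0]
full_page_image = False
for img in img_list:
    bbox = page.get_image_bbox(img)  # rectangle the image covers on the page
    if page.rect in bbox:
        full_page_image = True  # fully covering image detected!
        break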

Comments:
get_images(True) creates an extended image list, which also checks whether it is the page itself, t…
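
Since a scan image is not always exactly the page size, a strict `page.rect in bbox` test can miss images that are slightly smaller than the page. The following is only a sketch of one possible workaround, not part of the answer above: it measures what fraction of the page area the bbox covers and compares it to a threshold. The helper names `coverage_ratio` and `covers_page` and the 0.95 default are hypothetical choices for illustration. The functions take plain `(x0, y0, x1, y1)` tuples so they can be tried without a PDF.

```python
def coverage_ratio(page_rect, bbox):
    """Return the fraction of the page area that lies inside bbox.

    Both arguments are (x0, y0, x1, y1) sequences in the same coordinates.
    Hypothetical helper for illustration, not a PyMuPDF function.
    """
    px0, py0, px1, py1 = page_rect
    bx0, by0, bx1, by1 = bbox
    # Width and height of the intersection, clamped to 0 if disjoint.
    overlap_w = max(0.0, min(px1, bx1) - max(px0, bx0))
    overlap_h = max(0.0, min(py1, by1) - max(py0, by0))
    page_area = (px1 - px0) * (py1 - py0)
    if page_area <= 0:
        return 0.0
    return (overlap_w * overlap_h) / page_area


def covers_page(page_rect, bbox, threshold=0.95):
    """True if bbox covers at least `threshold` of the page area.

    The 0.95 default is an arbitrary assumption; tune it for your scans.
    """
    return coverage_ratio(page_rect, bbox) >= threshold
```

Assuming the PyMuPDF rectangle objects unpack as four-number sequences, the check in the loop could then read `covers_page(tuple(page.rect), tuple(page.get_image_bbox(img)))` instead of `page.rect in bbox`.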

Answer selected by chest3x