Hello, My question is quite simple, but I haven't found a way to answer it. Suppose we have two rectangles on a PDF page that are practically identical in size and position, but one is black and the other is white. Is there a way to know which of those rectangles is "on top" of the other? I think this would be useful in many applications, such as discarding unnecessary information or accurately reading forms with tick boxes. Thanks!
Replies: 1 comment 1 reply
I believe that the PDF Reference specifies that objects are drawn on top of one another, in the order that the document creates them. And I believe that `pdfminer.six` extracts objects in that same order. So for two rectangles, the "top" one would be the rectangle that appears later in `page.rects` than the other. Does that seem to be the case for your PDF?

For two objects of the same `pdfminer.six` type (e.g., two rects, two lines, two chars), the comparison should be easy. But if you're comparing across types (e.g., a rect vs. a line), you'll want to run something like `ordered_objects = list(page.iter_layout_objects(page.layout._objs))` and then do the comparison based on that list.
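
For the same-type case, here is a minimal sketch of how the list order could be used to find rectangles hidden by a later one. It assumes the list is in drawing order (as described above, which I haven't verified for every PDF) and that each rect is a dict with `x0`, `x1`, `top`, `bottom`, and `fill` keys, as in `page.rects`. The helper names `overlap_area` and `covered_rects` and the 0.9 threshold are made up for illustration and are not part of pdfplumber.

```python
def overlap_area(a, b):
    """Area of the intersection of two rect-like dicts (0 if they don't overlap)."""
    width = min(a["x1"], b["x1"]) - max(a["x0"], b["x0"])
    height = min(a["bottom"], b["bottom"]) - max(a["top"], b["top"])
    return max(width, 0) * max(height, 0)


def covered_rects(rects, min_overlap=0.9):
    """Return indices of rects that a later (assumed: drawn on top) filled rect
    covers by at least `min_overlap` of their area."""
    hidden = []
    for i, lower in enumerate(rects):
        area = (lower["x1"] - lower["x0"]) * (lower["bottom"] - lower["top"])
        if area <= 0:
            continue  # skip degenerate rects
        for upper in rects[i + 1:]:
            # Only a filled rect can hide what is underneath it.
            if upper.get("fill") and overlap_area(lower, upper) / area >= min_overlap:
                hidden.append(i)
                break
    return hidden


# Hand-written sample data standing in for page.rects (hypothetical values):
sample = [
    {"x0": 10, "x1": 20, "top": 10, "bottom": 20, "fill": True},      # drawn first
    {"x0": 10, "x1": 20.5, "top": 9.5, "bottom": 20, "fill": True},   # drawn second
    {"x0": 100, "x1": 110, "top": 10, "bottom": 20, "fill": True},    # elsewhere
]
print(covered_rects(sample))
```

With a real file, you would pass `page.rects` instead of `sample`. To tell a black rect from a white one, you could also look at each rect's `non_stroking_color`, though the format of that value depends on the PDF's color space.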