Get info for checkboxes? #738
-
Hi @Rapid1898-code, appreciate your interest in the library. In the case of the PDF that you shared, you can get all the checkboxes by filtering the page's rect objects:

    # `page` is a pdfplumber page object, e.g. pdf.pages[0].
    rects = page.rects
    # Keep the rects that are square, 10-15 units per side, and have no fill (non-stroking) color.
    checkboxes_all = [
        rect for rect in rects
        if 10 < rect["width"] < 15 and 10 < rect["height"] < 15
        and int(rect["width"]) == int(rect["height"])
        and rect["non_stroking_color"] is None
    ]

Now, to get all the checkboxes that are filled (aka checked), you can filter further:

    checkboxes_checked = []
    # If a checkbox has a diagonal line within it, consider it checked.
    for checkbox in checkboxes_all:
        # Crop the page to contain just the checkbox rectangle.
        cropped = page.crop((checkbox["x0"], checkbox["top"], checkbox["x1"], checkbox["bottom"]))
        # If the cropped page has 6 edges, then consider it checked. Unchecked ones would have just 4 lines.
        if len(cropped.edges) == 6:
            checkboxes_checked.append(checkbox)

You would need to do further post-processing to find the text beside these checkboxes, but I hope the logic puts you in the right direction.
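The post-processing step mentioned above (finding the text beside each checkbox) could be sketched roughly as follows. This is only an illustration, not part of pdfplumber: `vertical_overlap`, `label_for_checkbox`, and the `max_gap` parameter are hypothetical names, and the sketch assumes the label sits to the right of the checkbox on the same line. The word dicts are assumed to have the shape returned by `page.extract_words()` (`"text"`, `"x0"`, `"x1"`, `"top"`, `"bottom"`).

```python
def vertical_overlap(a, b):
    """Height of the vertical overlap between two boxes with "top"/"bottom" keys."""
    return max(0.0, min(a["bottom"], b["bottom"]) - max(a["top"], b["top"]))


def label_for_checkbox(checkbox, words, max_gap=200):
    """Join the words on the checkbox's line that start to its right.

    `max_gap` is a hypothetical tuning knob: how far right of the
    checkbox (in PDF units) a word may start and still count.
    """
    same_line = [
        w for w in words
        if vertical_overlap(checkbox, w) > 0
        and checkbox["x1"] <= w["x0"] <= checkbox["x1"] + max_gap
    ]
    same_line.sort(key=lambda w: w["x0"])  # restore reading order
    return " ".join(w["text"] for w in same_line)


# Made-up data for illustration only; real values come from the PDF.
checkbox = {"x0": 50, "x1": 62, "top": 100, "bottom": 112}
words = [
    {"text": "Yes", "x0": 66, "x1": 84, "top": 101, "bottom": 111},
    {"text": "Name:", "x0": 10, "x1": 40, "top": 101, "bottom": 111},
    {"text": "Other", "x0": 66, "x1": 90, "top": 130, "bottom": 140},
]
print(label_for_checkbox(checkbox, words))  # prints "Yes"
```

With a real page, `words` would come from `page.extract_words()` and the function would be called once per entry in `checkboxes_checked`. If the form places labels to the left of the boxes, the horizontal condition would need to be mirrored.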
-
Hello, thanks a lot for your response. Should I first get the checkboxes as you described above, or is there a better or easier way to do this? KR, Max
-
OK, thanks a lot again. May I bother you with a similar question?
-
Thanks a lot for your help again - this works great.
-
Thanks, that worked great for the radio buttons.
It generally works, but as you can see in the attached picture, only the diagonal line from top-left to bottom-right is red.
-
Thanks - works great!
-
Hello, I am able to extract the whole text and also the tables from the PDF, which is great.
But is it also possible to find out whether the checkboxes are checked or not?
If yes, how can I get this information for the attached PDF?
input.pdf