allow processing of rgb images #8
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -187,45 +187,47 @@ class id to a new PAGE region type (and subtype). | |
else: | ||
zoomed = 1.0 | ||
|
||
# for morphological post-processing, we will need the binarized image, too | ||
page_image_bin, _, _ = self.workspace.image_from_page( | ||
page, page_id, | ||
feature_selector='binarized') | ||
# workaround for OCR-D/core#687: | ||
if 0 < abs(page_image_raw.width - page_image_bin.width) <= 2: | ||
diff = page_image_raw.width - page_image_bin.width | ||
if diff > 0: | ||
page_image_raw = crop_image( | ||
page_image_raw, | ||
(int(np.floor(diff / 2)), 0, | ||
page_image_raw.width - int(np.ceil(diff / 2)), | ||
page_image_raw.height)) | ||
else: | ||
page_image_bin = crop_image( | ||
page_image_bin, | ||
(int(np.floor(-diff / 2)), 0, | ||
page_image_bin.width - int(np.ceil(-diff / 2)), | ||
page_image_bin.height)) | ||
if 0 < abs(page_image_raw.height - page_image_bin.height) <= 2: | ||
diff = page_image_raw.height - page_image_bin.height | ||
if diff > 0: | ||
page_image_raw = crop_image( | ||
page_image_raw, | ||
(0, int(np.floor(diff / 2)), | ||
page_image_raw.width, | ||
page_image_raw.height - int(np.ceil(diff / 2)))) | ||
else: | ||
page_image_bin = crop_image( | ||
page_image_bin, | ||
(0, int(np.floor(-diff / 2)), | ||
page_image_bin.width, | ||
page_image_bin.height - int(np.ceil(-diff / 2)))) | ||
# check whether input image is binarized | ||
if page_image_info.photometricInterpretation == "1": | ||
# for morphological post-processing, we will need the binarized image, too | ||
page_image_bin, _, _ = self.workspace.image_from_page( | ||
page, page_id, | ||
feature_selector='binarized') | ||
# workaround for OCR-D/core#687: | ||
if 0 < abs(page_image_raw.width - page_image_bin.width) <= 2: | ||
diff = page_image_raw.width - page_image_bin.width | ||
if diff > 0: | ||
page_image_raw = crop_image( | ||
page_image_raw, | ||
(int(np.floor(diff / 2)), 0, | ||
page_image_raw.width - int(np.ceil(diff / 2)), | ||
page_image_raw.height)) | ||
else: | ||
page_image_bin = crop_image( | ||
page_image_bin, | ||
(int(np.floor(-diff / 2)), 0, | ||
page_image_bin.width - int(np.ceil(-diff / 2)), | ||
page_image_bin.height)) | ||
if 0 < abs(page_image_raw.height - page_image_bin.height) <= 2: | ||
diff = page_image_raw.height - page_image_bin.height | ||
if diff > 0: | ||
page_image_raw = crop_image( | ||
page_image_raw, | ||
(0, int(np.floor(diff / 2)), | ||
page_image_raw.width, | ||
page_image_raw.height - int(np.ceil(diff / 2)))) | ||
else: | ||
page_image_bin = crop_image( | ||
page_image_bin, | ||
(0, int(np.floor(-diff / 2)), | ||
page_image_bin.width, | ||
page_image_bin.height - int(np.ceil(-diff / 2)))) | ||
|
||
# ensure RGB (if raw was merely grayscale) | ||
if page_image_raw.mode == '1': | ||
page_image_raw = page_image_raw.convert('L') | ||
page_image_raw = page_image_raw.convert(mode='RGB') | ||
page_image_bin = page_image_bin.convert(mode='1') | ||
page_image_bin = page_image_raw.convert(mode='1') | ||
Comment on lines -228 to +230
That's not going to work well: binarization is usually more than a simple conversion – you have to find the best threshold, ideally localized across the image.
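To make the point of the comment above concrete: as far as Pillow's documentation describes it, `convert('1')` on an `L` or `RGB` image applies Floyd-Steinberg dithering by default instead of searching for a threshold. The sketch below computes a global Otsu threshold in plain NumPy as one simple alternative. It is only an illustration under assumptions: `otsu_threshold` and `binarize` are hypothetical helper names, not part of this PR or of OCR-D, and a single global threshold is still not the localized binarization the comment asks for.

```python
import numpy as np


def otsu_threshold(gray):
    """Return a global Otsu threshold for a uint8 grayscale array (hypothetical helper)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    levels = np.arange(256)
    # cumulative pixel count and intensity mass of the "dark" class (values <= t)
    weight_dark = np.cumsum(hist)
    mass_dark = np.cumsum(hist * levels)
    weight_bright = hist.sum() - weight_dark
    # only thresholds that leave both classes non-empty are candidates
    valid = (weight_dark > 0) & (weight_bright > 0)
    between = np.zeros(256)
    mean_dark = mass_dark[valid] / weight_dark[valid]
    mean_bright = (mass_dark[-1] - mass_dark[valid]) / weight_bright[valid]
    # between-class variance (up to a constant factor); Otsu maximizes it
    between[valid] = weight_dark[valid] * weight_bright[valid] * (mean_dark - mean_bright) ** 2
    return int(np.argmax(between))


def binarize(gray):
    """Return a boolean mask that is True for pixels brighter than the Otsu threshold."""
    return gray > otsu_threshold(gray)
```

If Pillow is available, something like `binarize(np.array(page_image_raw.convert('L')))` would feed the raw page into this sketch. Requesting the image with `feature_selector='binarized'`, as the original code does, leaves the choice of binarization method to an earlier workflow step instead.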
||
# reduce resolution to 300 DPI max | ||
if zoomed != 1.0: | ||
page_image_bin = page_image_bin.resize( | ||
|
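The workaround for OCR-D/core#687 in the hunk above trims the larger of the two images symmetrically whenever width or height differ by at most 2 pixels. The arithmetic behind the crop boxes can be sketched as follows. `centered_crop_box` is a hypothetical helper introduced only for illustration; it is not part of this PR, and it assumes the `(left, top, right, bottom)` box convention that the tuples passed to `crop_image` in the diff appear to follow.

```python
import math


def centered_crop_box(width, height, target_width, target_height):
    """Return a (left, top, right, bottom) box that trims an image of size
    (width, height) down to (target_width, target_height), centered.

    Hypothetical helper: mirrors the floor/ceil split used in the diff.
    """
    dx = width - target_width
    dy = height - target_height
    if dx < 0 or dy < 0:
        raise ValueError("target size must not exceed the image size")
    # an odd surplus puts the extra pixel on the right/bottom edge
    left = math.floor(dx / 2)
    top = math.floor(dy / 2)
    right = width - math.ceil(dx / 2)
    bottom = height - math.ceil(dy / 2)
    return (left, top, right, bottom)
```

With a surplus of one pixel, the left edge stays at 0 and only the right edge moves in by one, so the resulting box always has exactly the target size.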
@@ -267,7 +269,7 @@ def _process_page(self, page, ignore, page_coords, page_id, page_array_raw, page | |
#page.set_TextRegion([]) | ||
page.set_custom('coords=%s' % page_coords['transform']) | ||
height, width, _ = page_array_raw.shape | ||
# get connected components to estimate scale | ||
# get connected components to estimate ignorescale | ||
Comment on lines -270 to +272
?
||
_, components = cv2.connectedComponents(page_array_bin.astype(np.uint8)) | ||
# estimate glyph scale (roughly) | ||
_, counts = np.unique(components, return_counts=True) | ||
|
That's a strange thing to query: it is only defined if the image was a TIFF (Pillow's TIFF plugin), and it does not even discern binarized vs others: binarized would be .mode == '1'. This signifies BlackIsZero, which could be true for
I don't understand the purpose yet: what did go wrong before?
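As a rough illustration of the check suggested in the comment above, the sketch below classifies an image by its Pillow mode string instead of a TIFF-specific tag. `classify_mode` is a hypothetical helper, not part of this PR or of OCR-D; the mode names are Pillow's, and treating palette (`P`) images as color is an assumption made here for simplicity.

```python
def classify_mode(mode):
    """Classify a Pillow mode string as 'bilevel', 'grayscale' or 'color'.

    Hypothetical helper for illustration only.
    """
    if mode == '1':
        # 1-bit pixels: what the comment means by "binarized"
        return 'bilevel'
    if mode in ('L', 'LA', 'I', 'I;16', 'F'):
        # single-channel intensity images (optionally with alpha)
        return 'grayscale'
    # everything else (RGB, RGBA, CMYK, P, ...) is treated as color here
    return 'color'
```

In the processor this would be called as `classify_mode(page_image_raw.mode)`, which works for any file format Pillow can open and not only for TIFF input.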