A lot of memory used on newspaper pages? #110
In OCR-D/quiver-benchmarks#22, @stweil mentions 118 GB being used for newspaper pages.

Comments
Might have overlooked this because (a) our servers have a lot of memory and (b) I didn't process a lot of newspapers.
I don't have the data for the issue mentioned in OCR-D/quiver-benchmarks#22; I tried to produce something similar but failed due to an unrelated issue. → Trying with some other data supplied by @cneud
Yeah, ran into another unrelated issue first: OCR-D/core#1179
The page I used only had 365 lines, and I didn't see anything more than 1.8 GB RSS ("not great, not terrible"). Something else is wrong, though: it seems to use the raw (RGB) images for some lines, which does not make sense. But the XML may not be 100% correct, as I imported it, etc.
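For a sense of scale, a back-of-envelope sketch of why line crops taken from the raw RGB page weigh so much more than crops from a binarized derivative. The page dimensions below are assumptions for illustration, not measurements from this workspace:

```python
# Illustrative only: assumed dimensions for a newspaper scan, not measured here.
page_w, page_h = 5000, 7000            # pixels
rgb_bytes = page_w * page_h * 3        # 8-bit RGB, uncompressed in memory
bin_bytes = page_w * page_h // 8       # 1-bit binarized, best case

print(f"raw RGB page:   {rgb_bytes / 1e6:.0f} MB")   # ~105 MB
print(f"binarized page: {bin_bytes / 1e6:.1f} MB")   # ~4.4 MB

# If each of ~365 line crops is materialized from the RGB page (plus padding
# and intermediate copies), RSS in the GB range accumulates quickly.
```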
The workspace also showed signs of OCR-D/core#1195, so I'll try again first, with METS caching disabled.
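If it helps, this is roughly the invocation I have in mind. It is a sketch only: I am assuming OCR-D/core toggles its METS cache via the `OCRD_METS_CACHING` environment variable, and the file groups and model name are placeholders, not verified against this workspace:

```python
import os
import subprocess

# Assumption: OCR-D/core reads OCRD_METS_CACHING to toggle the METS cache;
# "false" should force re-reading the METS from disk on each access.
env = dict(os.environ, OCRD_METS_CACHING="false")
subprocess.run(
    ["ocrd-calamari-recognize",
     "-I", "OCR-D-SEG-LINE",            # assumed input file group
     "-O", "OCR-D-OCR",                 # assumed output file group
     "-P", "checkpoint_dir", "qurator-gt4histocr-1.0"],  # assumed model
    env=env, check=True)
```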
I've redone the segmentation; no "raw image" problem anymore. Probably just because I couldn't figure out how to fix up the XML so it works properly with the AlternativeImage logic.
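For anyone following along, my understanding of that logic: `image_from_page`/`image_from_segment` in OCR-D core pick the most derived AlternativeImage advertised by the PAGE-XML, and when a line doesn't advertise one (or its entries are broken), the crop falls back to the parent image, which is how raw RGB line images can sneak in. A minimal sketch, with the METS path and file group as assumptions:

```python
from ocrd import Resolver
from ocrd_modelfactory import page_from_file

# Assumed METS path and input file group; adjust to the actual workspace.
workspace = Resolver().workspace_from_url("mets.xml")
for input_file in workspace.mets.find_files(fileGrp="OCR-D-SEG-LINE"):
    pcgts = page_from_file(workspace.download_file(input_file))
    page = pcgts.get_Page()
    # Picks the most derived page-level AlternativeImage, if any.
    page_image, page_coords, _ = workspace.image_from_page(
        page, input_file.pageId)
    for region in page.get_TextRegion():
        for line in region.get_TextLine():
            # Picks the line's own AlternativeImage when present; otherwise
            # the crop is taken from the parent image instead.
            line_image, line_coords = workspace.image_from_segment(
                line, page_image, page_coords)
            # Mode "RGB" here would indicate a raw-image fallback.
            print(input_file.pageId, line.id, line_image.mode)
```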