How to find corresponding image from an hocr result? #1

zuphilip · 2017-07-29T07:22:27Z

I found something interesting in this hOCR file: https://digi.bib.uni-mannheim.de/periodika/reichsanzeiger/ocr/film/tesseract-4.0.0-alpha.20170703/012-9419/0580.hocr. I would like to see the corresponding image. How can I find it?

stweil · 2017-07-29T07:42:45Z

Take the microfilm number (012-9419) and the image number (0580) from the hOCR URL and insert them into the viewer URL:

https://digi.bib.uni-mannheim.de/viewer/reichsanzeiger/film/012-9419/0580.jp2 (restricted images)
https://digi.bib.uni-mannheim.de/viewer/reichsanzeiger/scan/012-9419/0580.jp2 (free images)

The correct image link should be offered by the search interface in the future.
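Until then, the mapping described above can be scripted. The following is a minimal sketch, assuming every hOCR URL ends in `<microfilm number>/<image number>.hocr` as in the example; the names `viewer_urls`, `VIEWER_BASE` and `HOCR_PATH` are made up for illustration and are not part of any existing tool.

```python
import re

# Viewer prefix taken from the two example URLs above.
VIEWER_BASE = "https://digi.bib.uni-mannheim.de/viewer/reichsanzeiger"

# Assumption: the hOCR URL ends in ".../<film>/<image>.hocr".
HOCR_PATH = re.compile(r"/(?P<film>[^/]+)/(?P<image>[^/]+)\.hocr$")


def viewer_urls(hocr_url):
    """Return both candidate viewer URLs for a given hOCR URL."""
    match = HOCR_PATH.search(hocr_url)
    if match is None:
        raise ValueError(f"not an hOCR URL: {hocr_url}")
    film, image = match.group("film"), match.group("image")
    return {
        # The URL alone does not tell us which variant applies,
        # so both are returned.
        "restricted": f"{VIEWER_BASE}/film/{film}/{image}.jp2",
        "free": f"{VIEWER_BASE}/scan/{film}/{image}.jp2",
    }
```

Calling `viewer_urls(...)` with the hOCR URL from the question yields the two viewer links listed above; which of them actually resolves depends on whether the image is restricted or free.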

stweil · 2017-07-29T07:56:13Z

Maybe the hOCR could be modified on the fly on the server side whenever a web client requests it:

A program could look up metadata in the database (date, issue, page number) and add it to the HTML response (title tag, time information). It could then add an image link, and perhaps also links to other visualisations (such as hocrjs). The same program could also run OCR post-correction and fix known OCR errors. This approach would keep the original OCR results intact, always deliver the best post-correction available, and save disk space.
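A rough sketch of such an on-the-fly step, in Python, might look like the following. Everything in it is an assumption made for illustration: the function name `enrich_hocr`, its parameters, the use of a `<link rel="alternate">` element for the image, and the simple string-based correction table. The database lookup is left out; the caller is expected to pass in the title and image URL it found. It also assumes the stored hOCR already contains a `<title>` element and a closing `</head>` tag.

```python
import html
import re


def enrich_hocr(hocr, title, image_url, corrections):
    """Return a modified copy of the hOCR; the stored original is not changed."""

    def fix_text(match):
        # Apply every known correction to one run of text between tags.
        text = match.group(0)
        for wrong, right in corrections.items():
            text = text.replace(wrong, right)
        return text

    # Only touch text between a closing ">" and the next "<", so that
    # attributes such as bounding boxes are left alone.
    result = re.sub(r"(?<=>)[^<]+(?=<)", fix_text, hocr)

    # Replace the content of the first <title> element with the metadata.
    new_title = f"<title>{html.escape(title)}</title>"
    result = re.sub(r"<title>.*?</title>", lambda m: new_title,
                    result, count=1, flags=re.S)

    # Add a link to the page image just before the end of the head.
    link = (f'<link rel="alternate" type="image/jp2" '
            f'href="{html.escape(image_url, quote=True)}"/>')
    return result.replace("</head>", link + "\n</head>", 1)
```

Because the function only returns a new string, the hOCR file on disk stays exactly as the OCR engine produced it, and an improved correction table takes effect on the next request without reprocessing anything.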

zuphilip added the question label Jul 29, 2017

stweil added the enhancement label Jul 29, 2017
