A collection of manuals and documentation for training and fine-tuning various OCR engines, created by University Library Mannheim during the 3rd phase of the OCR-D project and funded by the German Research Foundation (DFG).
DOC refers to documentation of various training processes.
MAN refers to step-by-step manuals and user guides.
ENG, DEU, etc. refer to the language of a document.
Manuals and documentation for training the OCR engine Tesseract.
MAN | DEU: Training mit Tesseract und Tesstrain
MAN | ENG: Training with Tesseract and Tesstrain
MAN | ENG: Tesseract CLI training with synthetic ground truth data
DOC | ENG: Tesseract CLI training with ground truth data set Austrian Newspapers
Manuals and documentation for training the OCR engine Kraken.
MAN | ENG: Step-by-Step Guide: Training with eScriptorium
DOC | ENG: Kraken CLI training with ground truth data set Austrian Newspapers
Manuals and documentation for training the OCR engine Calamari.