saw-leipzig/pyocrd

 
 

core

Collection of OCR-related Python tools and wrappers from the OCR-D team

Installation

To bootstrap the tool, you need the following installed (Ubuntu package names in parentheses):

  • Python (python)
  • pip (python-pip)

To install system-wide:

make deps-ubuntu deps install

To develop, install into a virtualenv:

pip install virtualenv
virtualenv venv
source venv/bin/activate
make deps install

Usage

pyocrd installs an ocrd executable that can invoke the processors directly (ocrd process) or start development web services (ocrd server).

TODO: Update docs here.

Examples:

# List available processors
ocrd process

# Region-segment with tesserocr all files in METS INPUT fileGrp
ocrd process -m /path/to/mets.xml segment-region/tesserocr

# Chain multiple processors
ocrd process -m /path/to/mets.xml characterize/exif segment-line/tesserocr recognize/tesserocr

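# Start a processor web service on port 6543
ocrd server process -p 6543

# Call the /characterize endpoint with HTTPie (the http command); url== sends url as a query parameter
http PUT localhost:6543/characterize url==http://server/path/to/mets.xml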
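
The command-line examples above can also be driven from a script. The following is a minimal sketch, not part of pyocrd: the helper names build_process_command and build_service_request are hypothetical, and Python 3.8+ is assumed. It only builds the ocrd process argument list and the PUT request shown above, without running or sending anything.

```python
import shlex
from urllib.parse import urlencode
from urllib.request import Request


def build_process_command(mets_path, processors):
    """Return the argv list for `ocrd process -m METS PROCESSOR...`.

    Hypothetical helper: it mirrors the CLI examples above and does not
    check that the processor names exist.
    """
    return ["ocrd", "process", "-m", mets_path, *processors]


def build_service_request(endpoint, mets_url, host="localhost", port=6543):
    """Build (but do not send) the PUT request from the web service example.

    The METS URL is passed as the `url` query parameter, which is what
    HTTPie's `url==...` syntax does.
    """
    query = urlencode({"url": mets_url})
    return Request(f"http://{host}:{port}/{endpoint}?{query}", method="PUT")


if __name__ == "__main__":
    cmd = build_process_command(
        "/path/to/mets.xml",
        ["characterize/exif", "segment-line/tesserocr", "recognize/tesserocr"],
    )
    # Print the shell-quoted command; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))

    req = build_service_request("characterize", "http://server/path/to/mets.xml")
    # Print the request line; pass `req` to urllib.request.urlopen to send it.
    print(req.get_method(), req.full_url)
```

Building the argument list separately from executing it keeps the sketch safe to run on a machine where ocrd is not installed.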

Testing

Download the test assets first: make assets

Test with local files: make test

Test with local asset server:
  • Start asset-server: make asset-server
  • make test OCRD_BASEURL='http://localhost:5001/'
Test with remote assets:
  • make test OCRD_BASEURL='https://github.com/OCR-D/assets/raw/master/data/'
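
The three test modes differ only in the OCRD_BASEURL value passed to make. As a rough sketch (the function name and the mode labels "local", "asset-server" and "remote" are made up for illustration; the URLs are the ones listed above), a small wrapper could select the right invocation:

```python
# Base URLs taken from the testing instructions above; None means "use local files".
BASE_URLS = {
    "local": None,
    "asset-server": "http://localhost:5001/",
    "remote": "https://github.com/OCR-D/assets/raw/master/data/",
}


def make_test_command(mode="local"):
    """Return the argv list for `make test` in the given (hypothetical) mode."""
    if mode not in BASE_URLS:
        raise ValueError(f"unknown test mode: {mode!r}")
    command = ["make", "test"]
    base_url = BASE_URLS[mode]
    if base_url is not None:
        # make accepts VAR=value arguments as variable overrides.
        command.append(f"OCRD_BASEURL={base_url}")
    return command


if __name__ == "__main__":
    for mode in BASE_URLS:
        print(mode, "->", " ".join(make_test_command(mode)))
```

The "asset-server" mode still requires make asset-server to be running in another terminal before the tests start.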
