This repository has been archived by the owner on Mar 12, 2022. It is now read-only.

Django OCR

dalarm edited this page Sep 26, 2017 · 7 revisions

At the moment, our existing OCR program doesn't run properly: there's an issue with the Gradle build.

Until that's fixed, I've made an extension to the Django app we are currently using.
It uses Tesseract, so it should have the same capabilities as our other program.



Installation

Tesserocr requires fairly recent versions of tesseract-ocr and leptonica. On Ubuntu these can be installed with:

$ apt install tesseract-ocr libtesseract-dev libleptonica-dev

Depending on your environment, you might have to build these packages from source. Follow each project's documentation for instructions. Next, clone the repository:

$ git clone https://github.com/abarto/ocr-with-django.git

Next, you have to install the project's requirements:

(venv) $ pip3 install Cython==0.24.1
(venv) $ pip3 install -r ocr_with_django/requirements.txt

and run the necessary steps to set up the Django site:

(venv) $ cd ocr_with_django/
(venv) $ python manage.py migrate
(venv) $ python manage.py collectstatic --noinput
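
Before starting the site, it can help to confirm that tesserocr actually imports inside the virtualenv, since a missing or outdated tesseract-ocr or leptonica usually shows up as an import error. The following is a minimal sketch; the helper name tesseract_status is made up for illustration, and it relies only on tesserocr's tesseract_version() function.

```python
import importlib


def tesseract_status():
    """Return the version string reported by tesserocr, or None if it cannot be imported.

    tesseract_status is a hypothetical helper name, not part of the project.
    """
    try:
        tesserocr = importlib.import_module("tesserocr")
    except ImportError:
        # Raised when tesserocr is not installed or its native libraries fail to load.
        return None
    return tesserocr.tesseract_version()


if __name__ == "__main__":
    print(tesseract_status() or "tesserocr could not be imported")
```

If this prints the fallback message, revisit the apt and pip steps above before running the Django commands.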

You can look at the original project at https://github.com/abarto/ocr-with-django.

Access the OCR App

The OCR App can be accessed by going to http://ec2-54-173-153-28.compute-1.amazonaws.com:3000/imageocr/

The left box shows the photo that you chose. After you press go, the right box displays the result returned by the Tesseract OCR API. This is only a snippet of what Tesseract can do. We'll have to experiment with it and eventually extract the data that the API gives us. This documentation shows examples of some of the functions that we can use.
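
The same upload can probably be scripted instead of going through the page. The sketch below builds a multipart/form-data request with the standard library only. The field name image matches what the view in documents/views.py reads from request.FILES; everything else is an assumption: the helper names build_multipart and ocr_image are invented here, the URL that accepts the POST is not stated on this page and has to be supplied by the caller, and if CSRF protection is active on that view a token would be needed as well.

```python
import json
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header). Hypothetical helper for illustration.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def ocr_image(url, image_path):
    """POST an image file to `url` and return the 'utf8_text' field of the JSON reply.

    `url` is whatever endpoint the OcrView is routed to (an assumption, not given here).
    """
    with open(image_path, "rb") as handle:
        body, header = build_multipart("image", "upload.png", handle.read())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["utf8_text"]
```

Calling ocr_image with the POST URL and a local file path would return the recognized text as a string, provided the server answers with the JSON shape produced by OcrView.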

Additional Information

Django App Documentation

Appearance

You can modify the appearance of the website in
CC-ing/image-uploader/django_rest_imageupload_backend/documents/templates/documents/
by editing the HTML file, ocr_form.html.

The stylesheet, ocr_form.css, is in
CC-ing/image-uploader/django_rest_imageupload_backend/assets/css/

Tesseract API

You can change what the API outputs by going to
CC-ing/image-uploader/django_rest_imageupload_backend/documents/
Open views.py and, inside class OcrView(View), call whichever functions you want to run on the image.

OcrView

The OCR process is done in the OcrView:

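# documents/views.py

from django.http import JsonResponse
from django.views.generic import View
from PIL import Image, ImageFilter
from tesserocr import PyTessBaseAPI


class OcrView(View):
    def post(self, request, *args, **kwargs):
        # PyTessBaseAPI is a context manager, so the engine is released on exit.
        with PyTessBaseAPI() as api:
            with Image.open(request.FILES['image']) as image:
                # Sharpen the upload before handing it to Tesseract.
                sharpened_image = image.filter(ImageFilter.SHARPEN)
                api.SetImage(sharpened_image)
                utf8_text = api.GetUTF8Text()

        return JsonResponse({'utf8_text': utf8_text})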

We take the uploaded image, process it using a Pillow filter, and pass along the result to the Tesseract OCR API through tesserocr.

We tried to keep the view as simple as possible (no Form, no validation) to focus only on the OCR process. If you read the PyTessBaseAPI docstrings, you'll see that there are tons of things you can do with the image and the recognition result.
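
As one example of going beyond the plain text, the sketch below also collects confidence scores. It is not part of the project: summarize_ocr and summarize_file are hypothetical names, and the only tesserocr calls used are SetImage, GetUTF8Text, MeanTextConf and AllWordConfidences on a PyTessBaseAPI instance. The first helper takes the API object as an argument, so it could be dropped into OcrView.post in place of the bare GetUTF8Text call.

```python
def summarize_ocr(api, image):
    """Run recognition on `image` and return the text plus confidence scores.

    `api` is expected to behave like tesserocr.PyTessBaseAPI; `image` is a
    Pillow image. Hypothetical helper for illustration.
    """
    api.SetImage(image)
    text = api.GetUTF8Text()
    return {
        "utf8_text": text,
        # Average confidence over the recognized text, as reported by Tesseract.
        "mean_confidence": api.MeanTextConf(),
        # One confidence value per recognized word.
        "word_confidences": api.AllWordConfidences(),
    }


def summarize_file(path):
    """Convenience wrapper that opens `path` and runs summarize_ocr on it."""
    # Imported lazily so summarize_ocr stays usable without these packages loaded.
    from PIL import Image
    from tesserocr import PyTessBaseAPI

    with PyTessBaseAPI() as api:
        with Image.open(path) as image:
            return summarize_ocr(api, image)
```

Since the result is a plain dict, it could be passed straight to JsonResponse, and the page could then flag low-confidence words instead of showing the raw text only.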