CSR: Character Sequence Recognition from Images

OCR using Deep Learning

Introduction

Recognizing text in images and videos has numerous applications in domains such as banking, legal, healthcare, and safety [1]. This technique is commonly known as Optical Character Recognition (OCR) [2]. In this project we implement detection and recognition of character sequences from images.

Method

An overview of the implemented method is illustrated in the figure below.

The steps involved in realizing this project are:

Create dataset
Train the model
Testing

Create dataset

We used the monogram word list from the dataset provided by Mythic AI [3] to create the images. The images are generated with the cairocffi library [4]: each monogram word is randomly capitalized, and numbers are appended before and after it. The font size, location, and rotation of the text are also chosen at random. Samples of generated images are given below.
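The text side of this generation step can be sketched as follows. This is a minimal illustration, not the code in ocr_data_gen.ipynb: the function names `make_label` and `make_layout`, the 50% capitalization probability, the digit count, and the layout ranges are all assumptions chosen for the example. Rendering with cairocffi is left out.

```python
import random
import string

def make_label(word, rng, max_digits=2):
    """Randomly capitalize a word and add random digits before and after it."""
    # Assumption: each letter is upper-cased independently with probability 0.5.
    letters = [c.upper() if rng.random() < 0.5 else c.lower() for c in word]
    # Assumption: between 0 and max_digits digits on each side.
    prefix = "".join(rng.choice(string.digits) for _ in range(rng.randint(0, max_digits)))
    suffix = "".join(rng.choice(string.digits) for _ in range(rng.randint(0, max_digits)))
    return prefix + "".join(letters) + suffix

def make_layout(rng):
    """Pick random rendering parameters (hypothetical ranges)."""
    return {
        "font_size": rng.randint(16, 32),
        "x": rng.randint(0, 20),
        "y": rng.randint(0, 20),
        "rotation_deg": rng.uniform(-5.0, 5.0),
    }

rng = random.Random(0)  # fixed seed so the run is repeatable
label = make_label("hello", rng)
layout = make_layout(rng)
print(label, layout)
```

The label string is what the model is trained to predict; the layout values would be passed to whatever drawing code renders the image.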

Train the model

We use the architecture from the OCR example provided in the official Keras GitHub repository [5]. The model structure is given below.

The output of the network has size 32x64. Here 32 is the number of timesteps and 64 is the number of possible classes (26 lowercase letters, 26 uppercase letters, 10 digits, space, and blank). Since the text could be anywhere in the image, we use Connectionist Temporal Classification (CTC) [6] to map the timesteps to characters. There are 32 timesteps, but a word is usually shorter than that, so the same character may be predicted at several consecutive timesteps. CTC uses a blank token to separate characters: repeated characters that are not separated by a blank are merged into one, while repeats separated by a blank are kept [7].
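The class count can be checked with a short sketch. The ordering of the alphabet and the choice of the last index for the blank are assumptions made for illustration; the notebooks may order the classes differently.

```python
import string

# Assumed class order: lowercase, uppercase, digits, space; blank comes last.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + " "
BLANK = len(ALPHABET)            # index reserved for the CTC blank token
NUM_CLASSES = len(ALPHABET) + 1  # 26 + 26 + 10 + 1 (space) + 1 (blank)
TIMESTEPS = 32                   # length of the output sequence

def encode(text):
    """Map a label string to a list of class indices."""
    return [ALPHABET.index(c) for c in text]

print((TIMESTEPS, NUM_CLASSES))  # shape of one network output
print(encode("aA0 "))
```

A label can only be represented if it is no longer than the number of timesteps, and repeated letters need an extra timestep each for the separating blank.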

In the decoding phase, the following actions take place:

Collapse repeating characters that are not separated by a blank. So, as in the example given below, two ‘o’s will get merged because no blank separates them.
Then remove the blank tokens.
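The two decoding steps can be sketched in plain Python. This is a greedy (best-path) decode over a sequence of already chosen class indices, written for illustration; it is not the decoding code used in the notebooks. The alphabet order, the blank index, and the helper `to_path` (which writes the blank as `-` for readability) are assumptions.

```python
from itertools import groupby
import string

# Assumed class order; the blank token takes the last index.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + " "
BLANK = len(ALPHABET)

def ctc_greedy_decode(best_path):
    """Collapse consecutive repeats, then drop blank tokens."""
    collapsed = [idx for idx, _ in groupby(best_path)]  # step 1: merge repeats
    return "".join(ALPHABET[i] for i in collapsed if i != BLANK)  # step 2: remove blanks

def to_path(s):
    """Hypothetical helper: build an index path from text, with '-' as blank."""
    return [BLANK if c == "-" else ALPHABET.index(c) for c in s]

# The two 'o' runs are separated by a blank, so both survive.
print(ctc_greedy_decode(to_path("gg-oo-od")))
# No blank between the 'o's, so they merge into a single 'o'.
print(ctc_greedy_decode(to_path("goood--")))
```

In a full pipeline, `best_path` would come from taking the most likely class at each of the 32 timesteps of the network output.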

Testing

Some sample images and detected characters are given below.

To Run

First, run ocr_data_gen.ipynb to generate the dataset.
Split the dataset into a training set and a test set.
Create the model using ocr_main.ipynb and test it using ocr_result.ipynb.
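The split in the second step is not scripted in the steps above. One possible way to do it is sketched here, assuming the generated images are available as a list of file names; the function `split_dataset`, the 80/20 ratio, and the `img_*.png` names are hypothetical.

```python
import random

def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples and split them into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical file names standing in for the generated images.
files = [f"img_{i}.png" for i in range(10)]
train_files, test_files = split_dataset(files)
print(len(train_files), len(test_files))
```

Shuffling before splitting avoids putting all images of the same word or font size into one of the two sets if the generator wrote them in order.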

References

[1] Singh, Amarjot, Ketan Bacchuwar, and Akshay Bhasin. "A survey of OCR applications." International Journal of Machine Learning and Computing 2.3 (2012): 314.

[2] Mori, Shunji, Hirobumi Nishida, and Hiromitsu Yamada. Optical character recognition. John Wiley & Sons, Inc., 1999.

[3] Mythic-ai.com. (2019). Index of /datasets. [online] Available at: http://www.mythic-ai.com/datasets/ [Accessed 26 Jan. 2019].

[4] Cairocffi.readthedocs.io. (2019). cairocffi — cairocffi 0.9.0 documentation. [online] Available at: https://cairocffi.readthedocs.io/en/stable/ [Accessed 26 Jan. 2019].

[5] GitHub. (2019). keras-team/keras. [online] Available at: https://github.com/keras-team/keras/tree/master/examples [Accessed 26 Jan. 2019].

[6] En.wikipedia.org. (2019). Connectionist temporal classification. [online] Available at: https://en.wikipedia.org/wiki/Connectionist_temporal_classification [Accessed 26 Jan. 2019].

[7] Zhang, C. (2019). Keras model to recognize text with variable length. [online] www.dlology.com. Available at: https://goo.gl/zpe13s [Accessed 26 Jan. 2019].
