-
Notifications
You must be signed in to change notification settings - Fork 5
Label Reader API
- Architecture of the label reader API
- Easy OCR and its drawbacks
- Reading the logs
- Label_regular vs label_specific
- Output of the Python Server in label_regular
To make Optical Character Recognition (OCR), we used the easy OCR module available at the following link: https://github.com/JaidedAI/EasyOCR
EasyOCR is nice to use because the installation is quite simple and it works whether you have GPUs or not. It is very effective to read texts but can still be used to read characters that do not have a meaning (e.g: a label of a rack) even though it is a little bit less precise.
Some of the flaws of easy OCR are that it can only read from left to right/top to bottom and that it sometimes cuts labels into several parts. This was quite an issue for us as we are using regular expressions, so if the label is cut, it cannot be matched.
To solve this issue, we decided to split the regular expression into multiple parts to ensure the matching. e.g: Ro[1-3]-Rack[A-B][0-9]
--> Ro[1-3]
,Rack[A-B][0-9]
When trying to read: Ro3-RackB2
, if easy OCR reads:
Line 1: Ro3
Line 2: -
Line 3: RackB2
We will match the different parts and return [Ro3, RackB2]
.
We may mismatch some parts of the regex if they are only made from 1 or 2 characters but in the end, we catch a lot of the labels that would not have been read otherwise.
Below is an example of a log produced while reading a label
*
label_regular is set to work for any use case by not doing any correction to the text read with esayOCR. Hence, they are less precise but can be adapted to another regular expression. label_specific is including a number of corrections to the text specific to our use case. It is more precise in our case but cannot be adapted to another regular expression in an easy way. Please note that label_specific was the previous version of label_regular and thus has less detailed logs while having messier code.
The current output of the python server is a JSON with the following information:
{
"site": site,
"room": label[0],
"rack": label[1],
}
with site referring to the site provided in the Unity application and label the list containing the text corresponding to each part of the regular expression provided.
In our case, label[0]
refers to the room, and label[1]
refers to the rack.
To adapt this code, one needs to adapt the dictionary built in the ogLabelReader.py
file.