Signex is an open-source signature and stamp recognition tool that uses a YOLOv7-based model for signature detection and EfficientNetV2-S for signature embeddings.
Here you can find our Swagger API description.
Signature & stamp recognition is a valuable tool in various domains, including banking, legal, and security applications. This architecture provides a framework for building a signature recognition system using machine learning algorithms.
To run the signature recognition architecture, the following requirements should be fulfilled:
- Linux or Windows machine (this project has not been tested on macOS)
- GPU and CUDA 11.8 for training
Follow these steps:
-
If you are going to train custom models, install CUDA 11.8
-
Clone the repository:
git clone --depth 1 --recurse-submodules https://github.com/ATMI/Signex.git
-
Navigate to the project:
cd Signex
-
Create Python virtual environment:
python -m venv venv
-
Activate venv:
Linux:
. venv/bin/activate
Windows:
venv\Scripts\activate
-
Install the requirements:
pip install -r requirements.txt
Linux:
git clone --depth 1 --recurse-submodules https://github.com/ATMI/Signex.git
cd Signex
python -m venv venv
. venv/bin/activate
pip install -r requirements.txt
Windows:
git clone --depth 1 --recurse-submodules https://github.com/ATMI/Signex.git
cd Signex
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
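If you plan to train custom models, it can help to confirm that PyTorch sees the GPU before starting. The sketch below assumes that requirements.txt installs torch (the comparison notebook uses it); the function name cuda_status is illustrative and not part of Signex.

```python
def cuda_status():
    """Return (cuda_available, cuda_version), or None if torch is not installed."""
    try:
        import torch  # assumption: installed via requirements.txt
    except ImportError:
        return None
    # torch.version.cuda is the CUDA version PyTorch was built with
    # (None for CPU-only builds).
    return torch.cuda.is_available(), torch.version.cuda


if __name__ == "__main__":
    print(cuda_status())
```

For training, the first value should be True; the second shows which CUDA version the installed PyTorch build targets.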
To run the trained detection network, execute the following commands:
cd detection
python ../yolov7/detect.py --weights weights/best.pt --conf 0.5 --img-size 640 --source images_dir
For now, the comparison model is only available in comparison/main.ipynb. You can test the existing model by running the following cell:
MODEL = torch.load("weights/model.pt")
TEST_DATASET = TriDataset(TEST_PATH, transform=TRANSFORM, display=True)
test(MODEL, DEVICE, TEST_DATASET)
Do not forget to run the preceding cells first, except for the training one:
MODEL = train(DEVICE)
torch.save(MODEL, "model.pt")
- cfg - neural network configurations folder
- data - dataset configurations folder
- hyp - hyperparameters for training neural network, such as learning rate, augmentation strategies, etc.
To train your custom model:
-
Collect images for the dataset
-
Convert all images to .jpg format and check their validity
-
Label the data; we recommend using YOLO Label. This project uses the standard YOLO label format:
<class_id> <cx> <cy> <w> <h>
Where:
class_id - class of the marked object, ∈ [0, number of classes - 1]
cx - x coordinate of the bounding box center, ∈ [0, 1]
cy - y coordinate of the bounding box center, ∈ [0, 1]
w - width of the bounding box, ∈ [0, 1]
h - height of the bounding box, ∈ [0, 1]
Note: cx, cy, w, and h are values relative to the corresponding image dimensions
-
Put or symlink all images and labels in the dataset/images and dataset/labels folders. Each label file name should correspond to the image file:
image_1.jpg <-> image_1.txt
ball.jpg <-> ball.txt
-
Set the number of classes in cfg/net.yaml:
nc: 2 # number of classes
-
Create the dataset/train.lst and dataset/test.lst files, which contain the paths to the training and testing images. You can use the shufflels tool to create them automatically:
- Build shufflels:
g++ shufflels.cpp -o shufflels
- Run shufflels:
cd dataset
./shufflels images jpg 80
-
You can specify custom train.lst and test.lst paths in the data/data.yaml file:
train: dataset/list.lst # path to images list used for training
val: dataset/list.lst # path to images list used for testing
-
Specify the number and names of the classes in the data/data.yaml file:
nc: 2 # number of classes in the dataset
names: ['signature', 'stamp'] # names of the classes
-
Optionally you can modify hyperparameters in hyp/hyp.net.yaml
-
Start training
cd detection
python ../yolov7/train.py --workers 8 --device 0 --batch-size 64 --data data/data.yaml --img 640 640 --cfg cfg/net.yaml --weights weights/best.pt --name net --hyp hyp/hyp.net.yaml
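The label format from the labeling step can be produced and checked programmatically. The sketch below is a minimal illustration: to_yolo_label and is_valid_label are hypothetical helpers, not part of Signex or YOLOv7, and the pixel box layout (x_min, y_min, x_max, y_max) is an assumption about your annotation source.

```python
def to_yolo_label(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    # Center and size are normalized by the image dimensions.
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def is_valid_label(line, num_classes):
    """Check that a label line has five fields with values in the allowed ranges."""
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        class_id = int(parts[0])
        values = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    return 0 <= class_id < num_classes and all(0.0 <= v <= 1.0 for v in values)


if __name__ == "__main__":
    # A 200x200 px box in a 640x480 image, labeled as class 0.
    line = to_yolo_label(0, (100, 200, 300, 400), (640, 480))
    print(line)  # 0 0.312500 0.625000 0.312500 0.416667
    print(is_valid_label(line, num_classes=2))  # True
```

With the names list from data/data.yaml, class 0 corresponds to signature and class 1 to stamp.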
To train and test the comparison model, run comparison/main.ipynb. The training dataset should be placed under comparison/dataset/train, where each sub-folder contains different variants of the same signature. The testing data should be placed under comparison/dataset/test:
dataset/
├── train/
│   ├── 1/
│   │   ├── variant_1.jpg
│   │   └── variant_2.jpg
│   └── 2/
│       ├── variant_1.jpg
│       └── variant_2.jpg
└── test/
    ├── 3/
    │   ├── variant_1.jpg
    │   └── variant_2.jpg
    └── 4/
        ├── variant_1.jpg
        └── variant_2.jpg
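A one-folder-per-signature layout maps naturally onto triplet sampling, which the name TriDataset suggests but this README does not confirm. The sketch below is an illustration under that assumption, not the notebook's code: list_signatures and sample_triplet are hypothetical helpers that only work with file paths and do not load images.

```python
import random
from pathlib import Path


def list_signatures(split_dir):
    """Map each sub-folder name (one signature) to its sorted .jpg paths."""
    groups = {}
    for folder in sorted(Path(split_dir).iterdir()):
        if folder.is_dir():
            images = sorted(folder.glob("*.jpg"))
            if images:
                groups[folder.name] = images
    return groups


def sample_triplet(groups, rng=random):
    """Pick (anchor, positive, negative): two variants of one signature
    and one variant of a different signature."""
    eligible = [name for name, imgs in groups.items() if len(imgs) >= 2]
    if not eligible or len(groups) < 2:
        raise ValueError("need at least two folders, one with two or more images")
    same = rng.choice(eligible)
    other = rng.choice([name for name in groups if name != same])
    anchor, positive = rng.sample(groups[same], 2)
    negative = rng.choice(groups[other])
    return anchor, positive, negative


if __name__ == "__main__":
    groups = list_signatures("comparison/dataset/train")
    print(sample_triplet(groups))
```

Passing a seeded random.Random instance as rng makes the sampling reproducible.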
To test the trained model, run:
cd detection
python ../yolov7/test.py --weights weights/best.pt --img-size 640 --data data/data.yaml
With the latest model, we obtained the following results (none of the evaluation images were included in the training dataset):
Class | Images | Labels | Precision | Recall | mAP@.5 | mAP@.5:.95 |
---|---|---|---|---|---|---|
all | 696 | 1393 | 0.969 | 0.926 | 0.966 | 0.676 |
signature | 696 | 935 | 0.953 | 0.92 | 0.965 | 0.56 |
stamp | 696 | 458 | 0.985 | 0.932 | 0.966 | 0.791 |
Confusion matrix:
Currently, see the Training section
To start an API, you need to run:
python api.py
By default, it listens on port 8080 and loads the detection/weights/best.pt weights for the detector.
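A quick way to confirm that the API process is running is to check that the port accepts connections. The sketch below assumes the API runs on localhost with the default port; is_api_up is a hypothetical helper, and it only tests TCP reachability, not any specific endpoint (those are listed in the Swagger description).

```python
import socket


def is_api_up(host="localhost", port=8080, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    print("API reachable:", is_api_up())
```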
We welcome contributions to enhance the signature recognition architecture. If you would like to contribute, please follow these steps:
-
Fork the repository on GitLab.
-
Create a new branch with the name feature/feature_name for your feature or bug fix.
-
Implement your changes or additions.
-
Commit and push your changes to your forked repository.
-
Submit a merge request, clearly describing the changes you have made.
Signex is still under development, and the following tasks remain:
- Development of an in-stamp signature extraction model. Our model is also trained to detect stamps for potential further processing. We could look for signatures inside stamps to improve signature detection accuracy, as the current accuracy may seem relatively low.
- Further comparison model training and API method implementation.
Signex is licensed under the WTFPL.