We designed a facial recognition system using machine learning and computer vision techniques. This work was developed, with major contributions from Igor M. Quintanilha, during the CPE775 - Machine Learning course taught at the Federal University of Rio de Janeiro, Brazil.
The system is deep-learning-based and is composed of five main steps: face segmentation, facial feature detection, face alignment, embedding, and classification. Deep learning methods are employed for fiducial point extraction and embedding. For classification, we use a Support Vector Machine (SVM), since it is fast for both training and inference. The system achieves an error rate of 0.12103 for facial feature detection and 0.05 for facial recognition. Moreover, it is capable of running in real time using a webcam.
This work uses a Histogram of Oriented Gradients (HOG) based method, Max-Margin Object Detection (MMOD), implemented in the dlib library. The segmented face is passed to the facial feature (landmark) extraction step.
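For reference, this is roughly how the detection step can be run with dlib's HOG-based frontal face detector; the image path and upsampling factor below are illustrative assumptions, not values from this project:

```python
import dlib

detector = dlib.get_frontal_face_detector()  # HOG + SVM detector trained with MMOD
image = dlib.load_rgb_image('data/pics/John/photo_01.jpg')  # hypothetical path

# The second argument upsamples the image once, which helps find smaller faces.
faces = detector(image, 1)
for rect in faces:
    print('Face found at', rect.left(), rect.top(), rect.right(), rect.bottom())
```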
For landmark extraction, we use the ResNet-18 architecture, since it is employed in many state-of-the-art computer vision algorithms due to its simplicity and high generalization capability.
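As an illustrative sketch, assuming PyTorch/torchvision and the common 68-point landmark scheme (the regression head below is our assumption, not necessarily the exact model used here), such a landmark regressor could be set up like this:

```python
import torch.nn as nn
from torchvision import models

# ResNet-18 backbone with its classification head replaced by a
# regression layer outputting (x, y) coordinates for 68 landmarks.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 68 * 2)
```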
We apply an affine transformation to align the faces detected in the image so that the nose, eyes, and mouth are aligned with the center of the image as well as possible.
To do so, we use two functions from the OpenCV library: `getAffineTransform`, which returns the rotation and translation needed to map the original points to the desired ones (an average mask computed from the points in the training set); and `warpAffine`, which applies the transformation and also scales the resulting image.
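A sketch of this alignment step; the choice of three correspondences (e.g., both eye centers and the nose tip) and the output size are assumptions for illustration:

```python
import cv2
import numpy as np

def align_face(image, landmarks, template, size=(160, 160)):
    """Warp `image` so that three chosen landmarks match the template.

    `landmarks` and `template` are hypothetical (3, 2) float32 arrays:
    the detected points and the average mask from the training set.
    """
    # getAffineTransform expects exactly three source/destination points.
    M = cv2.getAffineTransform(np.float32(landmarks), np.float32(template))
    # warpAffine applies the rotation/translation/scaling to the image.
    return cv2.warpAffine(image, M, size)
```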
A network is trained to minimize the so-called triplet loss: at each iteration, the network is fed three images, two distinct images of the same person (the anchor and the positive) and an image of a different person (the negative).
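A minimal sketch of this loss in PyTorch, assuming squared Euclidean distance between embeddings and an illustrative margin value:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Squared Euclidean distances between embedding batches.
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    # Penalize triplets where the positive is not closer to the anchor
    # than the negative by at least `margin`.
    return F.relu(d_pos - d_neg + margin).mean()
```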
The last step is greatly simplified: an SVM classifies each embedding vector as belonging to a person or not. The SVM was chosen since it is fast for both training and inference.
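As an illustration of this step with scikit-learn (the embedding dimension, kernel choice, and dummy data below are assumptions):

```python
import numpy as np
from sklearn.svm import SVC

# Dummy stand-ins for the real data: 128-D embeddings for two people.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 128))  # from the embedding network
names = ['John'] * 10 + ['Mary'] * 10    # folder names in data/pics/

clf = SVC(kernel='linear', probability=True)  # kernel choice is an assumption
clf.fit(embeddings, names)

# At inference time, classify a new face embedding.
print(clf.predict(embeddings[:1]))
```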
## Requirements

- Linux (might work with other operating systems, but this was not tested);
- Conda: package, dependency, and environment management;
- Packages listed in `requirements.yml`.
## Installation

- Open the terminal;
- Download the source code by running:

  ```sh
  git clone https://github.com/wesleylp/CPE775.git
  ```

- Go inside the folder by typing:

  ```sh
  cd CPE775
  ```

- Create the `face` virtual environment by running:

  ```sh
  conda env create -f requirements.yml
  ```

- Activate the environment by typing:

  ```sh
  source activate face
  ```

  To deactivate the environment, type `source deactivate`. NOTE: remember to always activate the environment before running the code;

- Create the data folder:

  ```sh
  mkdir -p data/pics
  ```

- Download the pre-trained models by typing:

  ```sh
  wget https://github.com/wesleylp/CPE775/releases/download/v1.0/models.tgz
  ```

  and extract them by running:

  ```sh
  tar -xvzf models.tgz
  ```

- Follow the instructions in the `notebooks` folder to better understand the pipeline, or go to the Usage section below to see it working!
## Usage

To use the real-time application:

- For each user, place a folder (named after the individual) containing at least 10 images of only that person in `/data/pics/`. Example: if you want to recognize John and Mary, place a folder named `John` with at least 10 photos of him and another folder named `Mary` with at least 10 photos of her in `/data/pics/`;
- Train the SVM model by executing:

  ```sh
  python register.py
  ```

- Run the application:

  ```sh
  python webcam.py
  ```
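For reference, a minimal sketch of what the real-time loop in an application like `webcam.py` could look like, using OpenCV for capture and dlib for detection; the loop below only draws detections and illustrates where the full pipeline (landmarks, alignment, embedding, SVM) would plug in, not the project's actual implementation:

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()  # HOG-based face detector
cap = cv2.VideoCapture(0)                    # open the default webcam

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # The full pipeline would extract landmarks, align, embed, and
    # classify each detection with the trained SVM; here we only draw it.
    for rect in detector(gray):
        cv2.rectangle(frame, (rect.left(), rect.top()),
                      (rect.right(), rect.bottom()), (0, 255, 0), 2)
    cv2.imshow('face recognition', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):    # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```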
## Changelog

- 21-Dec-2017: Launch (class presentation)
- 21-May-2018: Improved README.md
- 23-Sep-2018: Trick to avoid a segmentation fault when importing matplotlib and scikit-image
- 20-Oct-2018: Solved import issue
## Acknowledgments

- Igor, for being the major contributor to this project;
- All authors whose works were used as references.