seetaface

shelvsky · Sep 13, 2016 · commit caa0323
Showing 249 changed files with 10,045 additions and 0 deletions.
diff --git a/FaceAlignment/README.md b/FaceAlignment/README.md
@@ -0,0 +1,72 @@
+## SeetaFace Alignment
+
+[![License](https://img.shields.io/badge/license-BSD-blue.svg)](../LICENSE)
+
+### Description
+Rather than applying a deep network in a straightforward way, SeetaFace Alignment implements the Coarse-to-Fine Auto-encoder
+Networks (CFAN) approach, which cascades several Stacked Auto-encoder Networks (SANs) to progressively refine the estimated locations of the facial landmarks. The algorithm details can be found in our ECCV 2014 paper [CFAN](#citation). The released SeetaFace Alignment is trained on more than 23,000 images and accurately detects five facial landmarks: two eye centers, the nose tip and two mouth corners. Note that this implementation differs slightly from the one described in the paper: only two stages are cascaded, for higher speed (more than 200 fps on an Intel i7 desktop CPU).
+
+SeetaFace Alignment runs on the CPU and does not depend on any third-party libraries. It has so far been tested only on Windows, but it does not include any Windows-specific headers. Versions for more platforms, e.g., Linux, will be released in the future. The source code is released under the BSD 2-Clause license (see [LICENSE](../LICENSE)), so it can be used freely for both academic purposes and industrial products.
+
+### Performance Evaluation
+
+To evaluate the performance of SeetaFace Alignment, experiments are conducted on [AFLW](http://lrs.icg.tugraz.at/research/aflw/), following the protocol published in [3]. The mean alignment errors, normalized by the inter-ocular distance, are shown in the following figure. As the figure shows, SeetaFace Alignment achieves better accuracy than the compared methods.
+
+![aflw_nrmse](./doc/aflw_nrmse.png)
+
+Here LE, RE, N, LM and RM denote the left eye center, the right eye center, the nose tip, the left mouth corner and the right mouth corner, respectively.
+
+> [1] Xuehan Xiong, Fernando De la Torre. Supervised descent method and its applications to face alignment. CVPR 2013
+
+> [2] Yi Sun, Xiaogang Wang, Xiaoou Tang. Deep Convolutional Network Cascade for Facial Point Detection. CVPR 2013
+
+> [3] Zhanpeng Zhang, Ping Luo, Chen Change Loy, Xiaoou Tang. Facial Landmark Detection by Deep Multi-task Learning. ECCV 2014
+
+As for speed, it takes about 5 milliseconds per face to predict the 5 facial points, given a face bounding box reported by SeetaFace Detector, running on a single Intel 3.4GHz i7-3770 CPU with no parallel computing.
+
+### Build Shared Lib with Visual Studio
+
+1. Create a dll project: New Project -> Visual C++ -> Win32 Console Application -> DLL.
+2. *(Optional) Create and switch to x64 platform.*
+3. Add [header files](./include): all `*.h` files in `include`.
+4. Add [source files](./src): all `*.cpp` files in `src` except for those in `src/test`.
+5. Define `SEETA_EXPORTS` macro: (Project) Properties -> Configuration Properties -> C/C++ -> Preprocessor -> Preprocessor Definitions.
+6. Build.
+
+### How to run SeetaFace Alignment
+
+This version detects five facial landmarks: two eye centers, the nose tip and two mouth corners.
+To detect these landmarks, first instantiate a `seeta::FaceAlignment` object with the path of the model file.
+
+```c++
+seeta::FaceAlignment landmark_detector("seeta_fa_v1.0.dat");
+```
+
+Then call `PointDetectLandmarks(ImageData gray_im, FaceInfo face_info, FacialLandmark *points)` to detect the landmarks.
+
+```c++
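+seeta::ImageData image_data(width, height);  // dimensions of the grayscale image
+image_data.data = image_data_buf;            // 8-bit grayscale pixels; the struct only stores the pointer
+image_data.num_channels = 1;                 // single channel (grayscale)
+seeta::FaceInfo face_bbox;                   // fill in from the face detector's result
+seeta::FacialLandmark points[5];             // receives the five landmarks
+landmark_detector.PointDetectLandmarks(image_data, face_bbox, points);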
+```
+
+Here **image_data** is the input grayscale image and **face_bbox** is the face bounding box detected by [SeetaFace Detection](https://github.com/seetaface/SeetaFaceEngine/tree/master/FaceDetection).
+The detected landmarks are returned in **points**. An example can be found in [face_alignment_test.cpp](./src/test/face_alignment_test.cpp).
+
+### Citation
+
+If you use the code in your work, please consider citing our work as follows:
+
+    @inproceedings{zhang2014coarse,
+    title={Coarse-to-fine auto-encoder networks (cfan) for real-time face alignment},
+    author={Zhang, Jie and Shan, Shiguang and Kan, Meina and Chen, Xilin},
+    booktitle={European Conference on Computer Vision},
+    year={2014},
+    organization={Springer}}
+
+### License
+
+SeetaFace Alignment is released under the [BSD 2-Clause license](../LICENSE).
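
Note on the README's evaluation metric: the "mean alignment errors normalized by the inter-ocular distance" can be illustrated with a small helper. This is only a minimal sketch, not the evaluation code behind the figure. `Point2D` and `NormalizedErrors` are hypothetical names introduced here, and both the point order (LE, RE, N, LM, RM) and the use of the ground-truth eye centers for normalization are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper type for this sketch; it mirrors seeta::FacialLandmark.
struct Point2D {
  double x;
  double y;
};

// Euclidean distance between two points.
inline double Distance(const Point2D& a, const Point2D& b) {
  return std::hypot(a.x - b.x, a.y - b.y);
}

// Per-landmark error divided by the ground-truth inter-ocular distance.
// Assumed point order (not confirmed by the source): LE, RE, N, LM, RM,
// so truth[0] and truth[1] are taken to be the two eye centers.
inline void NormalizedErrors(const Point2D* predicted, const Point2D* truth,
                             std::size_t num_points, double* errors) {
  assert(predicted != nullptr && truth != nullptr && errors != nullptr);
  assert(num_points >= 2);
  const double inter_ocular = Distance(truth[0], truth[1]);
  assert(inter_ocular > 0.0);
  for (std::size_t i = 0; i < num_points; ++i) {
    errors[i] = Distance(predicted[i], truth[i]) / inter_ocular;
  }
}
```

Averaging `errors[i]` over all test faces would give one value per landmark, which is the kind of quantity a per-landmark plot like the one in the README reports.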
diff --git a/FaceAlignment/doc/aflw_nrmse.png b/FaceAlignment/doc/aflw_nrmse.png
diff --git a/FaceAlignment/include/cfan.h b/FaceAlignment/include/cfan.h
@@ -0,0 +1,118 @@
+/*
+ *
+ * This file is part of the open-source SeetaFace engine, which includes three modules:
+ * SeetaFace Detection, SeetaFace Alignment, and SeetaFace Identification.
+ *
+ * This file is part of the SeetaFace Alignment module, containing codes implementing the
+ * facial landmarks location method described in the following paper:
+ *
+ *
+ *   Coarse-to-Fine Auto-Encoder Networks (CFAN) for Real-Time Face Alignment, 
+ *   Jie Zhang, Shiguang Shan, Meina Kan, Xilin Chen. In Proceeding of the
+ *   European Conference on Computer Vision (ECCV), 2014
+ *
+ *
+ * Copyright (C) 2016, Visual Information Processing and Learning (VIPL) group,
+ * Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
+ *
+ * The codes are mainly developed by Jie Zhang (a Ph.D supervised by Prof. Shiguang Shan)
+ *
+ * As an open-source face recognition engine: you can redistribute SeetaFace source codes
+ * and/or modify it under the terms of the BSD 2-Clause License.
+ *
+ * You should have received a copy of the BSD 2-Clause License along with the software.
+ * If not, see < https://opensource.org/licenses/BSD-2-Clause>.
+ *
+ * Contact Info: you can send an email to [email protected] for any problems.
+ *
+ * Note: the above information must be kept whenever or wherever the codes are used.
+ *
+ */
+
+#pragma once
+#include "math.h"
+#include "SIFT.h"
+#include "common.h"
+
+class CCFAN{
+ public:
+  /** A constructor.
+   *  Initialize basic parameters.
+   */
+  CCFAN(void);
+
+  /** A destructor which should never be called explicitly.
+   *  Release all dynamically allocated resources.
+   */
+  ~CCFAN(void);
+
+  /** Initialize the facial landmark detection model.
+    *  @param model_path Path of the model file, either absolute or relative to
+    *                   the working directory.
+    */
+  void InitModel(const char *model_path);
+
+  /** Detect five facial landmarks, i.e., two eye centers, nose tip and two mouth corners.
+    *  @param gray_im A grayscale image
+    *  @param im_width The width of the input image
+    *  @param im_height The height of the input image
+    *  @param face_loc The face bounding box
+    *  @param[out] facial_loc The locations of detected facial points
+    */
+  void FacialPointLocate(const unsigned char *gray_im, int im_width, int im_height, seeta::FaceInfo face_loc, float *facial_loc);
+
+ private:
+  /** Extract shape indexed SIFT features.
+    *  @param gray_im A grayscale image
+    *  @param im_width The width of the input image
+    *  @param im_height The height of the input image
+    *  @param face_shape The locations of facial points
+    *  @param patch_size The size of the patch used for extracting SIFT feature
+    *  @param[out] sift_fea The extracted shape indexed SIFT features, concatenated into a vector
+    */
+  void TtSift(const unsigned char *gray_im, int im_width, int im_height, float *face_shape, int patch_size, double *sift_fea);
+
+  /** Extract an image patch which is centered at point(point_x, point_y) with a given patch size.
+  *  @param gray_im A grayscale image
+  *  @param im_width The width of the input image
+  *  @param im_height The height of the input image
+  *  @param point_x The X coordinate of one point
+  *  @param point_y The Y coordinate of one point
+  *  @param patch_size The size of the extracted patch
+  *  @param[out] sub_img A grayscale image patch
+  */
+  void GetSubImg(const unsigned char *gray_im, int im_width, int im_height, float point_x, float point_y, int patch_size, BYTE *sub_img);
+
+  /** Resize the image by bilinear interpolation.
+    *  @param src_im A source image in grayscale
+    *  @param src_width The width of the source image
+    *  @param src_height The height of the source image
+    *  @param[out] dst_im The target image in grayscale
+    *  @param dst_width The width of the target image
+    *  @param dst_height The height of the target image
+    */
+  bool ResizeImage(const unsigned char *src_im, int src_width, int src_height,
+    unsigned char* dst_im, int dst_width, int dst_height);
+
+ private:
+  /*The number of facial points*/
+  int pts_num_;
+  /*The dimension of the shape indexed features*/
+  int fea_dim_;
+  /*The mean face shape containing five landmarks*/
+  float *mean_shape_;
+
+  /*The parameters of the first local stacked autoencoder network*/
+  float **lan1_w_;
+  float **lan1_b_;
+  int *lan1_structure_;
+  int lan1_size_;
+
+  /*The parameters of the second local stacked autoencoder network*/
+  float **lan2_w_;
+  float **lan2_b_;
+  int *lan2_structure_;
+  int lan2_size_;
+
+};
+
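
Note on `ResizeImage`: the header documents it as a bilinear resize but the implementation is not part of this file. The sketch below shows one common way to write such a resize for 8-bit grayscale images. `ResizeBilinear` is a hypothetical free function, and the pixel-center sampling convention and edge clamping are assumptions, not necessarily what `CCFAN::ResizeImage` does.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a bilinear resize for row-major 8-bit grayscale
// images. Returns false on invalid arguments.
inline bool ResizeBilinear(const std::uint8_t* src, int src_w, int src_h,
                           std::uint8_t* dst, int dst_w, int dst_h) {
  if (src == nullptr || dst == nullptr || src_w <= 0 || src_h <= 0 ||
      dst_w <= 0 || dst_h <= 0) {
    return false;
  }
  const double scale_x = static_cast<double>(src_w) / dst_w;
  const double scale_y = static_cast<double>(src_h) / dst_h;
  for (int y = 0; y < dst_h; ++y) {
    // Map the destination pixel center into source coordinates and clamp.
    double fy = (y + 0.5) * scale_y - 0.5;
    fy = std::min(std::max(fy, 0.0), static_cast<double>(src_h - 1));
    const int y0 = static_cast<int>(fy);
    const int y1 = std::min(y0 + 1, src_h - 1);
    const double wy = fy - y0;
    assert(y0 >= 0 && y1 < src_h);
    for (int x = 0; x < dst_w; ++x) {
      double fx = (x + 0.5) * scale_x - 0.5;
      fx = std::min(std::max(fx, 0.0), static_cast<double>(src_w - 1));
      const int x0 = static_cast<int>(fx);
      const int x1 = std::min(x0 + 1, src_w - 1);
      const double wx = fx - x0;
      // Interpolate along x on the two neighboring rows, then along y.
      const double top = src[y0 * src_w + x0] * (1.0 - wx) + src[y0 * src_w + x1] * wx;
      const double bottom = src[y1 * src_w + x0] * (1.0 - wx) + src[y1 * src_w + x1] * wx;
      const double value = top * (1.0 - wy) + bottom * wy;
      dst[y * dst_w + x] = static_cast<std::uint8_t>(value + 0.5);  // round to nearest
    }
  }
  return true;
}
```

The caller allocates `dst` with at least `dst_w * dst_h` bytes, matching the pointer-plus-dimensions style of the declarations above.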
diff --git a/FaceAlignment/include/common.h b/FaceAlignment/include/common.h
@@ -0,0 +1,106 @@
+/*
+ *
+ * This file is part of the open-source SeetaFace engine, which includes three modules:
+ * SeetaFace Detection, SeetaFace Alignment, and SeetaFace Identification.
+ *
+ * This file is part of the SeetaFace Alignment module, containing codes implementing the
+ * facial landmarks location method described in the following paper:
+ *
+ *
+ *   Coarse-to-Fine Auto-Encoder Networks (CFAN) for Real-Time Face Alignment, 
+ *   Jie Zhang, Shiguang Shan, Meina Kan, Xilin Chen. In Proceeding of the
+ *   European Conference on Computer Vision (ECCV), 2014
+ *
+ *
+ * Copyright (C) 2016, Visual Information Processing and Learning (VIPL) group,
+ * Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
+ *
+ * The codes are mainly developed by Jie Zhang (a Ph.D supervised by Prof. Shiguang Shan)
+ *
+ * As an open-source face recognition engine: you can redistribute SeetaFace source codes
+ * and/or modify it under the terms of the BSD 2-Clause License.
+ *
+ * You should have received a copy of the BSD 2-Clause License along with the software.
+ * If not, see < https://opensource.org/licenses/BSD-2-Clause>.
+ *
+ * Contact Info: you can send an email to [email protected] for any problems.
+ *
+ * Note: the above information must be kept whenever or wherever the codes are used.
+ *
+ */
+
+#ifndef SEETA_COMMON_H_
+#define SEETA_COMMON_H_
+
+#include <cstdint>
+
+#if defined (_MSC_VER) || defined (_WIN32) || defined (_WIN64)
+  #ifdef SEETA_EXPORTS
+    #define  SEETA_API __declspec(dllexport)
+  #else
+    #define  SEETA_API __declspec(dllimport)
+  #endif // SEETA_EXPORTS
+#else // defined (windows)
+ #define SEETA_API
+#endif
+
+#define DISABLE_COPY_AND_ASSIGN(classname) \
+ private: \
+  classname(const classname&); \
+  classname& operator=(const classname&)
+
+#ifdef USE_OPENMP
+#include <omp.h>
+
+#define SEETA_NUM_THREADS 4
+#endif
+
+namespace seeta {
+
+  typedef struct ImageData {
+    ImageData() {
+      data = nullptr;
+      width = 0;
+      height = 0;
+      num_channels = 0;
+    }
+
+    ImageData(int32_t img_width, int32_t img_height,
+      int32_t img_num_channels = 1) {
+      data = nullptr;
+      width = img_width;
+      height = img_height;
+      num_channels = img_num_channels;
+    }
+
+    uint8_t* data;
+    int32_t width;
+    int32_t height;
+    int32_t num_channels;
+  } ImageData;
+
+  typedef struct Rect {
+    int32_t x;
+    int32_t y;
+    int32_t width;
+    int32_t height;
+  } Rect;
+
+  typedef struct FaceInfo {
+    seeta::Rect bbox;
+
+    double roll;
+    double pitch;
+    double yaw;
+
+    double score; /**< Larger score should mean higher confidence. */
+  } FaceInfo;
+
+  typedef struct {
+    double x;
+    double y;
+  } FacialLandmark;
+}  // namespace seeta
+
+#endif  // SEETA_COMMON_H_
+
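
Note on `seeta::ImageData`: the struct holds a raw `uint8_t*` and has no destructor, so it appears to only reference pixels owned elsewhere. The sketch below prepares a grayscale buffer from interleaved RGB and wraps it. It is an illustration under stated assumptions: `sketch::ImageData` is a local mirror of the struct so the example compiles on its own, `RgbToGray` and `WrapGray` are hypothetical helpers, and the BT.601 luma weights are a common choice rather than something this commit prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Local mirror of seeta::ImageData so this example is self-contained.
struct ImageData {
  std::uint8_t* data = nullptr;
  std::int32_t width = 0;
  std::int32_t height = 0;
  std::int32_t num_channels = 0;
};

// Convert interleaved 8-bit RGB to grayscale with BT.601 luma weights.
inline std::vector<std::uint8_t> RgbToGray(const std::uint8_t* rgb,
                                           int width, int height) {
  assert(rgb != nullptr && width > 0 && height > 0);
  std::vector<std::uint8_t> gray(static_cast<std::size_t>(width) * height);
  for (std::size_t i = 0; i < gray.size(); ++i) {
    const int r = rgb[3 * i];
    const int g = rgb[3 * i + 1];
    const int b = rgb[3 * i + 2];
    // Fixed-point weights sum to 1000; adding 500 rounds to nearest.
    gray[i] = static_cast<std::uint8_t>((299 * r + 587 * g + 114 * b + 500) / 1000);
  }
  return gray;
}

// Wrap a caller-owned grayscale buffer. The returned struct stores only the
// pointer, so `gray` must outlive every use of the result.
inline ImageData WrapGray(std::vector<std::uint8_t>& gray, int width, int height) {
  assert(width > 0 && height > 0);
  assert(gray.size() == static_cast<std::size_t>(width) * height);
  ImageData image;
  image.data = gray.data();
  image.width = width;
  image.height = height;
  image.num_channels = 1;
  return image;
}

}  // namespace sketch
```

With the real header, the same fields would be set on a `seeta::ImageData` before passing it to the landmark detector.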
diff --git a/FaceAlignment/include/face_alignment.h b/FaceAlignment/include/face_alignment.h
@@ -0,0 +1,67 @@
+/*
+ *
+ * This file is part of the open-source SeetaFace engine, which includes three modules:
+ * SeetaFace Detection, SeetaFace Alignment, and SeetaFace Identification.
+ *
+ * This file is part of the SeetaFace Alignment module, containing codes implementing the
+ * facial landmarks location method described in the following paper:
+ *
+ *
+ *   Coarse-to-Fine Auto-Encoder Networks (CFAN) for Real-Time Face Alignment, 
+ *   Jie Zhang, Shiguang Shan, Meina Kan, Xilin Chen. In Proceeding of the
+ *   European Conference on Computer Vision (ECCV), 2014
+ *
+ *
+ * Copyright (C) 2016, Visual Information Processing and Learning (VIPL) group,
+ * Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
+ *
+ * The codes are mainly developed by Jie Zhang (a Ph.D supervised by Prof. Shiguang Shan)
+ *
+ * As an open-source face recognition engine: you can redistribute SeetaFace source codes
+ * and/or modify it under the terms of the BSD 2-Clause License.
+ *
+ * You should have received a copy of the BSD 2-Clause License along with the software.
+ * If not, see < https://opensource.org/licenses/BSD-2-Clause>.
+ *
+ * Contact Info: you can send an email to [email protected] for any problems.
+ *
+ * Note: the above information must be kept whenever or wherever the codes are used.
+ *
+ */
+
+#ifndef SEETA_FACE_ALIGNMENT_H_
+#define SEETA_FACE_ALIGNMENT_H_
+
+#include "common.h"
+class CCFAN;
+
+namespace seeta {
+class FaceAlignment{
+ public:
+  /** A constructor with an optional argument specifying path of the model file.
+  *  If called with no argument, the model file is assumed to be stored in the
+  *  working directory as "seeta_fa_v1.0.bin".
+  *
+  *  @param model_path Path of the model file, either absolute or relative to
+  *  the working directory.
+  */
+  SEETA_API FaceAlignment(const char* model_path = nullptr);
+
+  /** A Destructor which should never be called explicitly.
+  *  Release all dynamically allocated resources.
+  */
+  SEETA_API ~FaceAlignment();
+
+  /** Detect five facial landmarks, i.e., two eye centers, nose tip and two mouth corners.
+  *  @param gray_im A grayscale image
+  *  @param face_info The face bounding box
+  *  @param[out] points The locations of detected facial points
+  */
+  SEETA_API bool PointDetectLandmarks(ImageData gray_im, FaceInfo face_info, FacialLandmark *points);
+
+ private:
+  CCFAN *facial_detector;
+};
+}  // namespace seeta
+
+#endif
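
Note on `PointDetectLandmarks`: it takes the face bounding box by value and returns a `bool`, but this header does not say how boxes that extend past the image are handled. A caller who wants to be defensive could intersect the box with the image first. The helper below is a sketch under that assumption: `sketch::Rect` is a local mirror of `seeta::Rect`, and `ClampToImage` is a hypothetical function, not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

namespace sketch {

// Local mirror of seeta::Rect so this example is self-contained.
struct Rect {
  std::int32_t x;
  std::int32_t y;
  std::int32_t width;
  std::int32_t height;
};

// Hypothetical helper: intersect a face box with the image rectangle
// [0, image_width) x [0, image_height). Returns false and leaves the box
// unchanged when the intersection is empty.
inline bool ClampToImage(Rect* box, std::int32_t image_width,
                         std::int32_t image_height) {
  assert(box != nullptr);
  const std::int32_t left = std::max<std::int32_t>(box->x, 0);
  const std::int32_t top = std::max<std::int32_t>(box->y, 0);
  const std::int32_t right = std::min<std::int32_t>(box->x + box->width, image_width);
  const std::int32_t bottom = std::min<std::int32_t>(box->y + box->height, image_height);
  if (right <= left || bottom <= top) {
    return false;
  }
  box->x = left;
  box->y = top;
  box->width = right - left;
  box->height = bottom - top;
  return true;
}

}  // namespace sketch
```

It is also worth checking the `bool` returned by `PointDetectLandmarks` before reading `points`, since the README example ignores it.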