tntrung · atranitell · Aug 15, 2015
diff --git a/README.md b/README.md
@@ -1,89 +1,84 @@
-# Matlab Implementation of Supervised Descent Method
+# Matlab Implementation (version 2.0) of Supervised Descent Method
 
-A Matlab implementation of Supervised Descent Method (SDM) for Face
-Alignment.
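+This version is a revision of the previous one. After reading the original code carefully we found several problems and corrected the source code. The main changes are:
+- Fixed several bugs that appeared when running the code
+- Added some functions to make the code more concise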
 
-We provide both training and testing modules. It is under development for 
-an improvement version: Global Supvervised Descent Method (GSDM).
-
-The ogirinal paper: 
-
-Xiong et F. De la Torre, 
-Supervised Descent Method and its Applications to Face Alignment, 
-CVPR 2013.
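+- Optimized the original program with reference to the paper [Extended Supervised Descent Method for Robust Face Alignment][1]
+- In the testing stage, we apply the inverse scaling and translation transforms to convert the resulting aligned_shape
+into true_shape, the landmarks in the coordinates of the original image
+- Added detailed comments to make the code easier to understand.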
 
 ===========================================================================
 
 # Dependency:
-   - Vlfeat library: http://www.vlfeat.org/
-   - libLinear: www.csie.ntu.edu.tw/~cjlin/liblinear/
+  - Vlfeat library: http://www.vlfeat.org/
+
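+     Provides the hog/sift feature functions. The program uses hog features by default. To use sift features, you can use the interface provided by xx_sift.m (see common/desc/xx_sift.m). To use Vlfeat's own sift instead, you need to modify the program, because the default sift interface is xx_sift.m.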
+  - libLinear:  http://www.csie.ntu.edu.tw/~cjlin/liblinear/
 
+     Its SVM method is used to solve the overdetermined system of equations
+  - mexopencv: https://github.com/kyamagu/mexopencv
+
+     Its face detector is used (although in the program we usually substitute the bounding box of the ground_truth landmarks, which is more accurate)
 # Datasets in use:
 
 [300-W] http://ibug.doc.ic.ac.uk/resources/facial-point-annotations/
 
-# How to use:
+This dataset provides only 68-landmark annotations, i.e. data of the w300 type
 
-1. Download 300-W data (i.e. LFPW) from above link and put into "./data" 
-   folder, then correct the dataset path to your dataset foler in setup.m
+# How to use:
 
-   >> mkdir -p data
+1. Download the 300-W data (e.g. LFPW) from the link above and put it into the "./data" folder.
+   Then correct the dataset paths in setup.m
 
    For example:
 
-	options.trainingImageDataPath = './data/lfpw/trainset/';
+  options.trainingImageDataPath = './data/lfpw/trainset/';
 
-	options.trainingTruthDataPath = './data/lfpw/trainset/';
+  options.trainingTruthDataPath = './data/lfpw/trainset/';
 
-	options.testingImageDataPath  = './data/lfpw/testset/';
+  options.testingImageDataPath  = './data/lfpw/testset/';
 
-	options.testingTruthDataPath  = './data/lfpw/testset/';
+  options.testingTruthDataPath  = './data/lfpw/testset/';
 
 2. Download and install dependencies: libLinear, Vlfeat, mexopencv, put
    into "./lib" folder and compile if necessary. Make sure you already 
    addpath(...) all folders in matlab. 
    Check and correct the library path in setup.m.
 
-   >> mkdir -p lib
+   For installation instructions, see:
 
-   libLinear: 
-     - Open Matlab
-     - Go to i.e. lib/liblinear-1.96/matlab/ in Matlab editor.
-     - Run make.m to comile *.mex files.
-
-   Vlfeat:
-     - >> cd lib/vlfeat/ && make
-     - cd ./toolbox in Matlab editor.
-     - Run vl_setup
-     - Compile mex Hog functions:
-       >> cd misc
-       >> mex -L../../bin/glnx86 -lvl -I../ -I../../ vl_hog.c
-     - Setup libvl.so path.
-     - Assume that your libvl.so located at: <vlfeat_folder>/bin/glnx86
-       Create soft link:
-       >> ln -s <vlfeat_folder>/bin/glnx86/libvl.so /usr/local/libvl.so
-       Check if the libvl.so is ready to use.
-       >> ldd vl_hog.mexglx
-       If libvl.so still not found.
-       Add /usr/local/lib into /etc/ld.so.conf (sudo).
-       >> sudo ldconfig
-       >> ldconfig -p | grep libvl.so
-       Check again: >> ldd vl_hog.mexglx
+   libLinear: http://m.blog.csdn.net/blog/tiandijun/40929563
+
+   Vlfeat: http://www.cnblogs.com/woshitianma/p/3872939.html
+
+   mexopencv: http://wangcaiyong.com/2015/07/14/mexopencv/
 
-
 3. If you run first time. You should set these following parameters
    to learn shape and variation. For later time, reset to 0.
 
    options.learningShape     = 1;
+
    options.learningVariation = 1;
 
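+  Note: the first variable, **learningShape**, learns the mean landmarks of the dataset. The second variable, **learningVariation**, learns the mean and variance of the difference between the bounding boxes of true_shape and mean_shape (a box has four variables: x, y, width, height). These statistics are used later to perturb the boxes and generate additional initial landmark estimates.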
+
 4. Do training:
    >> run_training();
 
 5. Do testing:
    >> do_testing();
 
-
-Note: in the program, we provide training models of LFPW (68 landmarks) in folder:
-"./model". The program does not optimize speed and memory during training, the 
-memory problem may happens if you train on too much data.
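+6. Unfortunately, we still have not really optimized the program for memory and speed. While running it, we found that the variable that uses the most memory is storage_init_desc (the feature-vector matrix). Take LFPW as an example: the training set has 811 images, so perturbing the initial value 10 times produces 8110 shapes. With sift features (<img src="http://latex.codecogs.com/gif.latex?4*4*8=128" /> dimensions) and 68 landmarks, storage_init_desc has size <img src="http://latex.codecogs.com/gif.latex?8704*8110" />. When the SVM method is applied to it, the program cannot run and memory fills up.
+7. List of newly added functions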
+ - /common/cropImage/cropImage.m
+ - /common/desc/xx_sift.m
+ - /common/flip/flipImage.m
+ - /common/io/write_w300_shape.m
+ - /source/train/learn_single_regressor2.m
+8. For details of the fixes above, see the blog series [Some improvements about SDM for face alignment][2]
+
+
+  [1]: https://dn-xiamenwcy.qbox.me/sdm/Extended%20Supervised%20Descent%20Method%20for%20Robust%20Face%20Alignment.pdf
+  [2]: http://wangcaiyong.com/2015/08/14/sdm/
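Note on README step 6: the size quoted for storage_init_desc can be checked with a few lines of Matlab. This is only a back-of-the-envelope sketch. The variable names are made up for illustration, and the 8-byte (double) and 4-byte (single) element sizes are assumptions, since the diff does not show how the matrix is allocated.

```matlab
% Rough size estimate for the descriptor matrix described in README step 6.
nImages    = 811;        % LFPW training images (from the README)
nPerturb   = 10;         % perturbed initialisations per image (from the README)
nLandmarks = 68;         % landmarks per shape
siftDim    = 4 * 4 * 8;  % 128-dimensional sift descriptor per landmark

nShapes = nImages * nPerturb;      % 8110 training shapes
featDim = siftDim * nLandmarks;    % 8704 features per shape
nElems  = featDim * nShapes;       % entries in the 8704 x 8110 matrix

bytesDouble = nElems * 8;          % assumption: stored as double
bytesSingle = nElems * 4;          % assumption: stored as single
fprintf('%d x %d matrix: %.1f MB as double, %.1f MB as single\n', ...
        featDim, nShapes, bytesDouble / 1e6, bytesSingle / 1e6);
```

The estimate covers only the dense matrix itself. Any copies made while training would come on top of it.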
diff --git a/common/align/random_init_position.m b/common/align/random_init_position.m
@@ -1,5 +1,5 @@
 function [rbbox] = random_init_position( bbox, ...
-                                                 DataVariation, nRandInit )
+                                                 DataVariation, nRandInit,options )
 
 rbbox(1,:) = bbox;    
 
@@ -31,7 +31,9 @@
 
 rbbox(2:nRandInit,1:2) = rCenter - [rWidth(:,1) rHeight(:,1)]/2;
 rbbox(2:nRandInit,3:4) = [rWidth(:,1) rHeight(:,1)];
-
+% Keep the perturbed boxes inside the image boundary
+rbbox(1:nRandInit,1:2)=max(rbbox(1:nRandInit,1:2),1);
+rbbox(1:nRandInit,1:2)=min(rbbox(1:nRandInit,1:2)+rbbox(1:nRandInit,3:4),options.canvasSize(1) )-rbbox(1:nRandInit,3:4);
 end
 
 end
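Note on the two clamping lines added to random_init_position.m: the standalone sketch below reproduces them on two hand-made boxes to show the effect. The canvas size of 400 and the box values are assumptions chosen for illustration; `canvas` stands in for `options.canvasSize(1)`.

```matlab
% Clamp [x y width height] boxes so they stay inside a square canvas.
canvas = 400;                        % assumed stand-in for options.canvasSize(1)
rbbox  = [ -5  10 100 120;           % starts left of the canvas
          350 330 100 120];          % runs past the right and bottom edges

% Step 1: push the top-left corner to at least (1,1).
rbbox(:,1:2) = max(rbbox(:,1:2), 1);
% Step 2: limit the bottom-right corner to the canvas, then shift the
% top-left corner back by the unchanged width and height.
rbbox(:,1:2) = min(rbbox(:,1:2) + rbbox(:,3:4), canvas) - rbbox(:,3:4);

disp(rbbox);
```

Two things are worth keeping in mind when reading the patch. The same limit is applied to x and y, so the lines implicitly assume a square canvas. Also, a box wider or taller than the canvas would be shifted to a corner below 1 by the second step, because the width and height are never reduced.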
diff --git a/common/cropImage/cropImage.m b/common/cropImage/cropImage.m
@@ -0,0 +1,87 @@
+function [ img,shape,box,t ] = cropImage( img,shape )
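+% Crop the face region from the image.
+% If the crop region extends beyond the image, pad it by replicating the border.
+% Returns the shape, the box, and the grayscale image (plus the offset t).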
+
+
+    %% get bounding box
+   box= getbbox(shape);
+
+    %% enlarge region of face
+    region     = enlargingbbox(box, 2.0);
+
+    region_y  = double(max(region(2), 1));
+    region_x  = double(max(region(1), 1));
+
+ if 0
+    disp('before cropping Image...');
+    figure(1); imshow(img); hold on;
+    draw_shape(shape(:,1),...
+        shape(:,2),'r');
+    hold on;
+    rectangle('Position',  box, 'EdgeColor', 'y');
+    rectangle('Position',  region, 'EdgeColor', 'g');
+    hold off;
+    %pause;
+end   
+
+
+   bottom_y   = double(min(region(2) + region(4) - 1, ...
+         size(img,1)));
+   right_x    = double(min(region(1) + region(3) - 1, ...
+       size(img,2))); 
+
+    img_region = img(region_y:bottom_y, region_x:right_x, :);
+    if size(img_region,3)>1
+         img_region=rgb2gray(img_region);
+    end
+      [M,N]=size(img_region);
+
+    c1=max(1-region(1),0);
+    r1=max(1-region(2),0);
+
+ %  imshow(img_region);
+    img_region= padarray(img_region,[r1,c1],'replicate','pre'); % pad the image so the face stays centered when it lies near the top or left border
+
+     [M,N]=size(img_region);
+
+    r2=max(region(4)-M,0);
+    c2=max(region(3)-N,0);
+    img_region= padarray(img_region,[r2,c2],'replicate','post'); % pad the image so the face stays centered when it lies near the bottom or right border
+  %  imshow(img_region);
+
+    img=img_region;
+    %% recalculate the location of groundtruth shape and bounding box
+
+%     shape = bsxfun(@minus, shape,...
+%         double([region_x-c1 region_y-r1]));
+    shape = bsxfun(@minus, shape,...
+        double([region_x-c1-1 region_y-r1-1]));
+    box = getbbox(shape);
+    t=[region_x-c1-1 region_y-r1-1];
+
+
+
+if 0
+    disp('after cropping Image...');
+    figure(2); imshow(img); hold on;
+    draw_shape(shape(:,1),...
+        shape(:,2),'g');
+    rectangle('Position',  box, 'EdgeColor', 'r');
+    hold off;
+    pause;
+end
+
+
+
+
+end
+function region = enlargingbbox(bbox, scale)
+
+region(1) = floor(bbox(1) - (scale - 1)/2*bbox(3));
+region(2) = floor(bbox(2) - (scale - 1)/2*bbox(4));
+
+region(3) = floor(scale*bbox(3));
+region(4) = floor(scale*bbox(4));
+
+end
diff --git a/common/desc/hog.m b/common/desc/hog.m
@@ -1,5 +1,6 @@
 function desc = hog( im, pos , lmsize )
 
+
 %fsize  = sqrt(norm_size);
 %lmsize  = fsize;
 %gsize = options.canvasSize(1) * options.descScale(1);
@@ -24,7 +25,8 @@
      cropim = imresize(cropim,[lmsize lmsize]);
 end
 
-cellSize = 32 ;
+%cellSize = 32;
+cellSize=round(lmsize/2);
 %tmp = vl_hog(single(cropim), cellSize, 'verbose');
 tmp = vl_hog(single(cropim), cellSize);
 

diff --git a/common/desc/xx_sift.m b/common/desc/xx_sift.m
@@ -0,0 +1,29 @@
+% Signature: 
+%   X = xx_sift(im,lms,'nsb',nsb,'winsize',winsize)
+%
+% Dependence:
+%   None
+%
+% Usage:
+%   This function implements an approximation of SIFT descriptors. It
+%   extracts descriptors on the local patches around each landmarks. This
+%   is the fastest SIFT descriptor implementation available. 
+%
+% Params:
+%   im - input image (must be in double grayscale)
+%   lms(nx2) - input landmark (must be in double) 
+%   nsb(option) - the number of spatial bins, default 4
+%   winsize(option) - patch size, default 32 
+%
+% Return:
+%   X - computed descriptors in single, default size: 128 x n
+% 
+% Authors: 
+%   Xuehan Xiong
+%
+% Citation: 
+%   Xuehan Xiong, Fernando de la Torre, Supervised Descent Method and Its
+%   Application to Face Alignment. CVPR, 2013
+%
+% Creation Date: 10/7/2013
+%
diff --git a/common/desc/xx_sift.mexw64 b/common/desc/xx_sift.mexw64
diff --git a/common/flip/flipImage.m b/common/flip/flipImage.m
@@ -0,0 +1,37 @@
+function [img,shape]=flipImage(img,shape)
+% Flips the image and its landmarks horizontally. The lines below show how to load the inputs and write the image back:
+%shape=annotation_load('*.pts','w300'); % load the shape
+%img=imread('*.png');
+%imwrite(imgmat,'*.png'); % write the image
+% ./common/io/write_w300_shape.m shows how to write the shape to a file.
+%flip image & shape
+if size(img,3) > 1
+    img_gray   = fliplr(rgb2gray(uint8(img)));
+else
+    img_gray   = fliplr(img);
+end
+
+if 0
+    disp('before flipping...');
+    figure(1); imshow(img); hold on;
+    draw_shape(shape(:,1),...
+        shape(:,2),'r');
+    hold off;
+   % pause;
+end
+clear img;
+img = img_gray;
+clear img_gray;
+
+shape = flipshape(shape);
+shape(:,1) = size(img,2)+1 - shape(:, 1);
+
+
+if 0
+    disp('after flipping...');
+    figure(1); imshow(img); hold on;
+    draw_shape(shape(:,1),...
+        shape(:,2),'y');
+    hold off;
+   % pause;
+end
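Note on the coordinate update in flipImage.m: with 1-based pixel indices, a point in column x of an image of width W lands in column W + 1 - x after `fliplr`. The sketch below checks this on a small test matrix. The matrix and the landmark position are made up for illustration, and the landmark reordering done by `flipshape` (not shown in this diff) is left out.

```matlab
% Check the mirrored x-coordinate formula used after fliplr.
img = magic(4);              % stand-in 4x4 "image" with distinct values
W   = size(img, 2);          % image width
x   = 2;  y = 3;             % hypothetical landmark at column 2, row 3

flipped = fliplr(img);       % horizontal flip, as in flipImage.m
xf      = W + 1 - x;         % mirrored column, same row

% The pixel under the landmark must be the same before and after the flip.
assert(flipped(y, xf) == img(y, x));
```

Using W - x instead would be off by one pixel, which is why the +1 term matters for 1-based coordinates.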
diff --git a/common/io/load_all_data2.m b/common/io/load_all_data2.m
@@ -11,6 +11,8 @@
 - shape_gt: ground-truth landmark.
 - bbox_gt: bounding box of ground-truth.
 - bbox_facedet: face detection region
+-- t translation between Iw and I
+-- s scale between Iw and I
 %}
 
 slash = options.slash;
@@ -44,30 +46,13 @@
         draw_shape(Data{iimgs}.shape_gt(:,1),...
             Data{iimgs}.shape_gt(:,2),'y');
         hold off;
-        pause;
+      %  pause;
     end    
 
 
-    %% get bounding box
-    Data{iimgs}.bbox_gt = getbbox(Data{iimgs}.shape_gt);
-
-    %% enlarge region of face
-    region     = enlargingbbox(Data{iimgs}.bbox_gt, 2.0);
-    region(2)  = double(max(region(2), 1));
-    region(1)  = double(max(region(1), 1));
-
-    bottom_y   = double(min(region(2) + region(4) - 1, ...
-        Data{iimgs}.height_orig));
-    right_x    = double(min(region(1) + region(3) - 1, ...
-        Data{iimgs}.width_orig));
-
-    img_region = img(region(2):bottom_y, region(1):right_x, :);
-
-    %% recalculate the location of groundtruth shape and bounding box
-    Data{iimgs}.shape_gt = bsxfun(@minus, Data{iimgs}.shape_gt,...
-        double([region(1) region(2)]));
 
-    Data{iimgs}.bbox_gt = getbbox(Data{iimgs}.shape_gt);
+    %% crop image
+    [ img_region,Data{iimgs}.shape_gt,Data{iimgs}.bbox_gt,Data{iimgs}.t] = cropImage( img,Data{iimgs}.shape_gt );
 
     Data{iimgs}.isdet = 0;
 
@@ -130,7 +115,7 @@
 
     Data{iimgs}.bbox_facedet(1:2) = bsxfun(@times, Data{iimgs}.bbox_facedet(1:2), [sr sc]);
     Data{iimgs}.bbox_facedet(3:4) = bsxfun(@times, Data{iimgs}.bbox_facedet(3:4), [sr sc]);
-
+    Data{iimgs}.s=[sr sc];
     %disp(size(Data{iimgs}.img_gray));
 
     if 0
@@ -147,13 +132,4 @@
 end
 
 
-function region = enlargingbbox(bbox, scale)
-
-region(1) = floor(bbox(1) - (scale - 1)/2*bbox(3));
-region(2) = floor(bbox(2) - (scale - 1)/2*bbox(4));
-
-region(3) = floor(scale*bbox(3));
-region(4) = floor(scale*bbox(4));
-
-end
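Note on the new fields `Data{iimgs}.t` and `Data{iimgs}.s`: the README says they are used at test time to map aligned_shape back to true_shape through the inverse scaling and translation. The sketch below shows one way that round trip could look. It assumes the forward mapping is "subtract t, then multiply each axis by s"; the testing code is not part of this diff, so the order of operations and all numeric values here are assumptions.

```matlab
% Round trip between original-image and normalised coordinates.
t = [120 80];                % assumed crop offset, as returned by cropImage
s = [0.5 0.5];               % assumed per-axis scale, as stored in Data{i}.s
true_shape = [150 100;       % two hypothetical landmarks (x, y)
              200 160];

% Forward: translate into the crop, then scale to the canvas.
aligned_shape = bsxfun(@times, bsxfun(@minus, true_shape, t), s);

% Inverse: undo the scale first, then undo the translation.
recovered = bsxfun(@plus, bsxfun(@rdivide, aligned_shape, s), t);

assert(max(abs(recovered(:) - true_shape(:))) < 1e-12);
```

The inverse has to undo the steps in reverse order. Adding t before dividing by s would not return the original landmarks.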