This is a Python implementation of Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set.
The method uses hybrid-level weakly-supervised training to achieve accurate CNN-based 3D face reconstruction.
The method reconstructs faces with high accuracy. Quantitative evaluations (shape errors in mm) on several benchmarks show its state-of-the-art performance:
Method | FaceWarehouse | Florence | BU3DFE |
---|---|---|---|
Tewari et al. 17 | 2.19±0.54 | - | - |
Tewari et al. 18 | 1.84±0.38 | - | - |
Genova et al. 18 | - | 1.77±0.53 | - |
Sela et al. 17 | - | - | 2.91±0.60 |
PRN 18 | - | - | 1.86±0.47 |
Ours | 1.81±0.50 | 1.67±0.50 | 1.40±0.31 |
The method produces high-fidelity face textures while preserving the identity information of the input images. Scene illumination is also disentangled to yield a pure albedo.
The method gives reasonable results under extreme conditions such as large poses and occlusions.
Our method aligns the reconstructed faces with the input images. It provides face pose estimation and 68 facial landmarks, which are useful for other tasks. We evaluate landmark accuracy (NME) on the AFLW_2000 dataset, grouped by pose range, as shown in the table below:
Method | [0°,30°] | [30°,60°] | [60°,90°] | Overall |
---|---|---|---|---|
3DDFA 16 | 3.78 | 4.54 | 7.93 | 5.42 |
3DDFA+SDM 16 | 3.43 | 4.24 | 7.17 | 4.94 |
Bulat et al. 17 | 2.47 | 3.01 | 4.31 | 3.26 |
PRN 18 | 2.75 | 3.51 | 4.61 | 3.62 |
Ours | 2.56 | 3.11 | 4.45 | 3.37 |
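As a rough illustration of the metric in the table above, the sketch below computes a normalized mean error (NME) for one face. The normalizer is an assumption: it uses the square root of the ground-truth bounding-box area, a common choice for this kind of benchmark, and the exact protocol behind the reported numbers is not stated here. The function name `nme` and the toy landmarks are hypothetical.

```python
import numpy as np


def nme(pred, gt):
    """Mean landmark error divided by the ground-truth bounding-box size.

    pred, gt: arrays of shape (n_landmarks, 2) in pixel coordinates.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    # Per-landmark Euclidean distance in the image plane.
    dists = np.linalg.norm(pred - gt, axis=1)
    # Assumed normalizer: sqrt(width * height) of the ground-truth box.
    width, height = gt.max(axis=0) - gt.min(axis=0)
    return dists.mean() / np.sqrt(width * height)


# Toy example: four landmarks on a 10x10 box, prediction shifted by 1 pixel.
toy_gt = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
toy_pred = toy_gt + np.array([1.0, 0.0])
toy_nme = nme(toy_pred, toy_gt)
print("NME: {:.2%}".format(toy_nme))
```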
Faces are represented with the Basel Face Model 2009, which makes further manipulation easy (e.g., expression transfer). ResNet-50 is used as the backbone network, and reconstruction runs at over 50 fps on a GTX 1080.
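To see why a morphable-model representation makes expression transfer simple, here is a minimal sketch that assumes the usual linear form: a mean shape plus an identity basis and an expression basis, each weighted by a coefficient vector. The bases below are random toy data rather than BFM09, and every name (`face_shape`, `id_basis`, `exp_basis`, and so on) is hypothetical, not taken from this repository.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy sizes chosen for illustration only, not the real model dimensions.
n_vertices, n_id, n_exp = 5, 4, 3

mean_shape = rng.normal(size=3 * n_vertices)
id_basis = rng.normal(size=(3 * n_vertices, n_id))
exp_basis = rng.normal(size=(3 * n_vertices, n_exp))


def face_shape(id_coeff, exp_coeff):
    """Linear model: mean + identity offset + expression offset."""
    flat = mean_shape + id_basis @ id_coeff + exp_basis @ exp_coeff
    return flat.reshape(-1, 3)  # one (x, y, z) row per vertex


# Coefficients for two faces, A and B (random stand-ins for network output).
id_a, exp_a = rng.normal(size=n_id), rng.normal(size=n_exp)
id_b, exp_b = rng.normal(size=n_id), rng.normal(size=n_exp)

# Expression transfer: keep the identity of A, take the expression of B.
transferred = face_shape(id_a, exp_b)
print(transferred.shape)
```

Because the model is linear, swapping the expression coefficients changes only the expression offset and leaves the identity part of the shape untouched.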
- Python >= 3.5 (numpy, scipy, pillow, opencv)
- TensorFlow >= 1.4
- Basel Face Model 2009 (BFM09)
- Expression Basis (transferred from FaceWarehouse by Guo et al.)
Optional:
- tf mesh renderer (used as the renderer during training; it can also be used at test time. Linux only.)
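The Python-side requirements above can be checked before going further. The sketch below only tests whether each package can be found, without importing it; it does not verify the TensorFlow version. The import names are the standard ones for these packages (pillow is imported as `PIL`, opencv as `cv2`), and the helper name `find_missing` is hypothetical.

```python
import importlib.util
import sys


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Import names of the packages listed in the requirements.
REQUIRED = ["numpy", "scipy", "PIL", "cv2", "tensorflow"]

if sys.version_info < (3, 5):
    print("Python >= 3.5 is required")

missing = find_missing(REQUIRED)
print("Missing modules:", ", ".join(missing) if missing else "none")
```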
- Clone the repository
git clone https://github.com/Microsoft/Deep3DFaceReconstruction
cd Deep3DFaceReconstruction
- Download the BFM09 model and put "01_MorphableModel.mat" into the ./BFM subfolder.
- Download the Expression Basis provided by Guo et al. (in their repository, follow the link named CoarseData in the first row of the Introduction section, then download and unzip Coarse_Dataset.zip), and put "Exp_Pca.bin" into the ./BFM subfolder.
- Download the trained model from Google Drive and put it into the ./network subfolder.
- Run the demo code.
python demo.py
- To check the results, see the ./output subfolder, which contains:
  - "xxx.mat": the cropped input image, the corresponding 5-point and 68-point landmarks, and the output coefficients of R-Net.
  - "xxx_mesh.obj": the 3D face mesh in canonical view (best viewed in MeshLab).
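If MeshLab is not at hand, the mesh output can also be inspected from Python. The following is a minimal sketch of a Wavefront OBJ reader using only the standard library. It assumes triangular faces with positive 1-based indices and ignores everything except `v` and `f` records; whether the demo's files carry extra per-vertex data is not stated here, so only the first three numbers of each `v` line are read. The function name `read_obj` and the sample mesh are hypothetical.

```python
import os
import tempfile


def read_obj(path):
    """Read vertex positions and triangle indices from a Wavefront OBJ file."""
    vertices, faces = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "v":
                # Keep x, y, z; anything after them on the line is ignored.
                vertices.append([float(x) for x in parts[1:4]])
            elif parts[0] == "f":
                # Take the vertex index before any '/', convert 1-based to 0-based.
                faces.append([int(p.split("/")[0]) - 1 for p in parts[1:4]])
    return vertices, faces


# Demo on a tiny hand-written mesh (a single triangle).
sample = "v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n"
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_mesh.obj")
    with open(sample_path, "w") as f:
        f.write(sample)
    vertices, faces = read_obj(sample_path)

print(len(vertices), "vertices,", len(faces), "faces")
```

The ".mat" output can be opened with `scipy.io.loadmat`. The variable names stored inside are not listed here, so printing the keys of the returned dictionary is a reasonable first step.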
- The model is trained without augmentation, so pre-alignment with 5 facial landmarks is necessary. Some examples are provided in the ./input subfolder for reference.
- The current model is trained under the assumption of 3-channel scene illumination (instead of the monochromatic lights described in the paper).
- We exclude the ear and neck regions of the original BFM09. To see which vertices are preserved, check select_vertex_id.mat in the ./BFM subfolder. Note that the indices start from 1.
- If you have any questions, please contact Yu Deng ([email protected]) or Jiaolong Yang ([email protected]).
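Regarding the vertex-index note above: indices that start from 1 must be shifted before they are used with NumPy, which indexes from 0. The sketch below shows that conversion on toy data. The name of the variable stored inside select_vertex_id.mat is not given here, so loading the file is left out. The `ravel()` call is there because `scipy.io.loadmat` returns MATLAB vectors as 2D arrays. The helper name `select_vertices` is hypothetical.

```python
import numpy as np


def select_vertices(full_vertices, ids_one_based):
    """Pick rows of `full_vertices` using 1-based (MATLAB-style) indices."""
    # Flatten a possible (n, 1) column and shift to 0-based indexing.
    ids = np.asarray(ids_one_based).astype(np.int64).ravel() - 1
    return np.asarray(full_vertices)[ids]


# Toy stand-in for a full mesh with 6 vertices.
full = np.arange(18, dtype=np.float64).reshape(6, 3)
# Indices shaped like a MATLAB column vector: keep vertices 1, 3 and 6.
kept = select_vertices(full, [[1], [3], [6]])
print(kept)
```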
Please cite the following paper if this model helps your research:
@misc{deng2019accurate,
title={Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set},
author={Yu Deng and Jiaolong Yang and Sicheng Xu and Dong Chen and Yunde Jia and Xin Tong},
year={2019},
eprint={1903.08527},
archivePrefix={arXiv},
primaryClass={cs.CV}
}