haibalabs/face-mesh-to-blendshapes

Copyright (c) 2023 Haiba Labs

Author: James Ritts [email protected]

This notebook (mediapipe_face_mesh_to_blendshapes.ipynb) trains a simple PyTorch model that maps MediaPipe face mesh landmarks to ARKit-compatible blendshapes.
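
For orientation, such a model might look like the following minimal PyTorch sketch; the class name and layer sizes are placeholders, not the notebook's actual architecture:

import torch.nn as nn

class LandmarksToBlendshapes(nn.Module):
    # Illustrative MLP: normalized landmark features in, 52 blendshape
    # weights out. The final sigmoid keeps each output in [0, 1].
    def __init__(self, in_dim, hidden=256, out_dim=52):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)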

Notes

  • We train on object-space geometry so the model doesn't have to learn what each facial pose looks like in every possible head orientation. Unfortunately, MediaPipe's output is given in a coordinate system that makes this difficult, and its mesh is stretched to conform to the silhouette of the face in the input image. The function normalize_landmarks() tries to undo these effects: the mesh is segmented into mouth, left-eye, and right-eye patches; a basis is built for each patch from selected quads to reorient the patch toward the camera; the vertices are then projected onto the XY plane and their components rescaled to [0, 1] for model input (see the sketch after this list).
  • The function convert_landmarks_to_model_input() uses normalize_landmarks() to convert raw MediaPipe output into the model's input vector. This function must be ported to any environment where the model is run.
  • MediaPipe isn't able to signal every blendshape. The following should be forced to zero at runtime, and possibly others as well: jawForward, jawRight, jawLeft, mouthDimpleRight, mouthDimpleLeft, cheekPuff, tongueOut (see the snippet after the blendshape list below).
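
The following is a minimal sketch of that normalization, assuming NumPy. The patch index lists, quad choice, and helper names here are illustrative placeholders, not the notebook's actual implementation:

import numpy as np

# Illustrative patch index lists: the notebook selects its own sets of
# MediaPipe face mesh indices; these few are stand-ins.
MOUTH_IDX = np.array([61, 291, 0, 17])         # around the lips
LEFT_EYE_IDX = np.array([362, 263, 386, 374])  # around one eye
RIGHT_EYE_IDX = np.array([33, 133, 159, 145])  # around the other eye

def patch_basis(quad):
    # Build an orthonormal basis from a quad of 3D points so the patch
    # can be rotated to face the camera.
    x = quad[1] - quad[0]
    x = x / np.linalg.norm(x)
    up = quad[2] - quad[3]
    z = np.cross(x, up)
    z = z / np.linalg.norm(z)
    y = np.cross(z, x)
    return np.stack([x, y, z])  # rows are the new basis vectors

def normalize_patch(landmarks, idx):
    # Reorient one patch toward the camera, project it onto the XY
    # plane, and rescale its components to [0, 1].
    pts = landmarks[idx]
    basis = patch_basis(pts[:4])
    local = (pts - pts.mean(axis=0)) @ basis.T
    xy = local[:, :2]  # drop depth
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    return (xy - lo) / np.maximum(hi - lo, 1e-8)

def normalize_landmarks(landmarks):
    # landmarks: (468, 3) array of raw MediaPipe face mesh coordinates.
    return np.concatenate([
        normalize_patch(landmarks, MOUTH_IDX).ravel(),
        normalize_patch(landmarks, LEFT_EYE_IDX).ravel(),
        normalize_patch(landmarks, RIGHT_EYE_IDX).ravel(),
    ])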

Format

The order of blendshape values in the model output is:

eyeBlinkRight, eyeLookDownRight, eyeLookInRight, eyeLookOutRight, eyeLookUpRight, eyeSquintRight, eyeWideRight, eyeBlinkLeft, eyeLookDownLeft, eyeLookInLeft, eyeLookOutLeft, eyeLookUpLeft, eyeSquintLeft, eyeWideLeft, jawForward, jawRight, jawLeft, jawOpen, mouthClose, mouthFunnel, mouthPucker, mouthRight, mouthLeft, mouthSmileRight, mouthSmileLeft, mouthFrownRight, mouthFrownLeft, mouthDimpleRight, mouthDimpleLeft, mouthStretchRight, mouthStretchLeft, mouthRollLower, mouthRollUpper, mouthShrugLower, mouthShrugUpper, mouthPressRight, mouthPressLeft, mouthLowerDownRight, mouthLowerDownLeft, mouthUpperUpRight, mouthUpperUpLeft, browDownRight, browDownLeft, browInnerUp, browOuterUpRight, browOuterUpLeft, cheekPuff, cheekSquintRight, cheekSquintLeft, noseSneerRight, noseSneerLeft, tongueOut
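
For reference, the same ordering as a Python list, together with the runtime zeroing described in the notes above (the constant and function names are our own):

ARKIT_BLENDSHAPES = [
    "eyeBlinkRight", "eyeLookDownRight", "eyeLookInRight", "eyeLookOutRight",
    "eyeLookUpRight", "eyeSquintRight", "eyeWideRight", "eyeBlinkLeft",
    "eyeLookDownLeft", "eyeLookInLeft", "eyeLookOutLeft", "eyeLookUpLeft",
    "eyeSquintLeft", "eyeWideLeft", "jawForward", "jawRight", "jawLeft",
    "jawOpen", "mouthClose", "mouthFunnel", "mouthPucker", "mouthRight",
    "mouthLeft", "mouthSmileRight", "mouthSmileLeft", "mouthFrownRight",
    "mouthFrownLeft", "mouthDimpleRight", "mouthDimpleLeft",
    "mouthStretchRight", "mouthStretchLeft", "mouthRollLower",
    "mouthRollUpper", "mouthShrugLower", "mouthShrugUpper",
    "mouthPressRight", "mouthPressLeft", "mouthLowerDownRight",
    "mouthLowerDownLeft", "mouthUpperUpRight", "mouthUpperUpLeft",
    "browDownRight", "browDownLeft", "browInnerUp", "browOuterUpRight",
    "browOuterUpLeft", "cheekPuff", "cheekSquintRight", "cheekSquintLeft",
    "noseSneerRight", "noseSneerLeft", "tongueOut",
]

# Shapes MediaPipe can't signal; force these outputs to zero at runtime.
FORCE_ZERO = {"jawForward", "jawRight", "jawLeft", "mouthDimpleRight",
              "mouthDimpleLeft", "cheekPuff", "tongueOut"}

def postprocess(output):
    # output: the 52 raw model outputs, in the order above.
    return [0.0 if name in FORCE_ZERO else value
            for name, value in zip(ARKIT_BLENDSHAPES, output)]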

Training data has this folder structure:

  • my_first_dataset
    • neutral.jpg
    • my_first_dataset.csv
    • my_first_dataset_000000.jpg
    • my_first_dataset_000001.jpg
    • my_first_dataset_000002.jpg
    • ...
  • my_second_dataset
  • sets.txt

The file sets.txt should list the folder names of all training datasets, one per line:

my_first_dataset
my_second_dataset
...

Each dataset must include a calibration photo depicting a neutral facial expression: neutral.jpg. The model is trained on object-space offsets from this neutral pose.
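
A sketch of how that calibration might be applied, reusing the hypothetical normalize_landmarks() from above (the notebook's actual pipeline may differ):

def features_for_frame(frame_landmarks, neutral_landmarks):
    # Model input: offset of the frame's normalized landmarks from the
    # normalized neutral pose.
    return (normalize_landmarks(frame_landmarks)
            - normalize_landmarks(neutral_landmarks))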

Each dataset also contains a CSV file with a header row followed by one row of labels (blendshape values) per input image:

eyeBlinkRight,eyeLookDownRight,eyeLookInRight,eyeLookOutRight,eyeLookUpRight,eyeSquintRight,eyeWideRight,eyeBlinkLeft,eyeLookDownLeft,eyeLookInLeft,eyeLookOutLeft,eyeLookUpLeft,eyeSquintLeft,eyeWideLeft,jawForward,jawRight,jawLeft,jawOpen,mouthClose,mouthFunnel,mouthPucker,mouthRight,mouthLeft,mouthSmileRight,mouthSmileLeft,mouthFrownRight,mouthFrownLeft,mouthDimpleRight,mouthDimpleLeft,mouthStretchRight,mouthStretchLeft,mouthRollLower,mouthRollUpper,mouthShrugLower,mouthShrugUpper,mouthPressRight,mouthPressLeft,mouthLowerDownRight,mouthLowerDownLeft,mouthUpperUpRight,mouthUpperUpLeft,browDownRight,browDownLeft,browInnerUp,browOuterUpRight,browOuterUpLeft,cheekPuff,cheekSquintRight,cheekSquintLeft,noseSneerRight,noseSneerLeft,tongueOut
0.039,0.103,0.044,0.000,0.000,0.000,0.000,0.039,0.104,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.010,0.010,0.027,0.000,0.000,0.002,0.003,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.015,0.014,0.000,0.000,0.000,0.007,0.000,0.000,0.000,0.000,0.000
0.038,0.091,0.049,0.000,0.000,0.000,0.000,0.038,0.092,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.010,0.011,0.027,0.000,0.000,0.002,0.004,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.000,0.014,0.014,0.000,0.000,0.000,0.007,0.000,0.000,0.000,0.000,0.000
...
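
A minimal sketch of walking this layout, assuming the structure above; the function name and the use of Python's csv module are our own choices:

import csv
from pathlib import Path

def iter_examples(root):
    # Yield (image_path, neutral_path, labels) for every dataset listed
    # in sets.txt, following the folder layout described above.
    root = Path(root)
    for name in (root / "sets.txt").read_text().split():
        folder = root / name
        neutral = folder / "neutral.jpg"
        with open(folder / (name + ".csv"), newline="") as f:
            reader = csv.reader(f)
            next(reader)  # skip the header row of blendshape names
            for i, row in enumerate(reader):
                image = folder / ("%s_%06d.jpg" % (name, i))
                yield image, neutral, [float(v) for v in row]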

To do:

  • find ways to reduce blendshape error driven by head rotation
  • cull shapes from the NN output and training data that don't correlate with MediaPipe points
  • cull training examples that don't signal shapes MediaPipe can detect
