From 5b6b3578c081d22c1a0737d4fdd267f94d018ce9 Mon Sep 17 00:00:00 2001
From: James Ritts
Date: Tue, 8 Aug 2023 07:37:11 -0700
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d490b6e..e99bedb 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@ This notebook (mediapipe_face_mesh_to_blendshapes.ipynb) trains a simple pytorch
 Caveats
 -
-- We wish to train on object space geo so it doesn't have to learn what a face pose looks like in every possible head orientation. Unfortunately MediaPipe's output is only given in [screen coordinates](https://www.cse.iitd.ac.in/~suban/vision/affine/node5.html). Its mesh is also stretched to conform to the silhouette of the face in the input image. The function normalize_landmarks() tries to undo these effects: the mesh is segmented into mouth, left- and right-eye patches; then a basis built for each path from selected landmarks to transform the mesh from world to view space; then projected to the XY plane and components rescaled to [0, 1] for model input.
+- We wish to train on object space geo so it doesn't have to learn what a face pose looks like in every possible head orientation. Unfortunately MediaPipe's output is only given in [screen coordinates](https://www.cse.iitd.ac.in/~suban/vision/affine/node5.html). Its mesh is also stretched to conform to the silhouette of the face in the input image. The function normalize_landmarks() tries to undo these effects: the mesh is segmented into mouth, left- and right-eye patches; then a basis built for each patch from selected quads in order to transform the mesh from world to view space; then they're projected to the XY plane and components rescaled to [0, 1] for model input.
 - The function convert_landmarks_to_model_input() uses normalize_landmarks in order to convert from raw MediaPipe output to the NN input vector. This function needs to be ported to any environment where the model is run.
- MediaPipe isn't able to signal every blendshape. These should be forced to zero at runtime and possibly others as well: jawForward, jawRight, jawLeft, mouthDimpleRight, mouthDimpleLeft, cheekPuff, tongueOut.
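
The caveats above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual normalize_landmarks(): the helper names (`patch_basis`, `normalize_patch`, `zero_unsupported`), the way the basis is derived from a quad, and the use of NumPy are all assumptions made for the example. Only the list of unsupported blendshape names is taken from the README.

```python
import numpy as np

# Blendshape names the README lists as not signalled by MediaPipe.
UNSUPPORTED = ("jawForward", "jawRight", "jawLeft", "mouthDimpleRight",
               "mouthDimpleLeft", "cheekPuff", "tongueOut")


def patch_basis(quad):
    """Build an orthonormal basis (rows x, y, z) from a 4x3 quad of landmarks.

    Hypothetical construction: x follows the quad's first edge and z is the
    quad normal. The real notebook may pick its axes differently.
    """
    quad = np.asarray(quad, dtype=float)
    x = quad[1] - quad[0]
    x /= np.linalg.norm(x)
    normal = np.cross(x, quad[3] - quad[0])
    z = normal / np.linalg.norm(normal)
    y = np.cross(z, x)
    return np.stack([x, y, z])


def normalize_patch(points, quad):
    """Rotate a patch into its local frame, project to XY, rescale to [0, 1]."""
    points = np.asarray(points, dtype=float)
    origin = np.asarray(quad, dtype=float).mean(axis=0)
    # Dot each centered point with the basis rows to get local coordinates.
    local = (points - origin) @ patch_basis(quad).T
    xy = local[:, :2]  # dropping z is the projection onto the XY plane
    lo, hi = xy.min(axis=0), xy.max(axis=0)
    span = np.where(hi - lo > 0, hi - lo, 1.0)  # avoid division by zero
    return (xy - lo) / span


def zero_unsupported(weights, names):
    """Return a copy of the blendshape weights with unsupported entries set to 0."""
    out = np.array(weights, dtype=float)
    out[np.isin(names, UNSUPPORTED)] = 0.0
    return out
```

Because the basis is rebuilt from the patch's own landmarks, rotating or translating the whole head should leave the output of `normalize_patch` unchanged, which is the property the first caveat is after. `zero_unsupported` would be applied to the model's output vector at runtime, given the blendshape names in output order.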