
Better How to guide? Step by step readme or something. #13

Open
Lustgard opened this issue Jun 26, 2023 · 12 comments

Comments

@Lustgard

I have installed everything on a fresh Linux Mint latest version as of today.
All the requirements are met, but it stops there. The first issue seems to be:
"UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3)"

I can't even get to the part where it generates the head. And where do I feed in the image I want to create a head from?

@hack-mans

@Lustgard Currently we're missing the camera pose estimation stage which means you can't use any random input image. What I did for testing was to use Stable Diffusion + ControlNet to generate new people in the exact same pose as the example dataset images, so I could keep the camera values from the existing dataset.json. Someone wrote up the instructions on my Reddit post about it: https://www.reddit.com/r/StableDiffusion/comments/14h0pf4/comment/jpf5qea/

If you want to generate the 3D PLY mesh as well as the video, you can check the required changes and dependencies on my forked repo:
https://github.com/hack-mans/PanoHead
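
For anyone trying the pose-reuse trick, here is a rough sketch of what "keeping the camera values" could look like in code. It assumes dataset.json follows the EG3D-style layout `{"labels": [[filename, [camera values]], ...]}`; `reuse_pose` and the file names are hypothetical, and the four-number label is a shortened stand-in for a real camera vector.

```python
import copy
import json


def reuse_pose(dataset, source_name, new_name):
    """Return a copy of `dataset` where `new_name` shares `source_name`'s camera label.

    Assumes an EG3D-style layout: {"labels": [[filename, [floats]], ...]}.
    """
    labels = {name: values for name, values in dataset["labels"]}
    if source_name not in labels:
        raise KeyError(f"{source_name} has no entry in dataset.json")
    updated = copy.deepcopy(dataset)  # leave the caller's data untouched
    updated["labels"].append([new_name, list(labels[source_name])])
    return updated


# Tiny stand-in for dataset.json (hypothetical names, shortened label).
example = {"labels": [["example_face.png", [1.0, 0.0, 0.0, 0.0]]]}
result = reuse_pose(example, "example_face.png", "my_generated_face.png")
print(json.dumps(result, indent=2))
```

This only works when the new image really is in the same pose and crop as the source image, which is why the ControlNet step matters.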

@Lustgard
Author

Thanks! I was able to solve the wrong NumPy version with the help of ChatGPT.
For others who might need help:
"conda install numpy=1.22.3"
Simple, yes, but I'm no Linux master yet, so baby steps atm.
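
If it helps anyone, the range in that warning can be checked with a few lines of Python before launching anything. This is just a sketch: `numpy_ok` and `parse_version` are hypothetical helpers, the bounds are copied from the warning text, and it assumes plain numeric versions like `1.22.3` (no `rc` or `dev` suffixes).

```python
def parse_version(text):
    """Turn '1.22.3' into (1, 22, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split(".")[:3])


def numpy_ok(detected, minimum="1.16.5", upper="1.23.0"):
    """True if minimum <= detected < upper, the range named in the SciPy warning."""
    return parse_version(minimum) <= parse_version(detected) < parse_version(upper)


print(numpy_ok("1.24.3"))  # the version the warning detected
print(numpy_ok("1.22.3"))  # the version pinned by the conda command
```

In a real environment you would pass `numpy.__version__` as `detected`.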

@Lustgard
Author

@hack-mans That forked repo is very helpful. Thank you for that.

@OverwriteDev

I wonder if it's possible to infer a camera pose if we know the orientation of the face, using MediaPipe's face mesh pose estimation as an example.

https://developers.google.com/mediapipe/solutions/vision/face_landmarker/python
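
To make the idea concrete, here is a minimal sketch of turning a head orientation (yaw and pitch, in radians) into a camera-to-world matrix for a camera sitting on a sphere and looking at the origin. Everything here is an assumption for illustration: `look_at_pose` is a hypothetical helper, the column order (right, up, forward, position) and the +Y up axis may not match what PanoHead expects, and getting yaw and pitch out of MediaPipe is not shown.

```python
import math


def _normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def look_at_pose(yaw, pitch, radius=1.0):
    """4x4 camera-to-world matrix for a camera on a sphere, aimed at the origin.

    yaw = pitch = 0 places the camera on the +Z axis (an assumed convention).
    Degenerate when pitch is +/-90 degrees, where forward is parallel to world up.
    """
    position = (radius * math.sin(yaw) * math.cos(pitch),
                radius * math.sin(pitch),
                radius * math.cos(yaw) * math.cos(pitch))
    forward = _normalize(tuple(-c for c in position))  # points at the origin
    right = _normalize(_cross(forward, (0.0, 1.0, 0.0)))
    up = _cross(right, forward)
    rows = [[right[i], up[i], forward[i], position[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


for row in look_at_pose(math.radians(30), 0.0, radius=2.0):
    print(row)
```

The matrix alone is not a full camera label; intrinsics and the exact axis conventions would still have to match the training data.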

@hack-mans

@OverwriteDev I've almost got it working using the EG3D + Deep3DFaceRecon code but it's slightly off

@ChikaYan

> @OverwriteDev I've almost got it working using the EG3D + Deep3DFaceRecon code but it's slightly off

I have actually tried using Deep3DFaceRecon to estimate the pose, however, the result is a lot worse than what you can get by using the given pose (see below). I wonder if the authors could kindly provide a hint of what they used to estimate face poses?

given pose: [image]
deep3d pose: [image]

@SizheAn
Owner

SizheAn commented Jun 26, 2023

> > @OverwriteDev I've almost got it working using the EG3D + Deep3DFaceRecon code but it's slightly off
>
> I have actually tried using Deep3DFaceRecon to estimate the pose, however, the result is a lot worse than what you can get by using the given pose (see below). I wonder if the authors could kindly provide a hint of what they used to estimate face poses?
>
> given pose: [image] deep3d pose: [image]

Short answer: we used a different cropping, centering, and pose estimation script, not Deep3DFaceRecon. Since that was a company service, I won't be able to share it. In fact, I no longer have access to it myself. I'm now trying to find alternatives that achieve similar results, but I can't guarantee anything... Will keep you guys posted.

@carlosedubarreto

@SizheAn, do you know if the estimation was done using a single image?
Or is this a case where multiple images were used to calculate the camera data, and only one image was kept?

I'm trying to figure out how it would be possible to get such precise data using only one photo.

@SizheAn
Owner

SizheAn commented Jun 26, 2023

> @SizheAn, do you know if the estimation was done using a single image? Or is this a case where multiple images were used to calculate the camera data, and only one image was kept?
>
> I'm trying to figure out how it would be possible to get such precise data using only one photo.

Only a single image. It's pretty accurate if you can detect facial landmarks in the image. Our method is a combination of the company's service and 3DDFA_V2 (https://github.com/cleardusk/3DDFA_V2). You can check their examples.

@SizheAn
Owner

SizheAn commented Jun 29, 2023

We have updated the scripts and example data for obtaining camera poses and cropping the images for PTI. See https://github.com/SizheAn/PanoHead/blob/main/3DDFA_V2_cropping/cropping_guide.md

@hack-mans

@SizheAn Thank you so much for the updates, this is great!!!

@OverwriteDev

This is fantastic @SizheAn, this was massively helpful.
Not only did I manage to get it up and running with the directions you provided, but I'm also now more familiar with 3DDFA_V2, which is a pretty awesome project :D
