Results Using OpenCV DNN Module #19

Open
ghost opened this issue Nov 11, 2019 · 6 comments

Comments

@ghost

ghost commented Nov 11, 2019

I had a hard time getting Caffe installed, so I figured I'd try out your model using OpenCV's cv2.dnn.readNetFromCaffe() along with your .caffemodel and .prototxt.

The output of net.forward() when using this method is a 1x21x300x300 array, which can be squeezed to 21x300x300. Each of the 21 300x300 arrays, when normalized, seems to constitute a kind of heat map. Some of these can be seen below.

image

My question is, how would I combine these to get the actual face segmentation? I tried to parse your code to see if I could figure it out but fell short of understanding. Thanks!

@YuvalNirkin
Owner

YuvalNirkin commented Nov 11, 2019

The 300x300 model has only two output channels.

@ghost
Author

ghost commented Nov 11, 2019

Does that mean the output should be only 2x300x300 rather than 21x300x300? Then, in the following block of code from face_seg.py, are you simply taking the per-pixel maximum of the two channels?

# run net and take argmax for prediction
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)

The end result must be a binary mask, but all of the outputs I'm seeing in the 21x300x300 array I get are float values, many of which are even negative. I'm wondering whether the output of net.forward() differs between the actual Caffe library and OpenCV's cv2.dnn.readNetFromCaffe(). I'm not sure though...
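On the negative float values: a minimal sketch, assuming the network output holds unnormalized per-class scores (the thread does not confirm this). Under that assumption negative values are expected, and taking argmax over the raw scores gives the same labels as taking argmax over softmax probabilities, because softmax preserves the ordering of scores at each pixel. The array below is synthetic stand-in data, not real model output.

```python
import numpy as np

# Synthetic stand-in for a 2x300x300 score array (NOT real model output).
rng = np.random.default_rng(0)
scores = rng.normal(loc=0.0, scale=5.0, size=(2, 300, 300))

# Softmax over the channel axis, shifted by the per-pixel max for stability.
shifted = scores - scores.max(axis=0, keepdims=True)
exp_scores = np.exp(shifted)
probs = exp_scores / exp_scores.sum(axis=0, keepdims=True)

# Per-pixel winning channel, computed from raw scores and from probabilities.
labels_from_scores = scores.argmax(axis=0)
labels_from_probs = probs.argmax(axis=0)

has_negative_scores = bool((scores < 0).any())
same_labels = bool(np.array_equal(labels_from_scores, labels_from_probs))
print(has_negative_scores, same_labels)
```

So the float values do not need to be normalized before argmax; normalization only matters for visualizing the channels as heat maps.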

For reference, here is my code:

import numpy as np
import cv2

image = cv2.imread('Alison_Lohman_0001.jpg')

# Define prototxt and caffemodel paths, and create the model
caffeModel = "face_seg_fcn8s.caffemodel"
prototextPath = "face_seg_fcn8s_deploy.prototxt"
net = cv2.dnn.readNetFromCaffe(prototextPath,caffeModel)

# Resize to 300x300
image = cv2.resize(image,(300,300))
# Build the input blob, subtracting the per-channel mean (104.00698793,116.66876762,122.67891434)
blob = cv2.dnn.blobFromImage(image,1.0,(300,300),(104.00698793,116.66876762,122.67891434))

# Passing blob through network
net.setInput(blob)
output = net.forward()

Here, output is a 1x21x300x300 float32 array.

@ghost
Author

ghost commented Nov 15, 2019

Ah I see what you mean about the 300x300 model. It produces a 1x2x300x300 output.

I'm still unsure of what to do with the two resulting images. After normalization, they look like this for the Alison_Lohman_0001.jpg image:
image

@kiralygomba

Did you find the answer to this? I'm trying to use it in OpenCV as well and am running into the same problems.

@ghost
Author

ghost commented Jun 30, 2020

@kiralygomba No, unfortunately not.

@charleswg

@kiralygomba It looks like the only thing missing is the last operation:
mask = output[0].argmax(axis=0)
mask = 1 * (mask > 0)

Though I must say it doesn't quite work as I expected.
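The argmax step above can be sketched as a small helper. This is only an illustration on synthetic data: the name scores_to_mask is hypothetical, and treating channel 0 as background and channel 1 as face is an assumption that the thread does not confirm.

```python
import numpy as np

def scores_to_mask(output):
    """Turn a 1xCxHxW score array into an HxW uint8 mask (0 or 255).

    Hypothetical helper; assumes channel 0 is background (unconfirmed).
    """
    if output.ndim != 4 or output.shape[0] != 1:
        raise ValueError("expected an array of shape 1xCxHxW")
    labels = output[0].argmax(axis=0)  # per-pixel winning channel
    return np.where(labels > 0, 255, 0).astype(np.uint8)

# Synthetic 1x2x4x4 scores: channel 1 wins only in the central 2x2 block.
output = np.zeros((1, 2, 4, 4), dtype=np.float32)
output[0, 0] = 1.0
output[0, 1, 1:3, 1:3] = 2.0

mask = scores_to_mask(output)

# Apply the mask to a synthetic 4x4 three-channel image: keep masked pixels.
image = np.full((4, 4, 3), 200, dtype=np.uint8)
masked = np.where(mask[..., None] > 0, image, 0)
print(mask)
```

With the code earlier in the thread, the mask would be 300x300 because the input was resized to that size; one option for mapping it back to the original image size is cv2.resize with interpolation=cv2.INTER_NEAREST, which keeps the mask binary.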
