Hi, sorry to interrupt! How did you get the ground truth to train the second part? Since the style code is extracted from other videos, there shouldn't be a ground-truth output video whose identity comes from the static photo while the style comes from another video, so I am a little confused about the ground truth Y in the second loss. Can you explain it? Thank you!
Hi, during the training phase, we only use the style code, the audio, and the video from the same footage, so the ground truth Y is simply that same video. During inference, we can feed in style codes from other footage.

This works because we feed the style code into the middle layers of the ResNet, which forces the ResNet to incorporate more style information.
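To make the idea concrete, here is a toy NumPy sketch of injecting a style code into the middle of a residual network. This is not the authors' implementation: the shapes, the additive conditioning, and all variable names (`styled_resnet`, `w_style`, etc.) are illustrative assumptions; the actual model conditions convolutional ResNet blocks rather than dense layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w):
    # Plain residual block: y = x + relu(x @ w).
    return x + relu(x @ w)

def styled_resnet(x, style, w1, w2, w_style):
    # Early block sees only the content features (identity + audio).
    h = residual_block(x, w1)
    # Middle layer: inject the projected style code, so every later
    # block operates on style-conditioned features (additive
    # conditioning is just one illustrative choice here).
    h = h + style @ w_style
    # Late block processes the style-conditioned features.
    return residual_block(h, w2)

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))        # content features (identity + audio)
style = rng.standard_normal((1, 4))    # style code; from the SAME footage at train time
w1 = 0.1 * rng.standard_normal((8, 8))
w2 = 0.1 * rng.standard_normal((8, 8))
w_style = 0.1 * rng.standard_normal((4, 8))

y = styled_resnet(x, style, w1, w2, w_style)
print(y.shape)  # output features, shape (1, 8)
```

Because the style enters mid-network rather than at the input, the later blocks cannot ignore it, which matches the intuition in the reply; at inference time one simply passes a `style` vector extracted from a different video.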