Lip sync error #1

omcar17 · 2019-02-21T07:14:33Z

Hello,
Thank you for the excellent work and publicly available code.
I am using SyncNet to check whether a video has a lip-sync error. With the trained weights available on the official website, I am getting seemingly random values for AV offset and confidence.

I am confused about this paragraph from the paper -

Determining the lip-sync error -
To find the time offset between the audio and the video, we take a sliding-window
approach. For each sample, the distance is computed between one 5-frame video
feature and all audio features in the ± 1 second range. The correct offset is when
this distance is at a minimum. However as Table 2 suggests, not all samples in
a clip are discriminative (for example, there may be samples in which nothing
is being said at that particular time), therefore multiple samples are taken for
each clip, and then averaged.
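
One possible reading of this paragraph, sketched in Python with NumPy. This is not the official SyncNet code. The function name `estimate_offset`, the `(T, D)` feature layout, the edge padding, the sign convention of the offset, and the definition of confidence as median minus minimum of the averaged distances are all assumptions made for illustration. `max_shift` stands for however many feature steps cover the ±1 second range. "Multiple samples" is interpreted here as every sample position in the clip, with the per-sample distance curves averaged before taking the minimum.

```python
import numpy as np

def estimate_offset(video_feats, audio_feats, max_shift):
    """Sliding-window AV offset estimate (illustrative sketch, not the official code).

    video_feats, audio_feats: arrays of shape (T, D), one feature row per
    sample position (assumed layout). Returns (offset, confidence, mean_dist).
    """
    n = video_feats.shape[0]
    shifts = np.arange(-max_shift, max_shift + 1)
    # Pad the audio features so every video sample sees all candidate shifts.
    padded = np.pad(audio_feats, ((max_shift, max_shift), (0, 0)), mode="edge")
    dists = np.empty((n, shifts.size))
    for i in range(n):
        # Audio features at positions i - max_shift .. i + max_shift.
        window = padded[i : i + 2 * max_shift + 1]
        dists[i] = np.linalg.norm(window - video_feats[i], axis=1)
    # Average over all samples: non-discriminative samples (e.g. silence)
    # contribute flat curves and are outweighed by the informative ones.
    mean_dist = dists.mean(axis=0)
    best = int(np.argmin(mean_dist))
    offset = int(shifts[best])  # audio[i + offset] best matches video[i] (assumed sign)
    # Assumed confidence: how far the minimum lies below the median distance.
    confidence = float(np.median(mean_dist) - mean_dist[best])
    return offset, confidence, mean_dist
```

Under these assumptions, a low confidence means the averaged distance curve is nearly flat, so the reported offset should not be trusted for that clip.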

I am missing something in this paragraph. How do I collect multiple samples for each clip?
I would also like to know how to obtain reliable values of the metrics (AV offset, confidence) that indicate how far out of sync the video and audio are in a given sample.

Thank you
