Speaker verification result #46

Open
pierfale opened this issue Jul 25, 2023 · 4 comments

Comments

@pierfale

pierfale commented Jul 25, 2023

Hello,

Thank you for your work on WavLM.
I tried to reproduce the results but ran into some difficulties.

First of all, I don't understand exactly the difference between the scores reported in different places. For instance, on Vox1-O:

Moreover, I tried to reproduce the result from the fine-tuned checkpoint available in this repository (https://drive.google.com/file/d/1-aE1NfzpRCLxA4GUxX9ITI3F9LlbtEGP/view?usp=sharing).

I get the following results on Vox1-O:

  • Without normalisation, I get EER = 0.558%
  • With s-norm, I get EER = 0.542%
  • With as-norm (cohort size = 600), I get EER = 0.505%
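
For reference, here is roughly how I score the trials and compute EER. This is only a minimal sketch of my own evaluation code, not the script from this repository; `embeddings` is a hypothetical dict of already-extracted WavLM speaker embeddings.

```python
import numpy as np
from sklearn.metrics import roc_curve

def cosine_score(emb1, emb2):
    # Cosine similarity between two speaker embeddings.
    return np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))

def compute_eer(labels, scores):
    # EER is the operating point where the false-acceptance rate (fpr)
    # equals the false-rejection rate (fnr).
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))
    return (fpr[idx] + fnr[idx]) / 2.0

# trials: list of (enroll_id, test_id, label) parsed from the Vox1-O trial list
# scores = [cosine_score(embeddings[e], embeddings[t]) for e, t, _ in trials]
# labels = [lab for _, _, lab in trials]
# print(f"EER = {100.0 * compute_eer(labels, scores):.3f}%")
```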

Do you have any more details to provide?

Thank you

@gozsoy

gozsoy commented Feb 13, 2024

I can confirm that I obtained EER = 0.558% on Vox1-O using the fine-tuned WavLM Large checkpoint.

@gancx

gancx commented Apr 25, 2024

(quoting @pierfale's original comment above)

I also observed these differences. Have you fixed it?

@RegulusBai

Same result here, 0.558%, and also waiting for a reply.

@tcourat

tcourat commented Sep 16, 2024

I have the same question.

I did not test it myself, but according to the original WavLM paper:

In the evaluation stage, the whole utterance is fed into the system to extract speaker embedding. We use cosine similarity to score the evaluation trial list. We also use the adaptive s-norm [59], [60] to normalize the trial scores. The imposter cohort is estimated from the VoxCeleb2 dev set by speaker-wise averaging all the extracted speaker embeddings. We set the imposter cohort size to 600 in our experiment. To further push the performance, we also introduce the quality-aware score calibration [58] for our best systems, where we randomly generate 30k trials based on the VoxCeleb2 test set to train the calibration model.

Maybe the reported results use their calibration model, but that calibration model was not shared. Without the quality-aware score calibration, the EER on Vox1-O rises from 0.383% to 0.617%, which may explain the gap.
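
For completeness, this is how I understand the adaptive s-norm step described in the paper. It is only a minimal sketch with my own function names (not code from this repository); `cohort` is assumed to be the matrix of speaker-wise averaged VoxCeleb2 dev embeddings, and the top-600 most similar cohort speakers are used, matching the cohort size mentioned above.

```python
import numpy as np

def as_norm(raw_score, enroll_emb, test_emb, cohort, top_k=600):
    # Adaptive s-norm: normalize the raw cosine score against the top-k
    # most similar impostor speakers for the enrollment and the test
    # embeddings, then average the two z-scores.
    def top_cohort_scores(emb):
        sims = cohort @ emb / (np.linalg.norm(cohort, axis=1) * np.linalg.norm(emb))
        return np.sort(sims)[-top_k:]  # keep the k closest impostors

    e = top_cohort_scores(enroll_emb)
    t = top_cohort_scores(test_emb)
    return 0.5 * ((raw_score - e.mean()) / e.std()
                  + (raw_score - t.mean()) / t.std())

# cohort: (num_speakers, dim) array, one averaged embedding per VoxCeleb2 dev speaker
# raw_score: cosine similarity of the trial pair before normalization
```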
