
Why are my reproduced WavLM results on Vox1-O 30% worse? #28

Closed
AIDman opened this issue Jun 18, 2022 · 3 comments

Comments

@AIDman

AIDman commented Jun 18, 2022

| model | EER (mine) | EER (official) |
| --- | --- | --- |
| wavlm_large_nofinetune.pth | 0.965 | 0.75 |
| wavlm_large_finetune.pth | 0.631 | 0.431 |

The above are the validation results of your shared WavLM models on the original Vox1-O trial list, without changing any code.
What might be the reason for this gap? Wrong settings?
Here is more background on my setup:

  1. Create a conda env:
conda create -n UniSpeech_py3p8 python=3.8
  2. Follow your guidance under https://github.com/microsoft/UniSpeech/tree/main/downstreams/speaker_verification and run:
pip install --require-hashes -r requirements.txt

The following error then appears:

Collecting numpy<1.23.0,>=1.16.5
ERROR: In --require-hashes mode, all requirements must have their versions pinned with ==. These do not:
    numpy<1.23.0,>=1.16.5 from https://files.pythonhosted.org/packages/2f/14/abc14a3f3663739e5d3c8fd980201d10788d75fea5b0685734227052c4f0/numpy-1.22.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=64f56fc53a2d18b1924abd15745e30d82a5782b2cab3429aceecc6875bd5add0 (from scipy==1.7.1->-r requirements.txt (line 1))

Then I installed the environment manually (around 30-40 packages), just as in #26.

  3. Here are some related details:
    pip list | grep fairseq
    fairseq 0.12.1 /home/user1/tools/fairseq
    pip list | grep s3prl
    s3prl 0.3.1
    torch.version: 1.9.0+cu102
    python -V: 3.8.13

Thanks for your wonderful work, and I am looking forward to your help.

@YuzaChongyi

YuzaChongyi commented Jun 30, 2022

In my experiment, the wavlm_large_finetune EER is 0.574.

@Sanyuan-Chen
Contributor

Hi @AIDman ,

As for the environment error, could you replace this line

self.feature_extract = torch.hub.load('s3prl/s3prl', feat_type)

with self.feature_extract = torch.hub.load('s3prl/s3prl:e52439edaeb1a443e82960e6401ae6ab4241def6', feat_type) and try again? The fairseq library is not necessary for running inference with the WavLM model. Older versions of s3prl automatically skip the ImportError raised by fairseq, but the latest s3prl code accidentally raises an ImportError.
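
A minimal sketch (not the repo's code) of the suggested change, assuming feat_type is an s3prl upstream name such as wavlm_large; pinning the commit in the hub spec avoids the fairseq import path:

```python
import torch

feat_type = 'wavlm_large'  # assumed upstream name; other s3prl upstreams exposed at this commit also work

# Pin torch.hub to the specific s3prl commit suggested above ('owner/repo:ref').
feature_extract = torch.hub.load(
    's3prl/s3prl:e52439edaeb1a443e82960e6401ae6ab4241def6',
    feat_type,
)
feature_extract.eval()

# s3prl upstreams take a list of 1-D waveforms (16 kHz); the exact output structure
# (dict of hidden states vs. list of features) depends on the s3prl version in use.
dummy_wav = torch.randn(16000)
with torch.no_grad():
    out = feature_extract([dummy_wav])
```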

As for the fine-tuning results for speaker verification, we use adaptive s-norm to normalize the trial scores and further apply the quality-aware score calibration introduced in Section V.C-3 of our WavLM paper.
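
For reference, adaptive s-norm is a standard technique rather than anything repo-specific: the raw trial score is z-normalized against the top-K cohort scores of the enrollment side and of the test side, and the two normalized scores are averaged. A generic sketch (not the authors' scoring code), assuming cohort scores have already been computed against a set of imposter embeddings:

```python
import numpy as np

def adaptive_snorm(raw_score, enroll_cohort_scores, test_cohort_scores, top_k=300):
    """Adaptive s-norm of a single trial score.

    enroll_cohort_scores / test_cohort_scores: 1-D arrays of cosine scores between the
    enrollment / test embedding and a cohort of imposter embeddings."""
    e_top = np.sort(enroll_cohort_scores)[::-1][:top_k]  # K closest cohort entries
    t_top = np.sort(test_cohort_scores)[::-1][:top_k]
    z_e = (raw_score - e_top.mean()) / (e_top.std() + 1e-8)
    z_t = (raw_score - t_top.mean()) / (t_top.std() + 1e-8)
    return 0.5 * (z_e + z_t)
```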

@WhXmURandom

> As for the fine-tuning results for speaker verification, we use adaptive s-norm to normalize the trial scores and further apply the quality-aware score calibration introduced in Section V.C-3 of our WavLM paper.

Can you provide the code for the quality-aware score calibration? Thank you!
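
Not the authors' implementation, but quality-aware score calibration is commonly done as a linear (logistic-regression) calibration of the normalized score together with quality features such as the enrollment/test utterance durations, fitted on a labeled development trial list. A generic sketch under those assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_calibrator(dev_scores, dev_quality, dev_labels):
    """dev_scores: (N,) s-normed trial scores; dev_quality: (N, Q) quality features
    (e.g. log durations of enrollment and test utterances); dev_labels: (N,) 0/1 targets."""
    feats = np.column_stack([dev_scores, dev_quality])
    return LogisticRegression().fit(feats, dev_labels)

def calibrate(calibrator, scores, quality):
    # The linear decision function serves as the calibrated score for EER / minDCF.
    feats = np.column_stack([scores, quality])
    return calibrator.decision_function(feats)
```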
