Inference #6

Open
codeghees opened this issue Jul 7, 2021 · 15 comments

Comments

@codeghees

Hi! Great work with this. I think I was able to reproduce your results.
@qute012
Two questions: what is the best way to run inference on the trained model? Do you have any sample?
Secondly, I was getting an error while fine-tuning a model trained on Google Speech Commands on my Urdu dataset:
cfg = convert_namespace_to_omegaconf(state_dict['args'])
The error was a KeyError: 'args' not found. What am I doing wrong?
I was passing the .pth model, and I checked that the model was being loaded.

Any help would be appreciated.
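
For reference, a quick way to check what the checkpoint actually contains (newer fairseq checkpoints store the config under a 'cfg' key instead of 'args', which would explain the KeyError; this is a hypothetical snippet, not from this repo):

import torch
from fairseq.dataclass.utils import convert_namespace_to_omegaconf

state_dict = torch.load('model.pth', map_location='cpu')
print(state_dict.keys())  # older fairseq checkpoints have 'args'; newer ones have 'cfg'

if 'cfg' in state_dict:
    cfg = state_dict['cfg']  # already an omegaconf config, no conversion needed
else:
    cfg = convert_namespace_to_omegaconf(state_dict['args'])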

@codeghees
Author

The test accuracy with 10 samples per keyword is over 94 percent. That sounds too good to be true.

@BeardyMan37

Hello, @codeghees. Could you please provide a requirements.txt file or a conda environment.yml file for the environment you used while reproducing the results? I tried to reproduce the results on the Google Speech Commands v2 dataset and ran into the same errors.

@dobby-seo
Owner

Hi~ @codeghees @BeardyMan37

Thanks for your interest in this project. Honestly, I can't afford to maintain this project right now, and I also can't access the server at the moment ;(
If I have time, I'd like to extend this project to support inference. For now, you can reproduce it by referring to the hyperparameters and model architecture.

Sorry 😐

@codeghees
Author

Can you point me in the right direction for inference?

@codeghees
Author

I can build it myself.

@BeardyMan37 I used Google Colab.

@dobby-seo
Owner

dobby-seo commented Jul 8, 2021

@codeghees

  1. Extract the loudest section
    This is the most important step for accuracy, because the model takes only 1 second of raw audio. So you have to check that the extracted signal actually contains voice.
def extract_loudest_section(self, wav, win_len=30):
    # Slide a window across the waveform and keep the 1-second
    # (16000 samples at 16 kHz) segment with the highest total energy.
    wav_len = len(wav)
    temp = abs(wav)

    st, et = 0, 0
    max_dec = 0

    for ws in range(0, wav_len, win_len):
        cur_dec = temp[ws:ws + 16000].sum()
        if cur_dec >= max_dec:
            max_dec = cur_dec
            st, et = ws, ws + 16000
        if ws + 16000 > wav_len:
            break

    return wav[st:et]
  2. Post-process (in fairseq)
    You don't need to normalize the raw audio. I think it has no effect; I just added it for the Wav2Vec 2.0 pipeline. I'm not sure, but it shouldn't matter if you remove this function.
import torch
import torch.nn.functional as F

def postprocess(self, feats, curr_sample_rate):
    # Collapse multi-channel audio to mono by averaging channels.
    if feats.dim() == 2:
        feats = feats.mean(-1)

    if curr_sample_rate != self.sample_rate:
        raise Exception(f"sample rate: {curr_sample_rate}, need {self.sample_rate}")

    assert feats.dim() == 1, feats.dim()

    # Optional layer norm over the whole waveform (Wav2Vec 2.0 convention).
    if self.normalize:
        with torch.no_grad():
            feats = F.layer_norm(feats, feats.shape)
    return feats
  3. Make a single batch to feed to the model (see the sketch below).

  4. Predict the class from the argmax of the model output.
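
Putting steps 1, 3, and 4 together, here is a minimal inference sketch. The CLASSES values and the assumption that the model takes a raw waveform batch and returns class logits are mine, not confirmed from this repo; extract_loudest_section is used as a free function (drop the self parameter):

import torch
import soundfile as sf

CLASSES = ['yes', 'no', 'up', 'down']  # hypothetical subset; use the repo's CLASSES list

def predict_keyword(model, wav_path):
    # Step 1: load 16 kHz mono audio and keep the loudest 1-second section.
    wav, sr = sf.read(wav_path)
    assert sr == 16000, f"expected 16 kHz audio, got {sr}"
    wav = extract_loudest_section(wav)  # free-function version of the method above

    # Step 3: make a single batch of shape (1, 16000).
    source = torch.from_numpy(wav).float().unsqueeze(0)

    # Step 4: forward pass, then argmax over the class logits.
    model.eval()
    with torch.no_grad():
        logits = model(source)  # assumed: returns a (1, num_classes) tensor
    idx = logits.argmax(dim=-1).item()
    return CLASSES[idx]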

@codeghees
Author

Also, how do we know which index represents which class, e.g. 0 for "UP"? Is that the position of the item in the index array?

@dobby-seo
Owner

@codeghees

Yes, right! Like any other simple classification method :D

@codeghees
Author

Oh, I meant: how do we know the mapping? Does that come from the CLASSES array?

Thanks!

@dobby-seo
Owner

dobby-seo commented Jul 8, 2021

Yes. If you can reproduce the training environment, could you open a PR for others?
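
In other words, the predicted index is just the position of the entry in the CLASSES array (hypothetical values below; the real ordering is the list defined in this repo):

CLASSES = ['up', 'down', 'left', 'right']  # hypothetical ordering
idx = 0
print(CLASSES[idx])  # 'up': index 0 is simply the first entry in CLASSES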

@codeghees
Author

I will go back and check; I just opened Colab and followed the instructions. What is the exact error, @BeardyMan37?

@BeardyMan37

Managed to resolve it. @codeghees

@BeardyMan37

BeardyMan37 commented Jul 8, 2021

@qute012 attaching both the requirements.txt and the environment.yml files for your reference.

@alirezafarashah

Hi! Great work with this. I think I was able to reproduce your results. @qute012 Two questions: what is the best way to run inference on the trained model? Do you have any sample? Secondly, I was getting an error while fine-tuning a model trained on Google Speech Commands on my Urdu dataset: cfg = convert_namespace_to_omegaconf(state_dict['args']) The error was a KeyError: 'args' not found. What am I doing wrong? I was passing the .pth model, and I checked that the model was being loaded.

Any help would be appreciated.

Hello @codeghees. I encountered the same error while trying to fine-tune a Hugging Face wav2vec model with fairseq. Have you found a method to convert a Hugging Face model (.bin) to a fairseq checkpoint (.pt)?

@salmaShahid

@codeghees, can you please guide me or share the link to your Colab file? I want to reproduce this result and apply the same strategy to the Urdu language.
