Does anyone have a config file that works well for training the ASR model on LibriSpeech 960h?
I can't seem to get anywhere near the ~4% WER reported by many research papers; my best so far is above 10%.
With the tools provided by this repository, there must surely be a way to reach that kind of WER.
It's been a while since I worked on speech-to-text. From memory, the best WER I ever reached was around 7%, but I don't remember the exact model parameters; I based them mainly on the SpecAugment paper.
What really helped bring the WER below 10% was larger batches. For that to work I used gradient accumulation (I think my batch size was 32 on 1 GPU, and with gradient accumulation I took the effective batch size to 512). Pretty easy to implement.
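For context, gradient accumulation just means summing gradients over several small batches and only calling the optimizer step once per group, so a per-GPU batch of 32 with 16 accumulation steps behaves like an effective batch of 512. A minimal PyTorch-style sketch, assuming the model returns a scalar loss (the function and parameter names here are illustrative, not from this repo, which may expose this as a config option instead):

```python
import torch

def train_epoch(model, optimizer, loader, accum_steps=16, device="cuda"):
    """Accumulate gradients over `accum_steps` mini-batches before stepping,
    giving an effective batch size of batch_size * accum_steps."""
    model.train()
    optimizer.zero_grad()
    for i, (inputs, targets) in enumerate(loader):
        inputs, targets = inputs.to(device), targets.to(device)
        loss = model(inputs, targets)      # assumes the model returns a scalar loss
        (loss / accum_steps).backward()    # scale so gradients average over the big batch
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```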