Config file for Librispeech 960h #37

Open
Youyoun opened this issue Aug 6, 2019 · 2 comments

Comments

@Youyoun

Youyoun commented Aug 6, 2019

Hi,

Does anyone have a config file that works well for training the ASR model on LibriSpeech 960h?
I can't get down to the ~4% WER reported in many research papers. My best so far is above 10%.
With the tools provided by this repository, there must be a way to reach a WER that low.

@Edresson

@Youyoun Were you able to improve your WER?

@Youyoun
Author

Youyoun commented May 22, 2020

Hey,

It's been a while since I worked on speech-to-text. From memory, the best WER I ever reached was around 7%, but I don't remember the exact model parameters. I based my setup mainly on the SpecAugment paper.

What really helped bring the WER below 10% was bigger batches. To make that work I used gradient accumulation (I think my batch size was 32 on 1 GPU, and with gradient accumulation I took the effective batch size to 512). It's pretty easy to implement.
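
For anyone unfamiliar with the trick, here is a minimal, framework-free sketch of gradient accumulation using the numbers above (micro-batch of 32, effective batch of 512, so 16 accumulation steps). It is not the code used for the ASR model: the scalar linear model, the synthetic data, and the names `mse_grad` and `accumulated_step` are assumptions made purely for illustration. It only shows that averaging the gradients of 16 micro-batches gives the same update as one batch of 512.

```python
import random

MICRO_BATCH = 32        # batch size that fits in memory per step
EFFECTIVE_BATCH = 512   # batch size we want each update to reflect
ACCUM_STEPS = EFFECTIVE_BATCH // MICRO_BATCH  # 16 micro-batches per update


def mse_grad(w, batch):
    """Gradient of the mean squared error of y ~ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_step(w, data, lr):
    """Apply one update built from ACCUM_STEPS micro-batch gradients."""
    grad = 0.0
    for i in range(ACCUM_STEPS):
        micro = data[i * MICRO_BATCH:(i + 1) * MICRO_BATCH]
        # Scale each micro-batch gradient so the sum is the mean over all 512 samples.
        grad += mse_grad(w, micro) / ACCUM_STEPS
    # The weight is updated only once, after all micro-batches are processed.
    return w - lr * grad, grad


# Synthetic toy data (hypothetical): y = 3x plus a little noise.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(EFFECTIVE_BATCH)]
data = [(x, 3.0 * x + rng.gauss(0.0, 0.1)) for x in xs]

full_grad = mse_grad(0.0, data)                      # one big batch of 512
new_w, acc_grad = accumulated_step(0.0, data, 0.1)   # 16 micro-batches of 32
print(abs(full_grad - acc_grad) < 1e-9)
```

In a deep learning framework the same idea usually means dividing each micro-batch loss by the number of accumulation steps, running the backward pass on every micro-batch, and calling the optimizer step (and zeroing gradients) only once per 16 micro-batches. Note that layers such as batch normalization still see only 32 samples at a time, so the equivalence is exact only for the gradient averaging itself.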

It took me 2 weeks to train on a single GPU.

Hope this helps.
