Does anyone have a config file that works well for training the ASR model on LibriSpeech 960h?
I can't seem to get anywhere near the ~4% WER reported by many research papers; my best so far is above 10%.
With the tools provided by this repository, there must surely be a way to reach that kind of WER.
It's been a while since I worked on speech-to-text. From memory, the best WER I ever reached was around 7%, but I don't remember the exact model parameters; I based them mainly on the SpecAugment paper.
What really helped bring the WER below 10% was larger batches. For that to work I used gradient accumulation (I think my batch size was 32 on 1 GPU, and with gradient accumulation I took the effective batch size to 512). Pretty easy to implement.
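For context, gradient accumulation just means summing gradients over several small batches and only calling the optimizer step once per group, so a per-GPU batch of 32 with 16 accumulation steps behaves like an effective batch of 512. A minimal PyTorch-style sketch, assuming the model returns a scalar loss (the function and parameter names here are illustrative, not from this repo, which may expose this as a config option instead):

```python
import torch

def train_epoch(model, optimizer, loader, accum_steps=16, device="cuda"):
    """Accumulate gradients over `accum_steps` mini-batches before stepping,
    giving an effective batch size of batch_size * accum_steps."""
    model.train()
    optimizer.zero_grad()
    for i, (inputs, targets) in enumerate(loader):
        inputs, targets = inputs.to(device), targets.to(device)
        loss = model(inputs, targets)      # assumes the model returns a scalar loss
        (loss / accum_steps).backward()    # scale so gradients average over the big batch
        if (i + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```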