Replicating WER on FLEURS #2076
michailmelonas started this conversation in General
Replies: 1 comment 1 reply
-
Hi! I'm the author of the referenced post. I think the python API doesn't set |
-
I'm having trouble replicating the WER reported in Table 13. In particular, on the Afrikaans test split of the FLEURS dataset, I get a WER of 110.59% when using the following snippet with the Tiny model:
This is much higher than the value of 91.2% given in the paper.
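For context on how a figure above 100% can arise, here is a minimal sketch of corpus-level WER computed from word-level edit distance. It is not the snippet referred to above and not the paper's evaluation code; the function names `edit_distance` and `corpus_wer` are hypothetical, and it assumes plain whitespace tokenization with no text normalization. Because inserted words count as errors while the denominator is the number of reference words, a hypothesis that is much longer than the reference can push WER past 100%.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, deletions, insertions)."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def corpus_wer(references: List[str], hypotheses: List[str]) -> float:
    """WER in percent: total edits divided by total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()  # assumption: whitespace tokenization only
        hyp_words = hyp.split()
        total_edits += edit_distance(ref_words, hyp_words)
        total_words += len(ref_words)
    return 100.0 * total_edits / total_words


if __name__ == "__main__":
    # Three inserted words against a two-word reference.
    print(corpus_wer(["a b"], ["a b c d e"]))
```

Since case and punctuation are not normalized here, scores from this sketch can differ from scores computed after normalization, so it is worth checking which normalization each number was produced with before comparing them.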
I also tried the whisper.transcribe.transcribe function (which uses various decoding strategies), but this gave a WER of 99.98% (which is still higher).
I'd appreciate any thoughts on what explains this difference. I see a similar point was raised in #702, but no answer has yet been provided.
Update: when using the whisper.transcribe.transcribe function, I get different results across multiple runs. Also, specifying the target language seems to improve the WER when using this function.
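One plausible source of the run-to-run variation is that transcribe falls back to sampling at higher temperatures by default, and that the language is auto-detected when none is given. The sketch below pins both; it is an assumption about what might help, not a confirmed explanation and not the paper's evaluation setup. The helper names `fixed_transcribe_options` and `transcribe_fixed` are hypothetical, while `language` and `temperature` are existing keyword arguments of transcribe.

```python
from typing import Any, Dict


def fixed_transcribe_options(language: str = "af") -> Dict[str, Any]:
    """Options that pin the language and disable temperature fallback."""
    return {
        "language": language,  # skip automatic language detection
        "temperature": 0.0,    # single value: no sampling at higher temperatures
    }


def transcribe_fixed(model: Any, audio_path: str, language: str = "af") -> str:
    """Transcribe one file with the pinned options and return the text.

    `model` is expected to be a loaded Whisper model, for example the
    result of whisper.load_model("tiny").
    """
    result = model.transcribe(audio_path, **fixed_transcribe_options(language))
    return result["text"].strip()
```

Usage would look like `transcribe_fixed(whisper.load_model("tiny"), "clip.wav")`, where "clip.wav" is a placeholder path. If results still differ between runs with these options, the variation is probably coming from somewhere else.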