Hey everyone!
I have a short question (maybe not so short, haha) about fine-tuning Whisper models on my own labelled data. Following the general Jupyter notebook on Hugging Face, there seem to be no preprocessing steps for the transcriptions that the model generates during training and that are used for evaluation over the course of training. Is it necessary to add this step to the training somehow?
What I mean:
For example, in the dataset I am using for fine-tuning, all letters are lowercase and there are no punctuation marks, only letter characters and whitespace. Now, during fine-tuning, the current model gets evaluated every 200 steps, for example. Isn't it possible that the model will generate output containing uppercase letters and punctuation characters, so the WER will be higher than it would be if the predictions were normalized after being generated?
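Concretely, here is a minimal sketch of the kind of normalization step I mean, applied to predictions before scoring. The function names, the WER implementation, and the example strings are illustrative, not taken from the notebook; in practice the normalization would go inside the metric-computation function used during evaluation:

```python
import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace --
    mirrors labels that contain only lowercase letters and spaces."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return re.sub(r"\s+", " ", text).strip()

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table for Levenshtein distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

label = "hello world this is a test"          # lowercase, no punctuation
pred = "Hello, world! This is a test."        # what the model might emit

raw_wer = wer(label, pred)                    # case/punctuation counted as errors
norm_wer = wer(label, normalize(pred))        # after normalization
```

Here `raw_wer` is 4/6 (four of six words mismatch purely because of case and punctuation), while `norm_wer` is 0.0, which is exactly the inflation I am worried about.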
Opinions and suggestions would be very helpful, thank you very much!