Transformer unable to predict double phonemes #20
Comments
Hi, I've updated the first post with reproduction code and new info. I hope it gets addressed. I suspected the loss was the most likely cause, but I wasn't able to confirm it.
Hi, thanks for mentioning the problem. I have actually seen this issue in the past but didn't have time to address it. The problem is not overfitting but the decoding of the CTC output to phonemes, where a deduplication happens. One would need to replace the deduplication with a better method that allows phoneme duplicates. In case you are interested in messing with the code, here is the function: DeepPhonemizer/dp/model/utils.py, line 38 in b8f1707
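To illustrate the kind of deduplication problem described above, here is a minimal sketch in plain Python. It is not the code from dp/model/utils.py; the blank symbol `_`, the function names, and the frame sequence are all made up for illustration. It contrasts a decoder that drops blanks before merging repeats with the usual CTC collapse, which merges consecutive repeats first and drops blanks afterwards, so that two identical phonemes separated by a blank frame survive.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol, not the library's actual token


def naive_dedup(frames):
    """Drop blanks first, then merge consecutive repeats: doubles are lost."""
    no_blanks = [t for t in frames if t != BLANK]
    return [t for t, _ in groupby(no_blanks)]


def ctc_collapse(frames):
    """Merge consecutive repeats first, then drop blanks:
    identical symbols separated by a blank frame are kept as a double."""
    merged = [t for t, _ in groupby(frames)]
    return [t for t in merged if t != BLANK]


# Made-up per-frame argmax output for "wholly"
frames = ["h", "h", "ə", "ʊ", "l", "_", "l", "l", "i"]
print("".join(naive_dedup(frames)))   # həʊli
print("".join(ctc_collapse(frames)))  # həʊlli
```

The second variant only helps if the model actually emits a blank between the two identical phonemes, which is what training with a CTC loss is expected to encourage.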
Got it. Any particular reason for not using cross-entropy in the transformer too?
Do you mean cross-attention?
Hello.
I found a bug where the transformer model is unable to learn sequences of two or more consecutive identical phonemes. I first noticed it for Italian, which has double consonants, and then reproduced it for English as well. Take the words holy and wholly as an example. According to WordReference, their RP (probably outdated) pronunciations should be həʊli and həʊlli respectively. I don't know how common the latter is with a geminated l sound, but it doesn't really matter. What matters is that even with char repeats set to 3 or 5 the transformer is unable to predict double phonemes.
It can be easily reproduced by running the run_training.py debug script with the default yaml file and this data:
Even in an extreme overfitting setup, the predictions will always be həʊli. Reproduction rate: 100%.
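As a sanity check that output length is not the limiting factor, here is a small sketch. It assumes that char repeats means each input character is repeated N times, so the model produces N output frames per input character, and it treats every IPA code point as one phoneme symbol. The helper name `min_ctc_frames` is hypothetical. A CTC-style output needs one frame per target symbol plus one blank frame between each pair of adjacent identical symbols.

```python
def min_ctc_frames(target):
    """Minimum number of frames needed to emit `target` with CTC:
    one per symbol, plus one blank between adjacent identical symbols."""
    repeats = sum(1 for a, b in zip(target, target[1:]) if a == b)
    return len(target) + repeats


word, target = "wholly", "həʊlli"
needed = min_ctc_frames(target)  # 6 symbols + 1 separating blank

for char_repeats in (1, 3, 5):
    available = len(word) * char_repeats
    print(f"char_repeats={char_repeats}: {available} frames available, "
          f"{needed} needed, enough={available >= needed}")
```

Under these assumptions, char repeats of 3 or 5 leave plenty of room for the extra blank, so the missing double phoneme would have to come from somewhere other than sequence length.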