After conversion, the timbre follows the source rather than the target #97
Comments
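Same here, and I used a lot of data too.
I also have 70 to 80 hours of data. Have you found the cause? I've been digging for a long time and can't tell where the problem is.
I have several thousand hours of data and it still doesn't work. Still looking.
OK. If you find the cause, could you let me know? Thanks a lot!
Check what your mel-loss is and whether it is decreasing.
I'm running experiments right now. Does each speaker in your data have roughly the same number of utterances, or do some speakers have much more data?
I hadn't counted that before. Here are the stats: utterance count range: 139-506
Try it with roughly the same number of utterances for each speaker.
Sure, when I have time. I'm busy with other things right now.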
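For anyone who wants to try the suggestion about evening out per-speaker utterance counts, here is a minimal sketch. It assumes the training list can be loaded as `(speaker_id, utterance_path)` pairs; the function name `balance_by_speaker` and that input layout are hypothetical and not part of this repository. Nothing in this thread confirms that balancing fixes the timbre problem; it only makes the experiment easy to run.

```python
import random
from collections import defaultdict


def balance_by_speaker(items, cap=None, seed=0):
    """Downsample so every speaker has at most `cap` utterances.

    items: iterable of (speaker_id, utterance_path) pairs (assumed layout).
    cap:   max utterances per speaker; defaults to the smallest speaker's count.
    """
    by_speaker = defaultdict(list)
    for speaker, path in items:
        by_speaker[speaker].append(path)
    if not by_speaker:
        return []

    if cap is None:
        cap = min(len(paths) for paths in by_speaker.values())

    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    balanced = []
    for speaker in sorted(by_speaker):
        paths = by_speaker[speaker]
        # Randomly keep up to `cap` utterances for this speaker.
        chosen = rng.sample(paths, min(cap, len(paths)))
        balanced.extend((speaker, p) for p in chosen)
    return balanced
```

With the counts reported above (139 to 506 utterances), the default would cut every speaker down to 139; passing a larger `cap` keeps more data at the cost of some remaining imbalance.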
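Hello, I replaced the ssl model with Chinese versions of wav2vec2 and hubert, then tried both retrain and fine-tune. Either way, the converted output has a timbre similar to the source rather than the target.
What could be causing this, and how should I fix it?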