Audio file format requirements #304
Comments
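This is clearly caused by the audio file being too long. As described in the ASRT project documentation, a single speech sample is currently limited to at most 16 seconds. Anything longer can easily make the model's input too large and cause an out-of-memory error, especially on a less powerful GPU. If you have longer recordings, first split them into shorter segments.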
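The splitting step suggested above could look roughly like the following. This is a minimal sketch, not part of ASRT: it uses only Python's standard `wave` module and assumes the input is an uncompressed PCM WAV file. The function name `split_wav` and the output naming scheme are hypothetical. It cuts at fixed 16-second boundaries, so a word may end up split across two chunks; cutting at silences would be preferable but is not shown here.

```python
import wave

MAX_SECONDS = 16  # per-utterance limit mentioned in the reply above


def split_wav(path, out_prefix, max_seconds=MAX_SECONDS):
    """Split a PCM WAV file into consecutive chunks of at most max_seconds.

    Hypothetical helper: chunks are written as <out_prefix>_000.wav,
    <out_prefix>_001.wav, and so on. Returns the list of written paths.
    """
    written = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        # Number of audio frames that fit into one chunk at this sample rate.
        frames_per_chunk = int(src.getframerate() * max_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # end of file reached
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channels, sample width and rate from the source;
                # the frame count in the header is corrected on close.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
            index += 1
    return written
```

Each resulting chunk can then be passed to the recognizer on its own, for example by looping over the list returned by `split_wav("long.wav", "long_part")`.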
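Hello, author. While running the ready-to-use code you provided (the project that ships with a pretrained model), I got the following error when predicting on my own audio:
I suspect the audio length is the problem. I had already preprocessed the audio with ffmpeg, as follows:
The sample rate should be fine, so I'm not sure whether the audio length is the cause. If it is, could you tell me how to normalize the input audio? I tried changing the length of the numpy array, but that didn't work either. I'd appreciate any guidance. Thank you very much!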