[tts] long utts: constituicao might not be ready to go #2

Open
cassiotbatista opened this issue Apr 23, 2023 · 0 comments
Labels
question Further information is requested

Comments

cassiotbatista commented Apr 23, 2023

There's a considerable mismatch in dataset characteristics between Constituicao and LJSpeech. Recordings in the former are long (20 s to 40 s), while those in the latter rarely go beyond 10 s, and I'm not sure whether this plays well with the FastSpeech 2 recipe. As a matter of fact, ESPnet's TTS recipe ignores audio longer than 20 s by default.
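A rough way to see how much of the corpus a 20 s cutoff would discard is sketched below. This is only an illustration: `wav_duration` and `split_by_duration` are hypothetical helper names, the 20 s default simply mirrors the figure mentioned above, and none of this is ESPnet's actual filtering code.

```python
import wave
from typing import Dict, List, Tuple


def wav_duration(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def split_by_duration(
    durations: Dict[str, float], max_seconds: float = 20.0
) -> Tuple[List[str], List[str]]:
    """Split utterance IDs into (kept, dropped) using a duration cutoff.

    `durations` maps an utterance ID to its length in seconds.
    The 20 s default is an assumption taken from the discussion above.
    """
    kept = sorted(u for u, d in durations.items() if d <= max_seconds)
    dropped = sorted(u for u, d in durations.items() if d > max_seconds)
    return kept, dropped
```

Feeding it a dictionary built with `wav_duration` over the Constituicao recordings would show how many utterances survive the cutoff before any training is attempted.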

A possible way forward would be to re-segment Constituicao so that individual utterances are shorter. MFA has been finding SILs in the middle of sentences quite often; in fact, the speaker pauses between titles and at the end of sentences. A VAD and a forced aligner (FA) would be of great help with that.
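The re-segmentation idea could look roughly like the sketch below, assuming the aligner output has already been loaded as a list of `(start, end, label)` word intervals. The silence labels, the `resegment` name, and the `max_seconds` / `min_pause` defaults are all assumptions made for illustration, not values taken from MFA or from any recipe.

```python
from typing import List, Tuple

Interval = Tuple[float, float, str]  # (start, end, label), times in seconds

# Assumption: labels the aligner uses for silence; adjust to the real output.
SILENCE_LABELS = {"", "sil", "sp", "SIL"}


def resegment(
    intervals: List[Interval],
    max_seconds: float = 10.0,
    min_pause: float = 0.3,
) -> List[Tuple[float, float, str]]:
    """Cut a long utterance at pauses, then merge short pieces greedily.

    Returns (start, end, text) segments. A stretch of speech with no
    pause of at least `min_pause` is never cut, so a segment may still
    exceed `max_seconds`.
    """
    # Step 1: break the word sequence at every sufficiently long silence.
    chunks: List[List[Interval]] = []
    current: List[Interval] = []
    for start, end, label in intervals:
        if label in SILENCE_LABELS:
            if current and end - start >= min_pause:
                chunks.append(current)
                current = []
            continue
        current.append((start, end, label))
    if current:
        chunks.append(current)

    # Step 2: merge neighbouring chunks while the span stays within the limit.
    segments: List[List[Interval]] = []
    for chunk in chunks:
        if segments and chunk[-1][1] - segments[-1][0][0] <= max_seconds:
            segments[-1].extend(chunk)
        else:
            segments.append(list(chunk))

    return [
        (seg[0][0], seg[-1][1], " ".join(word for _, _, word in seg))
        for seg in segments
    ]
```

The returned time stamps could then be used to cut the audio and rewrite the transcript per segment. A VAD could supply the pause positions instead of the aligner's silence intervals if those turn out to be unreliable.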

plot_scripts.zip

@cassiotbatista cassiotbatista changed the title long utts: constituicao might not be ready to go [tts] long utts: constituicao might not be ready to go Apr 23, 2023
@cassiotbatista cassiotbatista added the question Further information is requested label Apr 23, 2023