[tts] long utts: constituicao might not be ready to go #2

cassiotbatista · 2023-04-23T14:06:22Z

There's a considerable mismatch in dataset characteristics between Constituicao and LJSpeech. Audios in the former are longer (20s-40s), while those in the latter rarely go beyond 10s, and I'm not sure whether this plays nicely with FastSpeech 2's recipe. As a matter of fact, ESPnet's TTS recipe ignores audios longer than 20s by default.
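To get a feel for how many utterances would be dropped, a minimal sketch like the one below could list the files above a duration threshold. It assumes the corpus is a flat directory of PCM WAV files. The helper names `wav_duration` and `flag_long_utts` are hypothetical, and the 20 s default simply mirrors the figure quoted above rather than a value read from ESPnet's configuration.

```python
import wave
from pathlib import Path


def wav_duration(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def flag_long_utts(wav_dir, max_seconds=20.0):
    """Map each WAV file name in wav_dir to its duration if it exceeds max_seconds.

    The 20 s default is an assumption taken from the discussion above.
    """
    too_long = {}
    for path in sorted(Path(wav_dir).glob("*.wav")):
        duration = wav_duration(path)
        if duration > max_seconds:
            too_long[path.name] = duration
    return too_long
```

Calling `flag_long_utts("path/to/wavs")` returns a dict of the offending files and their durations; an empty dict means nothing would be filtered out at that threshold.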

A possible way forward would be to re-segment Constituicao to make individual utterances shorter. MFA (Montreal Forced Aligner) has been finding silences (SIL) in the middle of sentences quite often. In fact, the speaker pauses between titles and at the ends of sentences. A voice activity detector (VAD) and a forced aligner (FA) would be of great help with that.
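As a rough illustration of the re-segmentation idea, the sketch below greedily cuts one long utterance at the midpoints of silences so that each piece stays under a target duration. It is only a sketch under assumptions: the alignment is taken to be already loaded as a time-sorted list of `(start, end, label)` tuples, the silence labels (`""`, `"sil"`, `"sp"`) and the thresholds are guesses that should be checked against the aligner's actual output, and `split_at_silences` is a hypothetical helper, not part of MFA or ESPnet.

```python
def split_at_silences(intervals, max_seconds=10.0, min_sil=0.2,
                      sil_labels=("", "sil", "sp")):
    """Greedily cut an utterance at silence midpoints.

    intervals: time-sorted list of (start, end, label) tuples.
    Returns a list of (seg_start, seg_end) tuples covering the utterance.
    A segment can still exceed max_seconds when no usable silence exists.
    """
    if not intervals:
        return []
    utt_start, utt_end = intervals[0][0], intervals[-1][1]

    # Candidate cut points: midpoints of sufficiently long inner silences.
    # Leading and trailing intervals are skipped so no piece is pure silence.
    cuts = [(start + end) / 2
            for start, end, label in intervals[1:-1]
            if label in sil_labels and end - start >= min_sil]

    segments = []
    seg_start = utt_start
    last_cut = None
    for cut in cuts + [utt_end]:
        # Close the segment at the previous cut once this one would overshoot.
        if cut - seg_start > max_seconds and last_cut is not None:
            segments.append((seg_start, last_cut))
            seg_start = last_cut
        last_cut = cut
    segments.append((seg_start, utt_end))
    return segments
```

The resulting time spans could then be used to slice the audio and to split the transcript at the words adjacent to each cut. Cutting at the silence midpoint leaves a little padding on both sides of the boundary.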

plot_scripts.zip

cassiotbatista changed the title ~~long utts: constituicao might not be ready to go~~ [tts] long utts: constituicao might not be ready to go Apr 23, 2023

cassiotbatista added the question label Apr 23, 2023
