Question about the CER of analysis-synthesis and TTS system #6

Liujingxiu23 · 2024-10-17T12:09:47Z

In the paper, as shown in Figure 5 and claimed in Section 6.3, I do not understand why the CER of the LLM-based TTS system is lower than the CER of the analysis-synthesis audio, regardless of whether PQ, RQ, or OPQ is used. This seems counterintuitive.

hhguo · 2024-12-20T03:38:19Z

This is because the two CERs are measured on different test sets, so they are not directly comparable. The analysis-synthesis test set has greater diversity in noise level, recording environment, timbre, speaking style, and so on, which makes perfect reconstruction more challenging. The TTS test set uses relatively clean audio as references.
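For readers unfamiliar with the metric, here is a minimal sketch of how a corpus-level CER is typically computed: the character-level edit distance between the reference text and the ASR transcript, summed over the test set and divided by the total number of reference characters. The function names (`edit_distance`, `corpus_cer`) and the toy strings are illustrative assumptions, not the evaluation code or data used in the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_cer(pairs):
    """CER over (reference, hypothesis) pairs: total edits / total reference chars."""
    total_chars = sum(len(ref) for ref, _ in pairs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return total_errors / total_chars


# Hypothetical toy pairs, only to show the call shape.
toy_pairs = [("hello world", "hello word"), ("good morning", "good morning")]
print(corpus_cer(toy_pairs))
```

Because the score is normalized only by the references of the set it is computed on, a CER from a harder, noisier test set and a CER from a cleaner one measure different things, which is why a lower TTS number does not imply the TTS output is better than the analysis-synthesis reconstruction.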
