Training Pipeline + Steps for training TTS #26

Open
m-hamza-mughal opened this issue Sep 15, 2020 · 0 comments

m-hamza-mughal commented Sep 15, 2020

Hi,
Thanks for this clean and great implementation of MelNet.
I'm a beginner in speech synthesis, so could you please guide me through the steps for training MelNet for TTS?
What I know/assume:

  • Each tier is trained separately. For TTS, we set the tier flag to 1 and the tts flag to True.
  • For the subsequent tiers, we set the tier flag to 2, 3, 4, 5, and 6 in turn, with the tts flag set to False.
  • Finally, we put the checkpoints for each tier in inference.yaml and pass it to the MelNet class for prediction.
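
To make the plan above concrete, here is a minimal sketch of the per-tier settings as I understand them. It only builds a list of (tier, tts) pairs and does not call the repository's trainer; `TierRun` and `plan_runs` are hypothetical names I made up for illustration, and the six-tier count is taken from my assumptions above rather than from the code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TierRun:
    """One planned training run (hypothetical helper, not part of MelNet)."""
    tier: int  # tier number, starting at 1
    tts: bool  # whether the tts flag would be set for this run


def plan_runs(num_tiers: int = 6) -> List[TierRun]:
    """Return one run per tier, with tts enabled only for tier 1 (my assumption)."""
    if num_tiers < 1:
        raise ValueError("num_tiers must be at least 1")
    return [TierRun(tier=t, tts=(t == 1)) for t in range(1, num_tiers + 1)]


if __name__ == "__main__":
    for run in plan_runs():
        print(f"tier={run.tier} tts={run.tts}")
```

If this plan is wrong, for example if the tts flag should also be set for later tiers, only the `tts=(t == 1)` expression would need to change.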

Therefore I have some questions:

  • Can you provide/confirm the steps to train multiple tiers for the TTS option?

  • Are we supposed to train TTS with the --tts flag set to True while keeping tier number = 1?

  • What do you mean by this sentence in README.md?
    The -s flag is a boolean for determining whether to train a TTS tier. Since a TTS tier only differs at tier 1, this flag is ignored when [tier number] != 0.

    • Where in the code is the condition you refer to here: [tier number] != 0?
    • I assume this means we should ignore the tts flag when the tier number is 2 or higher?
  • How do the tts argument passed to the trainer and the tier number in the config file (YAML) relate to each other? Should they be consistent? If not, what is the difference?

  • How do we know that our model (for each tier) has converged? What is the minimum train/test loss value we should achieve? What was your training time, and on what GPU?

  • Lastly, can we generate Mel outputs from a subset of the trained tier models? For example, if we have the TTS model plus some consecutive tiers, can we run inference on them to check training performance?
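
To show why the README sentence about the -s flag confuses me, here is a small sketch comparing two readings of the condition. Both functions are hypothetical and written only for this question; neither is taken from the repository. The first follows the README text literally ([tier number] != 0), the second is my assumed reading (the flag only matters at tier 1).

```python
def flag_ignored_literal(tier: int) -> bool:
    """Literal README wording: the flag is ignored when tier != 0."""
    return tier != 0


def flag_ignored_assumed(tier: int) -> bool:
    """My assumed reading: the flag is ignored for every tier except tier 1."""
    return tier != 1


# Tiers are numbered 1 to 6 in my assumptions above.
TIERS = range(1, 7)

for tier in TIERS:
    print(
        f"tier={tier} "
        f"literal_ignored={flag_ignored_literal(tier)} "
        f"assumed_ignored={flag_ignored_assumed(tier)}"
    )
```

With tiers numbered from 1, the literal condition would ignore the flag for every tier, including tier 1, so I suspect the README meant != 1.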

m-hamza-mughal changed the title from "Training Pipeline Steps for training TTS]" to "Training Pipeline + Steps for training TTS" on Sep 15, 2020