Thank you for sharing this excellent repository with the community.
I have been trying to use the pre-trained model (the SE & LUT one trained on LJ and VCTK, https://drive.google.com/file/d/114z-cSEJHs8DdnIKnEE8pthIME6FprSM/view?usp=sharing) and the inference code provided in the repository to synthesize speech from given text with a speaker from VCTK.
However, the inference process requires a Token.yaml file to obtain the vocabulary. According to your instructions, it seems I can generate it by running Pattern_Generate.py, but before that I need to download the two datasets and then generate the patterns by providing the paths to LJ and VCTK, right?
Besides, I'm wondering whether it would be possible for you to share the Token.yaml file? This would greatly help me and other users run the pre-trained model for inference.
Also, could you please share the mapping between the speaker IDs in the LUT and the speaker names (e.g., p225) in VCTK?
Thank you in advance for your assistance, and I appreciate any help you can provide.
That is correct: you need to specify the paths of LJ and VCTK and generate the patterns. Originally, simple inference was supposed to work from the checkpoint file alone, without this step, but I forgot to upload the yaml file. I am sorry about that.
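For reference, here is a minimal sketch of how a character-level token vocabulary like Token.yaml can be built from transcripts; the exact token set, casing, and special symbols are assumptions, since the authoritative file comes from Pattern_Generate.py:

```python
# Minimal sketch: build a character-level token vocabulary from transcripts.
# Assumption: Token.yaml maps each symbol to an integer index; the exact
# normalization and special tokens used by Pattern_Generate.py may differ.
import yaml

def build_token_yaml(transcripts, path="Token.yaml"):
    symbols = sorted({ch for text in transcripts for ch in text.upper()})
    token_dict = {token: index for index, token in enumerate(symbols)}
    with open(path, "w") as f:
        yaml.dump(token_dict, f)
    return token_dict

if __name__ == "__main__":
    # Hypothetical transcripts standing in for the LJ and VCTK metadata.
    demo = ["PRINTING, IN THE ONLY SENSE.", "PLEASE CALL STELLA."]
    print(build_token_yaml(demo))
```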
Since the model was implemented a long time ago, the preprocessed dataset containing the related information seems to have been lost. So I have re-made the Token.yaml file through the code and uploaded it: Token.zip
As for the speaker-ID yaml file, my guess is that each speaker, including LJ, was written in ascending order. Even if I regenerated it, it is not clear whether the numbers would match the original training run, so sharing a regenerated file would not be very helpful. Sorry.
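For what it's worth, a minimal sketch of what that ascending-order guess would produce; the speaker list below is truncated and hypothetical, and the real mapping from the original training run may differ:

```python
# Sketch of the guessed LUT mapping: speaker names sorted in ascending order.
# Assumption: 'LJ' plus the VCTK speaker names (p225, p226, ...) are the keys.
speakers = sorted(['LJ'] + ['p{}'.format(n) for n in range(225, 230)])  # truncated for illustration
speaker_index = {name: idx for idx, name in enumerate(speakers)}
print(speaker_index)  # {'LJ': 0, 'p225': 1, 'p226': 2, ...}
```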
If you have any further requests or questions, please feel free to let me know. Thank you.
Thank you for your detailed reply! That really helps me a lot!
I've gotten past this issue, but issue #9 is still unsolved.
Actually, I tried to train the PWGAN with the hyperparameters from the Hyper_Parameters.yaml found in the pre-trained checkpoint.
The PWGAN itself works well on its own, but it still fails on the output mel-spectrograms of Glow_TTS, so I'm confused about what is happening. I'm not sure whether the sound hyperparameters are still mismatched.
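In case the audio configuration is the culprit, here is a small check for comparing the two configs; the `Sound` section name, its keys, and the file paths below are assumptions about the yaml layouts, not the actual schemas of the two repositories:

```python
# Sketch: compare the audio settings of the Glow-TTS and PWGAN configs.
# Assumption: both Hyper_Parameters.yaml files carry a 'Sound' section with
# keys such as sample rate, mel dimension, frame shift/length, and f_min/f_max.
import yaml

def load_sound_section(path):
    with open(path) as f:
        return yaml.safe_load(f).get('Sound', {})

glow = load_sound_section('GlowTTS/Hyper_Parameters.yaml')   # hypothetical paths
pwgan = load_sound_section('PWGAN/Hyper_Parameters.yaml')

for key in sorted(set(glow) | set(pwgan)):
    if glow.get(key) != pwgan.get(key):
        print(f'MISMATCH {key}: glow={glow.get(key)!r} pwgan={pwgan.get(key)!r}')
```

Besides these keys, the mel extraction itself (e.g., log base and any dynamic-range clipping or normalization) would also need to match between the two pipelines for the vocoder to accept the Glow_TTS outputs.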
I will close this issue, but I would really appreciate it if you could give me some hints about this vocoder.