TPGST reimplementation in PyTorch

Prerequisite

  • python 3.7
  • pytorch 1.3
  • librosa, scipy, tqdm, tensorboardX

Dataset

  • KSS, a Korean female single-speaker speech dataset.

Samples

Usage

  1. Download the dataset above and set its path in config.py. Then run the command below.

    python prepro.py
    
  2. Train the model. It needs 100k+ steps.

    python train.py <gpu_id>
    
  3. After training, you can synthesize some speech from text.

    python synthesize.py <gpu_id> <model_path>
    
  4. To listen to your samples, you may need a mel2wav vocoder. I didn't include a vocoder in this repo.

Notes

  • I think the difference between baseline Tacotron and TPGST is small on the KSS dataset.
  • I will be doing more experiments soon.