This is a PyTorch implementation of the papers *StarGAN-VC: Non-parallel Many-to-many Voice Conversion with Star Generative Adversarial Networks* and *Non-parallel Voice Conversion using Weighted Generative Adversarial Networks*.
The converted voice examples are in the `converted` directory.

The model was trained on the VCTK corpus with 70 speakers. The converted samples are somewhat noisy because of the VCTK recordings, but quality can be improved by training on cleaner databases.
- Python 3.5+
- pytorch 0.4.0+
- librosa
- pyworld
- tensorboardX
- scikit-learn
- tqdm
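Assuming a pip-based environment, the dependencies above can be installed with (note that PyTorch installs from PyPI as `torch`; exact package versions are left to the reader):

```shell
pip install torch librosa pyworld tensorboardX scikit-learn tqdm
```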
Download and unzip the VCTK corpus into the designated directories (the URL must be quoted so the shell does not interpret the `&`, and `-O` keeps the query string out of the file name):

```
mkdir ./data
wget -O VCTK-Corpus.zip "https://datashare.is.ed.ac.uk/bitstream/handle/10283/2651/VCTK-Corpus.zip?sequence=2&isAllowed=y"
unzip VCTK-Corpus.zip -d ./data
```

If the downloaded corpus is a `tar.gz` archive instead, run:

```
tar -xzvf VCTK-Corpus.tar.gz -C ./data
```
The data directory now looks like this:

```
data
├── vctk
│   ├── p225
│   ├── p226
│   ├── ...
│   └── p360
```
Extract features (MCEPs, F0, and aperiodicity) from each speech clip. The features are stored as `.npy` files, and the statistics for each speaker are computed as well.

```
python preprocess.py
```

This process may take a few minutes.
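The per-speaker statistics are typically the mean and standard deviation of log-F0 (and of each MCEP dimension) over all of a speaker's clips; these are needed later for normalization and pitch conversion. A minimal NumPy sketch (the helper name and array shapes are illustrative, not taken from `preprocess.py`):

```python
import numpy as np

def speaker_statistics(f0s, mceps):
    """Compute log-F0 and MCEP statistics for one speaker.

    f0s   : list of 1-D arrays, one F0 contour per clip (0 = unvoiced)
    mceps : list of 2-D arrays of shape (frames, n_mcep)
    """
    # Log-F0 statistics are computed over voiced frames only.
    voiced = np.concatenate([f0[f0 > 0] for f0 in f0s])
    log_f0 = np.log(voiced)
    all_mcep = np.concatenate(mceps, axis=0)  # (total_frames, n_mcep)
    return {
        "log_f0_mean": log_f0.mean(),
        "log_f0_std": log_f0.std(),
        "mcep_mean": all_mcep.mean(axis=0),  # per-dimension mean
        "mcep_std": all_mcep.std(axis=0),    # per-dimension std
    }
```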
After preprocessing, the data directory looks like this:

```
data
├── vctk (48 kHz data)
│   ├── p225
│   ├── p226
│   ├── ...
│   └── p360
├── vctk_16 (16 kHz data)
│   ├── p225
│   ├── p226
│   ├── ...
│   └── p360
├── mc
│   ├── train
│   └── test
```
Train the model:

```
python main.py
```
Convert a source speaker's voice to a target speaker's (e.g., p262 → p272) from a trained checkpoint:

```
python convert.py --src_spk p262 --trg_spk p272 --resume_iters 210000
```
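During conversion, F0 is commonly transformed with the logarithm-Gaussian normalized transformation, using the source and target speakers' log-F0 statistics from preprocessing. This is the standard choice in StarGAN-VC implementations; whether `convert.py` uses exactly this form is an assumption. A sketch:

```python
import numpy as np

def pitch_conversion(f0, mean_src, std_src, mean_trg, std_trg):
    """Map a source F0 contour to the target speaker's log-F0 distribution."""
    f0 = np.asarray(f0, dtype=np.float64)
    out = np.zeros_like(f0)
    voiced = f0 > 0  # unvoiced frames (F0 == 0) stay at 0
    # Normalize log-F0 by the source statistics, rescale to the target's.
    out[voiced] = np.exp(
        (np.log(f0[voiced]) - mean_src) / std_src * std_trg + mean_trg
    )
    return out
```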
Note:

- This implementation follows the network architecture described in the original StarGAN-VC paper, while the StarGAN-VC code uses StarGAN's network architecture.
- In our experiments, the converted audio from this implementation sounds better than that produced by the StarGAN-VC code.
StarGAN-VC code (Original Network Architecture)