Voice Conversion by using CycleGAN

MLSP Undergraduate Term Project, ITU
Project Page · Report · Presentation

Changelog

As referenced above, this project builds heavily on @leimao's work. The updates listed here were made mainly to reduce training time. After these updates, training one model took approximately 5.5 hours on two NVIDIA Tesla T4 GPUs. Two models were trained: one for female-to-female and one for male-to-female voice conversion. The detailed updates are:

TRAINING

Training hyper-parameters, audio processing settings, and default path variables are separated from the related scripts and gathered into 'hyparams.py'.
A 'decay_threshold' parameter is added to monitor the learning rate reduction over iterations.
In the reference implementation, the number of iterations depends on the number of training audio files, not on their length; in addition, the learning rate decays with iterations to help training converge. However, our dataset and file organization are different, and the old hyper-parameters cause learning to stop. Therefore, the figure below plots the new learning rates, adjusted to audio length, for both the generator and the discriminator over epochs and iterations.
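The learning-rate schedule described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name `decayed_learning_rate`, the linear shape of the decay, and every numeric value are assumptions made for the example. Only the idea of a `decay_threshold` after which the rate shrinks with each iteration comes from the text. The constants are written at module level in the way they might be gathered in a file such as 'hyparams.py'.

```python
# Hypothetical hyper-parameters, for illustration only.
GENERATOR_LR = 2e-4
DECAY_THRESHOLD = 1000                   # iterations before decay begins
DECAY_PER_ITER = GENERATOR_LR / 4000     # amount subtracted per iteration


def decayed_learning_rate(initial_lr, decay_per_iter, decay_threshold, iteration):
    """Return the learning rate to use at a given iteration.

    The rate stays constant until `decay_threshold` iterations have passed,
    then drops linearly by `decay_per_iter` per iteration, never below zero.
    """
    if iteration <= decay_threshold:
        return initial_lr
    steps_past_threshold = iteration - decay_threshold
    return max(0.0, initial_lr - decay_per_iter * steps_past_threshold)


for it in (0, 1000, 3000, 5000, 10000):
    lr = decayed_learning_rate(GENERATOR_LR, DECAY_PER_ITER, DECAY_THRESHOLD, it)
    print(f"iteration {it:>6}: lr = {lr:.6f}")
```

Under this sketch, a dataset that produces many more iterations per epoch reaches a zero learning rate after far fewer epochs, which is consistent with the stalled learning described above when the old values are reused unchanged.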

SPEED

Training per epoch was slow because of model-saving and validation operations, so a 'check_epoch' parameter is added to control them.
Validation functions for B-to-A conversion are removed (only A-to-B is needed).
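One way such a 'check_epoch' parameter can gate the expensive steps is sketched below. The helper names `should_check` and `train` are hypothetical, the training, saving, and validation steps are left as placeholder comments, and the choice to run the checks on every multiple of `check_epoch` is an assumption.

```python
def should_check(epoch, check_epoch):
    """Return True when saving and validation should run for this epoch."""
    return epoch % check_epoch == 0


def train(num_epochs, check_epoch):
    """Placeholder training loop; returns the epochs at which checks ran."""
    checked = []
    for epoch in range(1, num_epochs + 1):  # epochs are counted from 1
        # ... one epoch of training would run here ...
        if should_check(epoch, check_epoch):
            # ... save the model and run A-to-B validation here ...
            checked.append(epoch)
    return checked


print(train(num_epochs=10, check_epoch=5))
```

With `check_epoch=5`, saving and validation run on two of the ten epochs instead of all of them.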

FIX

The epoch range now starts from 1 instead of 0.

MISCELLANEOUS

All converted voices generated during validation are now stored; a '-CONV-#-EPOCH' suffix is added to converted voice filenames to track progress.
Elapsed time per epoch is now reported with millisecond precision.
Unused scripts and folders are removed.
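The filename suffix and the millisecond timing can be illustrated with a short sketch. The helper names `converted_filename` and `format_elapsed` are hypothetical, and two details are assumptions: that '#' stands for the epoch number, and that the suffix is placed before the file extension.

```python
import os
import time


def converted_filename(source_name, epoch):
    """Build an output name such as 'sample-CONV-25-EPOCH.wav'."""
    stem, ext = os.path.splitext(source_name)
    return f"{stem}-CONV-{epoch}-EPOCH{ext}"


def format_elapsed(seconds):
    """Format a duration in seconds as HH:MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


start = time.perf_counter()
# ... one epoch of training would run here ...
elapsed = time.perf_counter() - start

print(converted_filename("sample.wav", 25))
print(format_elapsed(elapsed))
```

Because the epoch number is part of the name, outputs from different epochs do not overwrite each other and can be compared side by side.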

