Speech Synthesis with Mamba

This is the notebook presented in Lukas Nel's blog, ported to the Determined AI platform using Detached Mode, with a few minor changes:

Performing some of the audio preprocessing steps using ffmpeg instead of moviepy
Padding the audio snippets with zeros, or truncating them, if a snippet is not exactly 10s long
Using Determined AI's Detached Mode for visualization and checkpointing instead of Weights & Biases
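
The pad-or-truncate step listed above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name `fix_length`, the 16 kHz sample rate, and the use of a 1-D NumPy array of samples are all assumptions made for the example.

```python
import numpy as np

SAMPLE_RATE = 16_000   # assumed sample rate; use whatever rate your snippets have
SNIPPET_SECONDS = 10   # snippet length stated in this README


def fix_length(samples: np.ndarray,
               sample_rate: int = SAMPLE_RATE,
               seconds: int = SNIPPET_SECONDS) -> np.ndarray:
    """Return a 1-D snippet with exactly sample_rate * seconds samples.

    Hypothetical helper: longer snippets are cut at the end,
    shorter ones are right-padded with zeros (silence).
    """
    target = sample_rate * seconds
    if len(samples) >= target:
        return samples[:target]
    return np.pad(samples, (0, target - len(samples)), mode="constant")
```

Padding on the right keeps the start of every snippet aligned, so all training examples end up with the same shape and can be batched directly.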

This script trains a Mamba model on audio snippets from a YouTube video or playlist link. After the model is trained, it generates new audio using a model checkpoint.
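
One way to cut downloaded audio into the snippets the script trains on is ffmpeg's segment muxer. The sketch below only builds the command line. Which preprocessing steps the notebook actually delegates to ffmpeg is not specified here, so the helper name `build_segment_command`, the output file pattern, and the mono 16 kHz settings are assumptions.

```python
from pathlib import Path


def build_segment_command(source, out_dir, seconds=10, sample_rate=16_000):
    """Build an ffmpeg argument list that splits `source` into fixed-length WAV files.

    Hypothetical helper: the file naming and the mono / 16 kHz settings are
    illustrative choices, not taken from the notebook.
    """
    pattern = str(Path(out_dir) / "snippet_%04d.wav")
    return [
        "ffmpeg",
        "-i", str(source),             # input audio file
        "-ac", "1",                    # downmix to mono
        "-ar", str(sample_rate),       # resample to the target rate
        "-f", "segment",               # use the segment muxer
        "-segment_time", str(seconds), # length of each output file in seconds
        pattern,                       # numbered output files
    ]


# To run it (requires ffmpeg on PATH and an existing output folder):
#   import subprocess
#   subprocess.run(build_segment_command("audio.wav", "snippets"), check=True)
```

The last segment is usually shorter than the requested length, which is why a pad-or-truncate step is still needed afterwards.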

How to run:

Install Determined AI on any platform
Open the Web UI and launch JupyterLab
Set the environment to determinedai/environments:cuda-11.8-pytorch-2.0-gpu-0.26.5 in "Advanced Configuration Settings"
Change filepath variables at the top of the notebook to match your filesystem and desired audio generation settings (YouTube link, folder names for downloaded audio + snippets).
After the model is trained, go to the "Test out model" section and change the checkpoint directory to your desired checkpoint before generating new audio.

Some sample audio files generated by a model trained on this YouTube video of Schmidt's Schmidttiest moments from New Girl are also included. These samples are memorized audio; better audio is described in my blog post.
