DOTN: Discriminator-Constrained Optimal Transport Network

This repository hosts the PyTorch code for the paper "Unsupervised Noise Adaptive Speech Enhancement by Discriminator-Constrained Optimal Transport" (NeurIPS 2021) by Hsin-Yi Lin, Huan-Hsin Tseng, Xugang Lu, and Yu Tsao.

Model

DOTN performs unsupervised domain adaptation for speech enhancement (SE), using optimal transport (OT) for domain alignment and a Wasserstein Generative Adversarial Network (WGAN) to govern the quality of the output speech.
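
As a toy illustration of the kind of quantity OT measures, the sketch below computes the 1-Wasserstein distance between two equal-sized one-dimensional samples, which reduces to the mean absolute difference of their sorted values. It is not the DOTN loss and is not taken from this repository; the function name wasserstein_1d is hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-sized 1-D samples.

    With uniform weights and equal sample sizes, the optimal transport plan
    in one dimension matches the i-th smallest value of xs to the i-th
    smallest value of ys, so the cost is the mean absolute difference of
    the sorted samples.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Toy "source" and "target" samples: the target is the source shifted by 1.
source = [0.0, 1.0, 2.0]
target = [1.0, 2.0, 3.0]
distance = wasserstein_1d(source, target)
```

A domain-alignment objective drives a distance of this kind toward zero, while the discriminator constrains what the aligned output may look like.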

Datasets & Preprocessing

- Voice Bank corpus (VCTK)

In Data_preprocessing/processing_VCTK_Demand:

Download clean_trainset_28spk_wav and clean_testset_wav (two subsets of VCTK) and place them together in one parent folder, e.g., VCTK_noisy.
Use the preselected DEMAND noise files in .../Data_preprocessing/processing_VCTK_Demand/DEMAND.
- Other noises from DEMAND (16-channel environmental noise recordings) can also be used, though this requires some modification.
Run step1_process_noisy_VCTK_16k.py to generate the training and testing datasets: set the VCTK and DEMAND (noise) paths in VCTK_path and noise_path, and select the desired noise types in source_noise and target_noise, e.g., source_noise = ["TBUS", "TCAR", "TMETRO"], target_noise = ["SCAFE"].
Convert the generated .wav files to .pt files with step2_convert_to_pt.py: set the .wav folder path in target_root.
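
The settings described in the steps above might look like the following once filled in. This is a sketch only: the two paths are placeholders, and the disjointness check is an added sanity check that is assumed here, not something the scripts are known to perform.

```python
# Placeholder locations; replace them with your own folders.
VCTK_path = "/path/to/VCTK_noisy"   # parent folder holding both VCTK subsets
noise_path = "/path/to/DEMAND"      # folder holding the DEMAND noise files

# Noise types for the source (training) and target (adaptation) domains,
# using the example values from the steps above.
source_noise = ["TBUS", "TCAR", "TMETRO"]
target_noise = ["SCAFE"]

# Assumed sanity check: the target noise should not already appear
# among the source noises.
overlap = set(source_noise) & set(target_noise)
if overlap:
    raise ValueError(f"noise types used in both domains: {sorted(overlap)}")
```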

- TIMIT Acoustic-Phonetic Continuous Speech Corpus

In Data_preprocessing/preprocessing_TIMIT:

Download the TIMIT corpus and set the TIMIT path in step1_generate_clean_files.py to generate the clean speech.
Set the path of the noise_types folder in step2_add_noise.py to mix the clean speech with noise.
Convert the generated .wav files to .pt files with step3_convert_to_pt.py.
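
For readers unfamiliar with the mixing step, the sketch below shows one common way to add noise to clean speech at a chosen signal-to-noise ratio (SNR). It is an illustration under stated assumptions, not the code in step2_add_noise.py: the function name mix_at_snr is hypothetical, and signals are plain lists of float samples.

```python
import math


def mix_at_snr(clean, noise, snr_db):
    """Return clean + gain * noise, with gain chosen to hit snr_db."""
    if not clean or len(noise) < len(clean):
        raise ValueError("noise must be at least as long as the clean signal")
    noise = noise[:len(clean)]  # trim the noise to the length of the speech

    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(x * x for x in noise) / len(noise)
    if clean_power == 0 or noise_power == 0:
        raise ValueError("both signals must have non-zero power")

    # SNR_dB = 10 * log10(P_clean / (gain**2 * P_noise)), solved for gain.
    gain = math.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


# Toy signals: at 0 dB the scaled noise has the same power as the speech.
clean = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]
noisy = mix_at_snr(clean, noise, snr_db=0)
```

Lower snr_db values give a larger gain and therefore a noisier mixture.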

Run DOTN

For both cases (VCTK/TIMIT), set the generated data paths data_path and pt_data_path in the corresponding main.py, then run python main.py.
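
Before launching a long training run, it can help to confirm that both paths exist. The snippet below is a small optional check, not part of the repository: the helper missing_dirs is hypothetical, the paths are placeholders, and it assumes that data_path and pt_data_path both refer to directories.

```python
from pathlib import Path


def missing_dirs(*paths):
    """Return the given paths that are not existing directories."""
    return [str(p) for p in map(Path, paths) if not p.is_dir()]


# Placeholder values; use the same ones you set in main.py.
data_path = "/path/to/generated/wav"
pt_data_path = "/path/to/generated/pt"

problems = missing_dirs(data_path, pt_data_path)
if problems:
    print("Fix these paths in main.py before running:", problems)
```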

Prerequisites

Hardware

NVIDIA V100 (32 GB CUDA memory) and 4 CPUs.
