This repository hosts the PyTorch code for the paper "On the robustness of non-intrusive speech quality model by adversarial examples" (ICASSP 2023) by Hsin-Yi Lin, Huan-Hsin Tseng, and Yu Tsao.
This work shows that deep speech quality predictors are vulnerable to adversarial perturbations: imperceptible changes to the input can drastically alter the predicted score. Beyond exposing this vulnerability, we explore and confirm the viability of adversarial training for strengthening model robustness.
Datasets:
- Voice Bank corpus (VCTK)
- TIMIT Acoustic-Phonetic Continuous Speech Corpus
- DNS Challenge speech corpus
Stage 1 (adversarial attack):
- Set the data paths, ONNX model path, saved-model path, and output path.
- Adjust the score transform if needed (in attack_modules.py).
- Run stage1_attack.py.
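The attack logic lives in attack_modules.py; as a rough illustration of the underlying idea (not the repository's actual implementation), the sketch below runs a standard projected-gradient attack that searches for a small perturbation delta, bounded in infinity norm, which drags a differentiable quality predictor's output toward an attacker-chosen target score. The function name, hyperparameters, and the stand-in predictor are all hypothetical.

```python
import torch

def pgd_attack(predictor, wav, target_score, eps=1e-3, alpha=2e-4, steps=40):
    """Hypothetical PGD sketch: find a small perturbation delta
    (||delta||_inf <= eps) that pushes the predicted quality score
    toward target_score while staying unnoticeable."""
    delta = torch.zeros_like(wav, requires_grad=True)
    for _ in range(steps):
        score = predictor(wav + delta)
        loss = (score - target_score) ** 2   # distance to the attacker's target
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # signed-gradient descent step
            delta.clamp_(-eps, eps)             # project back into the eps-ball
        delta.grad.zero_()
    return delta.detach()
```

Because the update uses only the sign of the gradient and a final clamp, the perturbation is guaranteed to stay within the inaudibility budget eps regardless of the predictor's scale.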
Stage 2 (model enhancement via adversarial training):
- Set the data paths, ONNX model path, perturbation paths, and saved-model path.
- Run stage2_enhancement.py.
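Stage 2 fine-tunes the predictor on the perturbations produced in stage 1. As a minimal, hypothetical sketch (again, not the code in stage2_enhancement.py), one training step can expose the model to both the clean utterance and its adversarially perturbed copy and penalize deviation from the same target score for both:

```python
import torch

def adversarial_training_step(model, optimizer, wav, perturbation, target):
    """Hypothetical sketch of one adversarial-training step: the predictor
    is trained to assign the same target score to a clean utterance and
    to its adversarially perturbed copy."""
    optimizer.zero_grad()
    loss = (((model(wav) - target) ** 2).mean()
            + ((model(wav + perturbation) - target) ** 2).mean())
    loss.backward()
    optimizer.step()
    return loss.item()
```

Training on the clean/perturbed pair jointly is what discourages the model from reacting to the small perturbation at all, which is the robustness property the paper evaluates.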
Requirements:
- Python 3.7
- PyTorch 1.11
- librosa 0.9
- tensorboardX 2.5
- scikit-learn 1.0
- tqdm 4.64
- numpy 1.21
- torchaudio 0.11
- scipy 1.6
- audioread 2.1
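The list above can be captured in a requirements file; the pins below simply restate the versions given here (patch-level wildcards are an assumption, since only major.minor versions are stated):

```
torch==1.11.*
torchaudio==0.11.*
librosa==0.9.*
tensorboardX==2.5.*
scikit-learn==1.0.*
tqdm==4.64.*
numpy==1.21.*
scipy==1.6.*
audioread==2.1.*
```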
Hardware: experiments were run on an NVIDIA V100 GPU (32 GB CUDA memory) and 4 CPUs.