Speech Synthesising based Attack for Automatic Speech Recognition

Our paper has been accepted at KDD 2022.
Generated audio demos are available on our web page, SSA_Demo_page.
Our paper has been highlighted and reported by several Chinese media outlets, including PaperWeekly, 语音之家, 深科技, and 火山引擎.

Abstract

Adversarial examples in automatic speech recognition (ASR) sound natural to humans yet are capable of fooling well-trained ASR models into transcribing incorrectly. Existing audio adversarial examples are typically constructed by adding constrained perturbations to benign audio inputs. Such attacks are therefore generated under an audio-dependent assumption. For the first time, we propose the Speech Synthesising based Attack (SSA), a novel threat model that constructs audio adversarial examples entirely from scratch, i.e., without depending on any existing audio, to fool cutting-edge ASR models. To this end, we introduce a conditional variational auto-encoder (CVAE) as the speech synthesiser. Meanwhile, an adaptive sign gradient descent algorithm is proposed to solve the adversarial audio synthesis task. Experiments on three datasets (i.e., Audio MNIST, Common Voice, and LibriSpeech) show that our method can synthesise natural-sounding audio adversarial examples that mislead state-of-the-art ASR models.
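To give a feel for what a sign gradient descent update with an adaptive step size can look like, here is a minimal sketch on a toy quadratic loss. It is not the algorithm from the paper: the function names (`adaptive_sign_descent`, `toy_loss_and_grad`), the step-halving rule, and the toy loss are all assumptions made for illustration. In the actual attack the loss would come from the ASR model and the synthesiser, which are not modelled here.

```python
import numpy as np

def adaptive_sign_descent(loss_and_grad, z0, step=0.5, decay=0.5,
                          min_step=1e-4, max_iters=200):
    """Minimise a loss using sign-of-gradient steps.

    Assumed adaptation rule (illustrative only): keep the step size while a
    move lowers the loss, otherwise reject the move and shrink the step.
    """
    z = np.asarray(z0, dtype=float).copy()
    best_loss, grad = loss_and_grad(z)
    for _ in range(max_iters):
        if step < min_step:
            break
        # Move every coordinate by the same magnitude, using only the gradient sign.
        candidate = z - step * np.sign(grad)
        cand_loss, cand_grad = loss_and_grad(candidate)
        if cand_loss < best_loss:
            z, best_loss, grad = candidate, cand_loss, cand_grad
        else:
            step *= decay  # the move did not help: try a smaller step
    return z, best_loss

def toy_loss_and_grad(z):
    """Hypothetical stand-in for an attack loss: squared distance to a target."""
    target = np.array([1.0, -2.0, 0.5])
    diff = z - target
    return float(np.sum(diff ** 2)), 2.0 * diff

z_final, final_loss = adaptive_sign_descent(toy_loss_and_grad, np.zeros(3))
```

Using only the sign of the gradient makes the update insensitive to gradient scale, which is why the step size has to be adapted separately.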

Build the environment

bash env_build.sh

Check that the model loads properly

python3 cvae_attack_model_prepare.py

Run the speech synthesis attack with the command below.

python3 cvae_attack_mnist.py
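The three commands above can also be chained so that a failure in an earlier step stops the later ones. The helper below is a hypothetical convenience wrapper, not part of the repository; `run_steps` and `STEPS` are names introduced here, and it assumes the commands are run from the repository root.

```python
import subprocess

# The three commands from this README, in order.
STEPS = [
    ["bash", "env_build.sh"],
    ["python3", "cvae_attack_model_prepare.py"],
    ["python3", "cvae_attack_mnist.py"],
]

def run_steps(steps, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    `runner` is injectable so the function can be exercised without
    actually launching the scripts.
    """
    completed = []
    for cmd in steps:
        result = runner(cmd)
        if result.returncode != 0:
            raise RuntimeError(f"Step failed: {' '.join(cmd)}")
        completed.append(cmd)
    return completed
```

Calling `run_steps(STEPS)` from the repository root would then build the environment, check the model, and launch the attack in sequence.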
