Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition
[article]
2020
arXiv
pre-print
We then demonstrate the application of this algorithm for adversarial training to obtain a more robust ASR model. ...
Recent advances in Automatic Speech Recognition (ASR) demonstrated how end-to-end systems are able to achieve state-of-the-art performance. ...
In speech recognition domain, working Audio Adversarial Examples (AAEs) were already demonstrated for CTC-based [5] , as well as for attention-based ASR systems [26] . ...
arXiv:2007.10723v1
fatcat:prou6j5d2famxkpr3nv6qk374e
MP3 Compression To Diminish Adversarial Noise in End-to-End Speech Recognition
[article]
2020
arXiv
pre-print
To this end, we generated AAEs with the Fast Gradient Sign Method for an end-to-end, hybrid CTC-attention ASR system. ...
Audio Adversarial Examples (AAE) represent specially created inputs meant to trick Automatic Speech Recognition (ASR) systems into misclassification. ...
FGSM was already applied in the context of end-to-end ASR to DeepSpeech [13, 7] , a CTC-based speech recognition system, as well as for the attention-based system called Listen-Attend-Spell (LAS) [8, ...
arXiv:2007.12892v1
fatcat:r3igokgsivhqvoecbiswxklphe
Adversarial Regularization for End-to-End Robust Speaker Verification
2019
Interspeech 2019
Next, we propose to train an end-to-end robust SV model using the two proposed adversarial examples for model regularization. ...
It has been shown in image as well as speech applications that deep neural networks are vulnerable to adversarial examples. ...
Adversarial examples can be used not only for attacking, but also for improving robustness of speech recognition systems. ...
doi:10.21437/interspeech.2019-2983
dblp:conf/interspeech/WangGSXH19
fatcat:nanpmqzqujhodouehgtoa4ytme
Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition
[article]
2022
arXiv
pre-print
In this work, we aim to enhance the system robustness of end-to-end automatic speech recognition (ASR) against adversarially-noisy speech examples. ...
to the current model enhancement methods against the adversarial speech examples. ...
The authors thank Ariya Rastrow, Mat Hans and Björn Hoffmeister from Alexa AI for their valuable comments and discussion. ...
arXiv:2202.08532v1
fatcat:c373lqa6gfghbb6a2ktjcalgdi
AIPNet: Generative Adversarial Pre-training of Accent-invariant Networks for End-to-end Speech Recognition
[article]
2019
arXiv
pre-print
In this paper, our goal is to build a unified end-to-end speech recognition system that generalizes well across accents. ...
We further fine-tune AIPNet by connecting the accent-invariant module with an attention-based encoder-decoder model for multi-accent speech recognition. ...
In the fine-tuning stage, we adopt an attention-based encoder-decoder model for sequence-to-sequence speech recognition. ...
arXiv:1911.11935v1
fatcat:do2blvazhfdntoshwwwo4dran4
Introduction to the Special Issue "Speaker and Language Characterization and Recognition: Voice Modeling, Conversion, Synthesis and Ethical Aspects"
2019
Computer Speech and Language
In their article End-to-end DNN Based Text-Independent Speaker Recognition for Long and Short Utterances, Rohdin et al. proposed to mimic an i-vector/PLDA system using an end-to-end neural network to address ...
In their article Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition, Novotný et al. report the results of a detailed analysis of speaker verification noise robustness. ...
doi:10.1016/j.csl.2019.101021
fatcat:mpw674uefrbuxmrfmbvvcyphwi
2019 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 27
2019
IEEE/ACM Transactions on Audio Speech and Language Processing
., +, TASLP March 2019 496-506 Linguistics Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition. ...
., +, TASLP Sept. 2019 1455-1468 Gradient methods Adversarial Regularization for Attention Based End-to-End Robust Speech Recognition. ...
doi:10.1109/taslp.2020.2971902
fatcat:j66uwjyqlfbmtgda6zhzlswpva
Robust Speech Recognition Using Generative Adversarial Networks
[article]
2017
arXiv
pre-print
This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. ...
We show the new approach improves simulated far-field speech recognition of vanilla sequence-to-sequence models without specialized front-ends or preprocessing. ...
For speech, [18] proposes a GAN based speech enhancement method called SEGAN but without the end goal of speech recognition. ...
arXiv:1711.01567v1
fatcat:6poefrkl65gp5fjryydnaa2fi4
End-to-end Domain-Adversarial Voice Activity Detection
[article]
2020
arXiv
pre-print
To that end, a domain classification branch is added to the network and trained in an adversarial manner. ...
In the in-domain scenario where the training and test sets cover the exact same domains, we show that the domain-adversarial approach does not degrade performance of the proposed end-to-end model. ...
We show that end-to-end voice activity detection leads to a significant improvement compared to models based on handcrafted features. ...
arXiv:1910.10655v2
fatcat:utt3bgphhzdfdbtmzred7nvjqy
Speaker Adaptation for Attention-Based End-to-End Speech Recognition
2019
Interspeech 2019
We propose three regularization-based speaker adaptation approaches to adapt the attention-based encoder-decoder (AED) model with very limited adaptation data from target speakers for end-to-end automatic ...
speech recognition. ...
Introduction Recently, remarkable progress has been made in end-to-end (E2E) automatic speech recognition (ASR) with the advance of deep learning. ...
doi:10.21437/interspeech.2019-3135
dblp:conf/interspeech/MengGLG19
fatcat:e7n3np6ibraibotx5hqbgndele
Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives [Review Article]
2019
IEEE Computational Intelligence Magazine
As a potentially crucial technique for the development of the next generation of emotional AI systems, we herein provide a comprehensive overview of the application of adversarial training to affective ...
Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains. ...
[43] utilized CNNs pre-trained on large amounts of image data to extract robust feature representations for speech-based emotion recognition. ...
doi:10.1109/mci.2019.2901088
fatcat:edkvfgy3ofgufcytngf5mktpae
Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems
[article]
2020
arXiv
pre-print
In this paper, we propose the first real-time, universal, and robust adversarial attack against the state-of-the-art deep neural network (DNN) based speaker recognition system. ...
Through adding an audio-agnostic universal perturbation on arbitrary enrolled speaker's voice input, the DNN-based speaker recognition system would identify the speaker as any target (i.e., adversary-desired ...
RELATED WORK Adversarial Attack on Speech Recognition. ...
arXiv:2003.02301v2
fatcat:vzv2zftbtrhuxnwjcnxx4ymmty
Boosting Noise Robustness of Acoustic Model via Deep Adversarial Training
[article]
2018
arXiv
pre-print
To alleviate this issue, the commonest way is to use a well-designed speech enhancement approach as the front-end of ASR. ...
In this paper, we propose an adversarial training method to directly boost noise robustness of acoustic model. ...
To the best of our knowledge, using GANs for robust speech recognition has not yet been studied, so our method is the first approach to use the adversarial training framework for robust speech recognition ...
arXiv:1805.01357v1
fatcat:o3wpy2tn55h3plhzdjk2fritxy
End-to-End Domain-Adversarial Voice Activity Detection
2020
Interspeech 2020
To that end, a domain classification branch is added to the network and trained in an adversarial manner. ...
In the in-domain scenario where the training and test sets cover the exact same domains, we show that the domain-adversarial approach does not degrade performance of the proposed end-to-end model. ...
We would like to thank Neville Ryant for providing the speaker diarization output of the winning submission to DIHARD 2019.
doi:10.21437/interspeech.2020-2285
dblp:conf/interspeech/LavechinGBBG20
fatcat:ox2ibrrxhjgttbqibo2c53bgkm
Adversarial Separation Network for Speaker Recognition
2020
Interspeech 2020
Our proposed AS-Net is featured by its ability to separate adversarial perturbation from the test speech to restore the natural clean speech. ...
However, it is observed that DNN based systems are easily deceived by adversarial examples leading to wrong predictions. ...
Introduction The goal of speaker recognition is to determine the identity of a person through speech. Both the safety and robustness of speaker recognition systems have attracted much attention. ...
doi:10.21437/interspeech.2020-1966
dblp:conf/interspeech/ZhangWZLLW20
fatcat:k7wt2xqjjjdslkddh7nb3yir44