Robust Speech Recognition Using Generative Adversarial Networks [article]

Anuroop Sriram, Heewoo Jun, Yashesh Gaur, Sanjeev Satheesh
2017 arXiv   pre-print
This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition.  ...  We show the new approach improves simulated far-field speech recognition of vanilla sequence-to-sequence models without specialized front-ends or preprocessing.  ...  In this work, we employ the generative adversarial network (GAN) framework [5] to increase the robustness of seq-to-seq models [6] in a scalable, end-to-end fashion.  ... 
arXiv:1711.01567v1 fatcat:6poefrkl65gp5fjryydnaa2fi4

Adversarial Machine Learning And Speech Emotion Recognition: Utilizing Generative Adversarial Networks For Robustness [article]

Siddique Latif, Rajib Rana, Junaid Qadir
2018 arXiv   pre-print
We also explore potential defenses including adversarial training and generative adversarial network (GAN) to enhance robustness.  ...  Deep learning has undoubtedly offered tremendous improvements in the performance of state-of-the-art speech emotion recognition (SER) systems.  ...  Using Generative Adversarial Network Generative adversarial networks (GANs) [28] are deep models that learn to generate samples, ideally indistinguishable from the real data x, that are supposed to belong  ... 
arXiv:1811.11402v2 fatcat:ykjjg43e2rb7lkbxidv72o7uqq

Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition

Bin Liu, Shuai Nie, Shan Liang, Wenju Liu, Meng Yu, Lianwu Chen, Shouye Peng, Changliang Li
2019 Interspeech 2019  
With the joint optimization of the recognition, enhancement and adversarial loss, the compositional scheme is expected to learn more robust representations for the recognition task automatically.  ...  Recently, the end-to-end system has made significant breakthroughs in the field of speech recognition.  ...  In addition, generative adversarial nets (GANs) [16] have been applied to speech enhancement [17, 18] and robust ASR [19, 20], where the generator synthesizes increasingly more realistic data in  ... 
doi:10.21437/interspeech.2019-1242 dblp:conf/interspeech/LiuNLLYCPL19 fatcat:obprssglwrg55bl64lwgr4nc6e

Adversarial Separation Network for Speaker Recognition

Hanyi Zhang, Longbiao Wang, Yunchun Zhang, Meng Liu, Kong Aik Lee, Jianguo Wei
2020 Interspeech 2020  
In this study, we propose the adversarial separation network (AS-Net) to protect the speaker recognition system against adversarial attacks.  ...  Experimental results on the VCTK dataset demonstrated that the AS-Net effectively enhanced the robustness of speaker recognition systems against adversarial examples.  ...  APE-GAN [24] used generative adversarial network (GAN) [27] to eliminate adversarial perturbations.  ... 
doi:10.21437/interspeech.2020-1966 dblp:conf/interspeech/ZhangWZLLW20 fatcat:k7wt2xqjjjdslkddh7nb3yir44

Robust Audio Adversarial Example for a Physical Attack [article]

Hiromu Yakura, Jun Sakuma
2019 arXiv   pre-print
We propose a method to generate audio adversarial examples that can attack a state-of-the-art speech recognition model in the physical world.  ...  In contrast, our method obtains robust adversarial examples by simulating transformations caused by playback or recording in the physical world and incorporating the transformations into the generation  ...  To the best of our knowledge, this is the first approach to successfully generate audio adversarial examples for speech recognition models that use a recurrent network in the physical world.  ... 
arXiv:1810.11793v3 fatcat:hznszamzbjbzxfhe3ukqtw6rla

Adversarial Multi-Task Learning of Deep Neural Networks for Robust Speech Recognition

Yusuke Shinohara
2016 Interspeech 2016  
A method of learning deep neural networks (DNNs) for noise robust speech recognition is proposed.  ...  In this paper, we propose adversarial multi-task learning of DNNs for explicitly enhancing the invariance of representations.  ...  There has been no work, to the best of our knowledge, that used adversarial multi-task learning for speech recognition.  ... 
doi:10.21437/interspeech.2016-879 dblp:conf/interspeech/Shinohara16a fatcat:p7hrt75grjcx7cwvaaagywhqom

Boosting Noise Robustness of Acoustic Model via Deep Adversarial Training [article]

Bin Liu, Shuai Nie, Yaping Zhang, Dengfeng Ke, Shan Liang, Wenju Liu
2018 arXiv   pre-print
Specifically, a jointly compositional scheme of generative adversarial net (GAN) and neural network-based acoustic model (AM) is used in the training phase.  ...  The joint optimization of generator, discriminator and AM concentrates the strengths of both GAN and AM for speech recognition.  ...  To the best of our knowledge, using GANs for robust speech recognition has not yet been studied, so our method is the first approach to use the adversarial training framework for robust speech recognition  ... 
arXiv:1805.01357v1 fatcat:o3wpy2tn55h3plhzdjk2fritxy

Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition [article]

Chao-Han Huck Yang, Zeeshan Ahmed, Yile Gu, Joseph Szurley, Roger Ren, Linda Liu, Andreas Stolcke, Ivan Bulyko
2022 arXiv   pre-print
In this work, we aim to enhance the system robustness of end-to-end automatic speech recognition (ASR) against adversarially-noisy speech examples.  ...  We propose an advanced Bayesian neural network (BNN) based adversarial detector, which could model latent distributions against adaptive adversarial perturbation with divergence measurement.  ...  speech recognition.  ... 
arXiv:2202.08532v1 fatcat:c373lqa6gfghbb6a2ktjcalgdi

Adjust-free adversarial example generation in speech recognition using evolutionary multi-objective optimization under black-box condition [article]

Shoma Ishida, Satoshi Ono
2020 arXiv   pre-print
Some studies have attempted to attack neural networks for speech recognition; however, these methods did not consider the robustness of generated adversarial examples against timing lag with a target speech  ...  This paper proposes a black-box adversarial attack method to automatic speech recognition systems.  ...  THE PROPOSED METHOD Key ideas This paper proposes a black-box adversarial attack method to speech recognition using DNN. Followings are the key ideas: 1.  ... 
arXiv:2012.11138v2 fatcat:6ngwwhwiqbhfzieka6mpmf5gpa

Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition [article]

Ludwig Kürzinger, Edgar Ricardo Chavez Rosas, Lujun Li, Tobias Watzel, Gerhard Rigoll
2020 arXiv   pre-print
Evaluation is performed using a hybrid CTC/attention end-to-end ASR model on two reference sentences as case study, as well as the TEDlium v2 speech recognition task.  ...  We then demonstrate the application of this algorithm for adversarial training to obtain a more robust ASR model.  ...  Results indicate improved robustness of the model against adversarial examples, as well as a generally improved speech recognition performance by a moderate 10% relative to the baseline model.  ... 
arXiv:2007.10723v1 fatcat:prou6j5d2famxkpr3nv6qk374e

Training Augmentation with Adversarial Examples for Robust Speech Recognition [article]

Sining Sun, Ching-Feng Yeh, Mari Ostendorf, Mei-Yuh Hwang, Lei Xie
2018 arXiv   pre-print
This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models.  ...  During training, the fast gradient sign method is used to generate adversarial examples augmenting the original training data.  ...  Experiments Speech corpora and system description Aurora-4 corpus The Aurora-4 corpus is designed to evaluate the robustness of ASR systems on a medium vocabulary continuous speech recognition task  ... 
arXiv:1806.02782v2 fatcat:6vmxpmfbmzb3pjkvk6sbb3u4gm

Domain Adversarial Training for Accented Speech Recognition [article]

Sining Sun, Ching-Feng Yeh, Mei-Yuh Hwang, Mari Ostendorf, Lei Xie
2018 arXiv   pre-print
In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem.  ...  Furthermore, we find that DAT is superior to multi-task learning for accented speech recognition.  ...  Recently, it has been adopted to tackle noise robust speech recognition as well [7, 8, 9].  ... 
arXiv:1806.02786v1 fatcat:cmdie6p2jngblhyejfolefl2qy

Multi-Task Multi-Network Joint-Learning of Deep Residual Networks and Cycle-Consistency Generative Adversarial Networks for Robust Speech Recognition

Shengkui Zhao, Chongjia Ni, Rong Tong, Bin Ma
2019 Interspeech 2019  
Robustness of automatic speech recognition (ASR) systems is a critical issue due to noise and reverberations.  ...  To overcome this limit, the generative adversarial networks (GANs) and the adversarial training method are deployed, which have greatly simplified the model training process without the requirements of  ...  Cycle-consistency generative adversarial networks The cycle-consistency generative adversarial networks (Cycle-GANs) introduced by Zhu et al.  ... 
doi:10.21437/interspeech.2019-2078 dblp:conf/interspeech/ZhaoNTM19 fatcat:3bfy4hfeybgrtfqjo5buk4f43q

Training Augmentation with Adversarial Examples for Robust Speech Recognition

Sining Sun, Ching-Feng Yeh, Mari Ostendorf, Mei-Yuh Hwang, Lei Xie
2018 Interspeech 2018  
This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models.  ...  During training, the fast gradient sign method is used to generate adversarial examples augmenting the original training data.  ...  Experiments Speech corpora and system description Aurora-4 corpus The Aurora-4 corpus is designed to evaluate the robustness of ASR systems on a medium vocabulary continuous speech recognition task  ... 
doi:10.21437/interspeech.2018-1247 dblp:conf/interspeech/SunYOHX18 fatcat:br6paxipmfbnti6ludbbena54i

Double Adversarial Network Based Monaural Speech Enhancement for Robust Speech Recognition

Zhihao Du, Jiqing Han, Xueliang Zhang
2020 Interspeech 2020  
To improve the noise robustness of automatic speech recognition (ASR), the generative adversarial network (GAN) based enhancement methods are employed as the front-end processing, which comprise a single  ...  In this paper, we propose a double adversarial network (DAN) by adding another adversarial generation process (AGP), which forces the discriminator not only to find the differences but also to model the  ...  Furthermore, ablation studies show that learning the speech distribution and using the proposed f -MSE are crucial for the robustness of speech recognition, which are missed in previous methods.  ... 
doi:10.21437/interspeech.2020-1504 dblp:conf/interspeech/DuHZ20 fatcat:fkserea7sfbe7mnxqnbroe6jbi