Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning

Abhinav Jain, Minali Upreti, Preethi Jyothi
Interspeech 2018
One of the major remaining challenges in modern automatic speech recognition (ASR) systems for English is handling speech from users with a diverse set of accents. ASR systems that are trained on speech from multiple English accents still underperform when confronted with a new speech accent. In this work, we explore how to use accent embeddings and multi-task learning to improve speech recognition for accented speech. We propose a multi-task architecture that jointly learns an accent classifier and a multi-accent acoustic model. We also consider augmenting the speech input with accent information in the form of embeddings extracted by a separate network. Together, these techniques give significant relative performance improvements of 15% and 10% over a multi-accent baseline system on test sets containing seen and unseen accents, respectively.
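The two ideas in the abstract, a shared network with an acoustic-model head and an accent-classifier head, and input features augmented with an accent embedding, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the layer sizes, the single shared tanh layer, the weighted-sum loss, the weight `LAMBDA`, and every name below are assumptions made for illustration, and the random vectors stand in for real speech features and for embeddings that the paper extracts with a separate network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_FRAMES, FEAT_DIM, EMB_DIM = 8, 40, 16
HIDDEN, N_STATES, N_ACCENTS = 32, 100, 5
LAMBDA = 0.1  # assumed weight on the auxiliary accent loss


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer labels under probs."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def augment(feats, accent_emb):
    """Append the same utterance-level accent embedding to every frame."""
    tiled = np.tile(accent_emb, (feats.shape[0], 1))
    return np.concatenate([feats, tiled], axis=1)


def multitask_loss(x, state_labels, accent_labels, params):
    """Shared layer, two softmax heads, weighted sum of the two losses."""
    h = np.tanh(x @ params["shared"])          # shared representation
    state_probs = softmax(h @ params["am"])    # acoustic-model head
    accent_probs = softmax(h @ params["acc"])  # accent-classifier head
    loss_am = cross_entropy(state_probs, state_labels)
    loss_acc = cross_entropy(accent_probs, accent_labels)
    return loss_am + LAMBDA * loss_acc, loss_am, loss_acc


# Random stand-ins for acoustic features and an accent embedding.
feats = rng.normal(size=(N_FRAMES, FEAT_DIM))
accent_emb = rng.normal(size=(EMB_DIM,))
x = augment(feats, accent_emb)

params = {
    "shared": rng.normal(scale=0.1, size=(FEAT_DIM + EMB_DIM, HIDDEN)),
    "am": rng.normal(scale=0.1, size=(HIDDEN, N_STATES)),
    "acc": rng.normal(scale=0.1, size=(HIDDEN, N_ACCENTS)),
}

state_labels = rng.integers(0, N_STATES, size=N_FRAMES)
accent_labels = np.full(N_FRAMES, 2)  # one accent label for the utterance

joint, loss_am, loss_acc = multitask_loss(x, state_labels, accent_labels, params)
print(x.shape, joint, loss_am, loss_acc)
```

In a real system the joint loss would be minimized by backpropagation through both heads, so the shared layers are shaped by both the recognition and the accent-identification objectives; only the forward pass and the loss are shown here.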
doi:10.21437/interspeech.2018-1864 dblp:conf/interspeech/JainUJ18 fatcat:mwaeo4e7vjdufoxbfpojnos6km