Online Incremental Learning for Speaker-Adaptive Language Models

Chih Chi Hu, Bing Liu, John Shen, Ian Lane
2018 Interspeech 2018  
Voice control is a prominent interaction method on personal computing devices. While automatic speech recognition (ASR) systems are readily applicable to large audiences, there is room for further adaptation at the edge, i.e., locally on devices, targeted at individual users. In this work, we explore improving ASR systems over time through a user's own interactions. Our online learning approach for speaker-adaptive language modeling leverages a user's most recent utterances to enhance the
... r dependent features and traits. We experiment with the Large-Vocabulary Continuous Speech Recognition corpus Tedlium v2, and demonstrate an average reduction in perplexity (PPL) of 19.18% and an average relative reduction in word error rate (WER) of 2.80% compared to a state-of-the-art baseline.
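The abstract reports its gains as a reduction in perplexity and a relative reduction in WER against a baseline. The following is a minimal sketch of how such quantities are commonly computed, assuming per-token natural-log probabilities from a language model. The function names `perplexity` and `relative_reduction` and the toy inputs are illustrative assumptions; they are not taken from the paper and do not reproduce its numbers.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p(w_i | history))).
    """
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(log_probs) / len(log_probs))


def relative_reduction(baseline, adapted):
    """Relative reduction (in percent) of a lower-is-better metric
    such as PPL or WER, measured against the baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - adapted) / baseline


if __name__ == "__main__":
    # Toy values only (hypothetical, not results from the paper):
    # a model that assigns probability 0.25 to each of four tokens.
    toy_log_probs = [math.log(0.25)] * 4
    print(perplexity(toy_log_probs))
    print(relative_reduction(10.0, 9.0))
```

A relative reduction divides the improvement by the baseline value, so a drop in WER from 10.0 to 9.0 is a 10% relative reduction but only a 1.0 point absolute reduction.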
doi:10.21437/interspeech.2018-2259 dblp:conf/interspeech/HuLSL18 fatcat:b3cmikob7ngzrpusayg53veyqy