End-to-End Multilingual Speech Recognition System with Language Supervision Training
2020
IEICE Transactions on Information and Systems
End-to-end (E2E) multilingual automatic speech recognition (ASR) systems aim to recognize multilingual speech in a unified framework. In the current E2E multilingual ASR framework, the output prediction for a specific language lacks constraints on the output scope of modeling units. In this paper, a language supervision training strategy is proposed with language masks to constrain the neural network output distribution. To simulate the multilingual ASR scenario with unknown language identity
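
As a rough illustration of what constraining an output distribution with a language mask can look like, the following minimal sketch applies a per-language boolean mask to the logits of a shared output layer before the softmax. The vocabulary, the mask contents, and the names `VOCAB`, `LANGUAGE_MASKS`, and `masked_softmax` are all assumptions made for illustration. The captured abstract does not say how the paper applies the masks during training or how it handles unknown language identity, so this should not be read as the authors' implementation.

```python
import numpy as np

# Hypothetical shared output vocabulary (an assumption for illustration,
# not the unit inventory used in the paper).
VOCAB = ["<blank>", "a", "b", "c", "你", "好", "吗"]

# Hypothetical language masks: True marks a unit the language may emit.
LANGUAGE_MASKS = {
    "en": np.array([1, 1, 1, 1, 0, 0, 0], dtype=bool),
    "zh": np.array([1, 0, 0, 0, 1, 1, 1], dtype=bool),
}


def masked_softmax(logits, mask):
    """Softmax over the last axis, restricted to units where mask is True.

    Assumes the mask allows at least one unit, so the row maximum is finite.
    """
    # Disallowed units get -inf, which becomes exactly 0 after exp().
    masked = np.where(mask, logits, -np.inf)
    # Subtract the row maximum for numerical stability.
    masked = masked - masked.max(axis=-1, keepdims=True)
    exp = np.exp(masked)
    return exp / exp.sum(axis=-1, keepdims=True)


# One frame of made-up logits over the shared vocabulary.
logits = np.array([[0.5, 2.0, 1.0, 0.1, 3.0, 0.2, 0.3]])

probs_en = masked_softmax(logits, LANGUAGE_MASKS["en"])
probs_zh = masked_softmax(logits, LANGUAGE_MASKS["zh"])

# Most probable unit under each language constraint.
best_en = VOCAB[int(probs_en[0].argmax())]
best_zh = VOCAB[int(probs_zh[0].argmax())]
```

In this toy frame the unmasked argmax would be a Chinese unit, but under the `"en"` mask all probability mass is redistributed over the English units and the blank symbol, which is the kind of restriction on the output scope the abstract refers to.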
doi:10.1587/transinf.2019edl8214
fatcat:itsc4hdm6rf2tnadzw7vmkb2s4