A novel approach in continuous speech recognition for Vietnamese, an isolating tonal language

Hong Quang Nguyen, Pascal Nocera, Eric Castelli, Van Loan Trinh
2008 Interspeech 2008   unpublished
This paper proposes a new approach for integrating the characteristics of the Vietnamese language into a Large Vocabulary Continuous Speech Recognition (LVCSR) system that was originally built for several European languages. First, a new tone recognition module based on a Hidden Markov Model was constructed. Second, several methods were applied to transform a text corpus of monosyllabic words into a text corpus of polysyllabic words, and a statistical language model of polysyllabic words was built by using ... e new text corpus. Finally, all of this knowledge was incorporated into the LVCSR system so that it could be adapted for Vietnamese. Experiments were conducted on the VNSPEECHCORPUS. The results show that the accuracy of the Vietnamese recognition system increased: a 46% relative reduction in word error rate is obtained by using Vietnamese language characteristics.
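The abstract does not say which methods were used to turn the monosyllabic corpus into a polysyllabic one. As an illustration only, the sketch below assumes one common technique, greedy longest matching against a word lexicon, which may differ from what the authors did. The names `build_lexicon`, `segment`, and `max_len`, as well as the toy lexicon, are hypothetical and not taken from the paper.

```python
from typing import Iterable, List, Set, Tuple


def build_lexicon(words: Iterable[str]) -> Set[Tuple[str, ...]]:
    """Store each lexicon entry as a tuple of its space-separated syllables."""
    return {tuple(w.split()) for w in words}


def segment(syllables: List[str],
            lexicon: Set[Tuple[str, ...]],
            max_len: int = 4) -> List[str]:
    """Greedy left-to-right longest matching over a syllable sequence.

    Assumption: this is an illustrative technique, not the paper's method.
    Syllables of one word are joined with "_" so that a language model
    toolkit would treat the polysyllabic word as a single token.
    """
    words: List[str] = []
    i = 0
    while i < len(syllables):
        # Try the longest candidate first; a single syllable always matches.
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            candidate = tuple(syllables[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append("_".join(candidate))
                i += n
                break
    return words


# Hypothetical toy lexicon, for demonstration only.
LEXICON = build_lexicon(["việt nam", "nhận dạng", "tiếng nói"])
print(segment("nhận dạng tiếng nói việt nam".split(), LEXICON))
```

Running the segmenter over every sentence of a syllable-level corpus would yield a word-level corpus on which an n-gram language model could then be trained. Greedy matching is simple but can mis-segment ambiguous sequences, which may be one reason the abstract mentions several methods rather than one.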
doi:10.21437/interspeech.2008-349 fatcat:fup4nnrd2nbspph46esx5ef6kq