Speaker-independent dictation of Chinese speech with 32k vocabulary

Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, Taiyi Huang
1996 4th International Conference on Spoken Language Processing (ICSLP 1996)   unpublished
While early machines adopted isolated syllables as input units and required tedious enrollment, our research focuses on speaker-independent, word-based dictation. A deliberately designed 120-speaker database was built for training. Acoustic models dependent on inter-syllable context, tone, and endpoints are applied with promising MFCC features. Two-pass acoustic matching accelerates recognition by taking full advantage of the monosyllabic structure of Chinese speech. A complete word bigram and
... m serve as the language processing module. With all these efforts, the system reaches 90% character accuracy while running in almost real time on a Pentium PC without DSP help.
doi:10.21437/icslp.1996-587 fatcat:w6jo4ng5ajbb5gflmvwdqw5rq4