Utterance verification using prosodic information for Mandarin telephone speech keyword spotting

Yeou-Jiunn Chen, Chung-Hsien Wu, Gwo-Lang Yan
1999 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258)  
In this paper, prosodic information, a distinctive and important feature of Mandarin speech, is used for utterance verification in Mandarin telephone speech. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 59 context-independent subsyllables, i.e., 22 INITIALs and 37 FINALs in Mandarin speech, and one background/silence model are used as the basic recognition units. For utterance verification, 12 anti-subsyllable HMM's, 175 context-dependent prosodic HMM's, and five anti-prosodic HMM's are constructed. A keyword verification function combining phonetic-phase and prosodic-phase verification is investigated. On a test set of 2400 conversational speech utterances from 20 speakers (12 males and 8 females), the proposed verification method yielded a 17.8% false alarm rate at 8.5% false rejection. Furthermore, the method correctly rejected 90.4% of nonkeywords. Comparison with a baseline system without prosodic-phase verification shows that prosodic information improves verification performance.
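The abstract does not give the form of the combined verification function. As a rough illustration only, the sketch below assumes each phase yields a frame-normalized log-likelihood ratio between a model and its anti-model, and that the two phases are merged by a weighted sum compared against a threshold. The names `llr`, `verify_keyword`, `error_rates`, `weight`, and `threshold` are hypothetical and do not come from the paper, and the error-rate definitions are common ones that may differ from those the authors used.

```python
def llr(model_loglik, anti_loglik, n_frames):
    """Frame-normalized log-likelihood ratio of a model against its anti-model."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return (model_loglik - anti_loglik) / n_frames


def verify_keyword(phonetic_llr, prosodic_llr, weight=0.5, threshold=0.0):
    """Accept the keyword hypothesis if the weighted score reaches the threshold.

    The weighted sum is an assumption for illustration, not the paper's function.
    """
    score = weight * phonetic_llr + (1.0 - weight) * prosodic_llr
    return score >= threshold


def error_rates(decisions, is_keyword):
    """Return (false_rejection_rate, false_alarm_rate).

    Assumed definitions: false rejection = rejected true keywords / true keywords;
    false alarm = accepted nonkeywords / nonkeywords.
    """
    n_key = sum(1 for k in is_keyword if k)
    n_non = len(is_keyword) - n_key
    if n_key == 0 or n_non == 0:
        raise ValueError("need at least one keyword and one nonkeyword")
    rejected_keys = sum(1 for d, k in zip(decisions, is_keyword) if k and not d)
    accepted_non = sum(1 for d, k in zip(decisions, is_keyword) if not k and d)
    return rejected_keys / n_key, accepted_non / n_non
```

Sweeping `threshold` over a held-out set and recomputing `error_rates` at each value traces the trade-off between false rejection and false alarm, the kind of operating point the abstract reports.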
doi:10.1109/icassp.1999.759762 dblp:conf/icassp/ChenWY99 fatcat:zwnnm36it5dfritpavak5qapxu