Robust end-of-utterance detection for real-time speech recognition applications

R. Hariharan, J. Hakkinen, K. Laurila
2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)  
In this paper we propose a sub-band energy based end-of-utterance algorithm that is capable of detecting the time instant when the user has stopped speaking. The proposed algorithm finds the time instant at which sufficiently many sub-band spectral energy trajectories fall below adaptive thresholds and stay there for a pre-defined fixed time, i.e. a non-speech period is detected after the end of the utterance. With the proposed algorithm a practical speech recognition system can give timely feedback to the user, thereby making the behaviour of the speech recognition system more predictable and similar across different usage environments and noise conditions. The proposed algorithm is shown to be more accurate and more noise-robust than previously proposed approaches. Experiments with both isolated command word recognition and continuous digit recognition in various noise conditions verify the viability of the proposed approach, with an average proper end-of-utterance detection rate of around 94% in both cases, representing a 43% error rate reduction over the most competitive previously published method.
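The decision rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the adaptive thresholds are computed, so the sketch assumes a per-band noise floor (initialised from the first few frames and smoothed exponentially) plus a fixed margin in dB. The function name `detect_end_of_utterance` and the parameters `min_quiet_bands`, `hold_frames`, `margin_db`, `init_frames` and `alpha` are all hypothetical, as is the requirement that speech be observed before an end point is reported.

```python
from typing import Optional, Sequence


def detect_end_of_utterance(
    band_energies_db: Sequence[Sequence[float]],
    min_quiet_bands: int,
    hold_frames: int,
    margin_db: float = 6.0,   # assumed margin above the noise floor
    init_frames: int = 5,     # assumed number of leading noise-only frames
    alpha: float = 0.95,      # assumed smoothing factor for the noise floor
) -> Optional[int]:
    """Return the index of the first frame of the trailing non-speech
    period, or None if no end of utterance is found.

    band_energies_db[t][b] is the log energy (dB) of sub-band b in frame t.
    """
    if not band_energies_db:
        return None
    n_bands = len(band_energies_db[0])

    # Assumption: the first frames contain only background noise, so their
    # per-band mean serves as the initial noise floor.
    init = band_energies_db[:init_frames]
    floor = [sum(f[b] for f in init) / len(init) for b in range(n_bands)]

    speech_seen = False  # do not report an end point before any speech
    quiet_run = 0        # consecutive frames with enough quiet bands

    for t, frame in enumerate(band_energies_db):
        quiet_bands = 0
        for b in range(n_bands):
            # Adaptive threshold for this band: noise floor plus margin.
            if frame[b] < floor[b] + margin_db:
                quiet_bands += 1
                # Track slow changes in the background noise level.
                floor[b] = alpha * floor[b] + (1.0 - alpha) * frame[b]

        if quiet_bands >= min_quiet_bands:
            quiet_run += 1
        else:
            speech_seen = True
            quiet_run = 0

        # Enough bands have stayed below threshold for the fixed hold time.
        if speech_seen and quiet_run >= hold_frames:
            return t - hold_frames + 1
    return None
```

Under these assumptions, a pause shorter than `hold_frames` resets nothing permanently and is treated as part of the utterance, while the reported index is the start of the first quiet run that lasts the full hold time.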
doi:10.1109/icassp.2001.940814 dblp:conf/icassp/HariharanHL01 fatcat:3qwzekdrdfdh7e4fuepp6i2wly