Combining speech energy and edge information for fast and efficient voice activity detection in noisy environments

Xiaokun Li, Yunbin Deng
Proceedings of the International Conference on Pattern Recognition (ICPR), 2008
Robust voice activity detection (VAD) is a crucial and challenging step in developing real-time, high-performance speech recognition systems for noisy environments. In this paper, we present a novel and efficient VAD algorithm for robust, real-time speech activity detection. The key idea of the algorithm is to consider speech energy and edge information simultaneously when processing speech signals. A new finite-state automaton is also developed for correctly detecting voice activities in noisy environments. Extensive comparative experiments show that the proposed VAD algorithm can greatly speed up speech recognition while significantly reducing word error rate (WER). Compared with the state of the art, the average improvement from using the proposed algorithm on noisy data is 46.5% in processing speed and 15.3% in WER.
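The abstract does not give the algorithm's details, so the following is only a generic sketch of the idea it names, not the authors' method. It assumes non-overlapping frames, log energy in dB, an "edge" score defined as the difference between the mean energy just after and just before a frame, and a two-state automaton with a hangover counter. All function names, thresholds, and parameters here are hypothetical choices made for illustration.

```python
import math


def frame_energies(samples, frame_len):
    """Short-time log energy (dB) of non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies


def edge_scores(energies, width=2):
    """Mean energy of frames [i, i+width) minus mean energy of frames [i-width, i).

    A large positive value marks a rising edge (possible speech onset).
    This definition is an assumption, not taken from the paper.
    """
    scores = []
    for i in range(len(energies)):
        before = energies[max(0, i - width):i] or [energies[i]]
        after = energies[i:i + width]
        scores.append(sum(after) / len(after) - sum(before) / len(before))
    return scores


def detect_speech(energies, edges, energy_thresh, edge_thresh, hangover=2):
    """Two-state automaton (SILENCE, SPEECH) returning one boolean per frame.

    SILENCE -> SPEECH needs high energy AND a rising edge on the same frame.
    SPEECH -> SILENCE happens once more than `hangover` consecutive
    low-energy frames have been seen.
    """
    state = "SILENCE"
    low_count = 0
    labels = []
    for energy, edge in zip(energies, edges):
        if state == "SILENCE":
            if energy > energy_thresh and edge > edge_thresh:
                state = "SPEECH"
                low_count = 0
        elif energy <= energy_thresh:
            low_count += 1
            if low_count > hangover:
                state = "SILENCE"
        else:
            low_count = 0
        labels.append(state == "SPEECH")
    return labels


def synthetic_signal(frame_len=80):
    """10 quiet frames, 10 loud frames, 10 quiet frames of a sine tone."""
    amplitudes = [0.01] * 10 + [0.5] * 10 + [0.01] * 10
    return [
        amp * math.sin(2.0 * math.pi * n / 8.0)
        for amp in amplitudes
        for n in range(frame_len)
    ]


if __name__ == "__main__":
    energies = frame_energies(synthetic_signal(), frame_len=80)
    labels = detect_speech(energies, edge_scores(energies),
                           energy_thresh=-25.0, edge_thresh=6.0)
    print("".join("S" if is_speech else "." for is_speech in labels))
```

Requiring both conditions to enter the speech state is what lets the edge score veto stationary noise that merely sits above the energy threshold. A real system would need thresholds adapted to the noise floor, which this sketch leaves fixed.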
doi:10.1109/icpr.2008.4761906 dblp:conf/icpr/LiD08 fatcat:mt5efzb5rjee7ctktjhvt753hq