Voice activity detection in presence of transient noise using spectral clustering and diffusion kernels

Oren Rosen, Saman Mousazadeh, Israel Cohen
2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel (IEEEI)
In this paper, we introduce a voice activity detection (VAD) algorithm based on spectral clustering and diffusion kernels. The proposed algorithm is a supervised learning algorithm comprising learning and testing stages. A sample cloud is produced for every signal frame by utilizing a moving window. Mel-frequency cepstrum coefficients (MFCCs) are then calculated for every sample in the cloud to produce an MFCC matrix and subsequently a covariance matrix for every frame. Utilizing the covariance matrix, we calculate a similarity matrix using spectral clustering and diffusion kernel methods. Using the similarity matrix, we cluster the data and transform it to a new space where each point is labeled as speech or nonspeech. We then use a Gaussian Mixture Model (GMM) to build a statistical model for labeling data as speech or nonspeech. Simulation results demonstrate the advantages of the proposed algorithm compared to a recent VAD algorithm.
doi:10.1109/eeei.2014.7005743 fatcat:hhqrcli535c6bpmmyo3pux3k2q
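
The pipeline outlined in the abstract (sample cloud per frame, per-frame features and covariance, similarity matrix, embedding in a new space) can be illustrated with a rough NumPy sketch. This is not the authors' implementation, and several pieces are assumptions made only for illustration: log band energies stand in for MFCCs, the covariance-weighted distance and the median kernel scale are one common choice that the abstract does not specify, and all function names and parameter values (`sample_cloud`, `frame_len`, `n_bands`, and so on) are hypothetical.

```python
import numpy as np

def sample_cloud(signal, start, frame_len, n_samples, max_shift):
    """Collect shifted windows around one frame (a moving-window 'cloud')."""
    shifts = np.linspace(-max_shift, max_shift, n_samples).astype(int)
    starts = np.clip(start + shifts, 0, len(signal) - frame_len)
    return np.stack([signal[s:s + frame_len] for s in starts])

def log_band_features(windows, n_bands):
    """Assumed stand-in for MFCCs: log energy in equal-width FFT bands."""
    taper = np.hanning(windows.shape[1])
    power = np.abs(np.fft.rfft(windows * taper, axis=1)) ** 2
    bands = np.array_split(power, n_bands, axis=1)
    return np.log(np.stack([b.sum(axis=1) for b in bands], axis=1) + 1e-10)

def frame_statistics(signal, frame_len=256, hop=128, n_samples=16,
                     max_shift=32, n_bands=6):
    """Mean feature vector and regularized covariance matrix per frame."""
    means, covs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        cloud = sample_cloud(signal, start, frame_len, n_samples, max_shift)
        feats = log_band_features(cloud, n_bands)
        means.append(feats.mean(axis=0))
        # Small ridge term keeps the covariance invertible (assumption).
        covs.append(np.cov(feats, rowvar=False) + 1e-3 * np.eye(n_bands))
    return np.array(means), np.array(covs)

def similarity_matrix(means, covs):
    """Gaussian kernel on a covariance-weighted distance (assumed form)."""
    inv = np.linalg.inv(covs)                      # batched inverse, (n, d, d)
    diff = means[:, None, :] - means[None, :, :]   # pairwise differences
    d2 = 0.5 * (np.einsum('ijk,ikl,ijl->ij', diff, inv, diff)
                + np.einsum('ijk,jkl,ijl->ij', diff, inv, diff))
    eps = np.median(d2[d2 > 0])                    # kernel scale (assumption)
    return np.exp(-d2 / eps)

def diffusion_embedding(W, n_dims=2):
    """Embed frames using eigenvectors of the row-normalized kernel."""
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))                # symmetric normalization
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]               # eigenvectors of D^-1 W
    # Skip the trivial first eigenvector (constant, eigenvalue 1).
    return vals, psi[:, 1:n_dims + 1] * vals[1:n_dims + 1]

# Toy input: noise only, then noise plus a tone acting as "speech".
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signal = 0.05 * rng.standard_normal(t.size)
signal[4000:] += np.sin(2 * np.pi * 440 * t[4000:])

means, covs = frame_statistics(signal)
W = similarity_matrix(means, covs)
eigvals, coords = diffusion_embedding(W)
print(coords.shape)
```

Each row of `coords` is one frame in the new space. The final step described in the abstract, fitting a GMM to labeled training frames, could then be applied to these coordinates, for example with scikit-learn's `GaussianMixture`; that step is left out here to keep the sketch dependent on NumPy only.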