Saliency-driven unstructured acoustic scene classification using latent perceptual indexing

Ozlem Kalinli, Shiva Sundaram, Shrikanth Narayanan
2009 2009 IEEE International Workshop on Multimedia Signal Processing  
Automatic classification of real-life, complex, and unstructured acoustic scenes is a challenging task because the number of acoustic sources present in the audio stream is unknown and the sources overlap in time. In this work, we present a novel approach to classifying such unstructured acoustic scenes. Motivated by the bottom-up attention model of the human auditory system, salient events of an audio clip are extracted in an unsupervised manner and presented to the classification system. Similar to latent semantic indexing of text documents, the classification system uses a unit-document frequency measure to index the clip in a continuous, latent space. This allows for developing a completely class-independent approach to audio classification. Our results on the BBC sound effects library indicate that, using the saliency-driven attention selection approach presented in this paper, a 17.5% relative improvement can be obtained in frame-based classification and a 25% relative improvement can be obtained using the latent audio indexing approach.
doi:10.1109/mmsp.2009.5293267 dblp:conf/mmsp/KalinliSN09 fatcat:5y2m36c57bgs5azzj4lnvg3yre