Extraction of Speech-Relevant Information from Modulation Spectrograms [chapter]

Maria Markaki, Michael Wohlmayer, Yannis Stylianou
Lecture Notes in Computer Science  
In this work, we adopt an information-theoretic approach, the Information Bottleneck method, to extract the relevant modulation frequencies across both dimensions of a spectrogram for speech/non-speech discrimination (where non-speech comprises music, animal vocalizations, and environmental noises). A compact representation is built for each sound ensemble, consisting of the maximally informative features. We demonstrate the effectiveness of a simple thresholding classifier that is based on the similarity of a sound to each characteristic modulation spectrum.
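The idea of thresholding on similarity to a characteristic modulation spectrum can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not specify the transform, the similarity measure, or the threshold, so the 2-D Fourier transform of a log-magnitude spectrogram, the cosine similarity, the frame sizes, and all function names below are assumptions chosen for illustration. The Information Bottleneck feature-selection step is omitted entirely.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram, shape (frequency, time). Frame sizes are assumed."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def modulation_spectrum(signal):
    """Unit-norm modulation spectrum: 2-D FFT magnitude of the log-spectrogram.

    Assumption: modulations along both the time and frequency axes are
    captured by a plain 2-D Fourier transform.
    """
    log_spec = np.log(spectrogram(signal) + 1e-10)
    log_spec = log_spec - log_spec.mean()   # remove the DC offset
    mod = np.abs(np.fft.fft2(log_spec))
    return mod / np.linalg.norm(mod)

def similarity(a, b):
    """Cosine similarity of two unit-norm modulation spectra (assumed measure)."""
    return float(np.sum(a * b))

def is_speech(signal, speech_template, nonspeech_template, threshold=0.0):
    """Hypothetical decision rule: compare the similarity margin to a threshold."""
    m = modulation_spectrum(signal)
    margin = similarity(m, speech_template) - similarity(m, nonspeech_template)
    return margin > threshold
```

In this sketch each template would be the modulation spectrum averaged over a labeled ensemble of equal-length sounds; restricting the comparison to the most informative modulation frequencies is the part the Information Bottleneck method would supply.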
doi:10.1007/978-3-540-71505-4_5 dblp:conf/wnsp/MarkakiWS05 fatcat:2muwbqswkzgu5ffakklg3qse2e