Unified View of Prediction and Repetition Structure in Audio Signals With Application to Interest Point Detection

Shlomo Dubnov
2008 IEEE Transactions on Audio, Speech, and Language Processing  
In this paper we present a new method for analysis of musical structure that captures local prediction and global repetition properties of audio signals in one information processing framework. The method is motivated by recent work in music perception where machine features were shown to correspond to human judgments of familiarity and emotional force when listening to music. Using a notion of information rate in a model-based framework, we develop a measure of mutual information between past and present in a time signal and show that it consists of two factors: a prediction property related to data statistics within an individual block of signal features, and a repetition property based on differences in model likelihood across blocks. The first factor, when applied to a spectral representation of audio signals, is known as Spectral Anticipation, and the second factor is known as Recurrence Analysis. We present algorithms for estimation of these measures and create a visualization that displays their temporal structure in musical recordings. Considering these features as a measure of the amount of information processing that a listening system performs on a signal, information rate is used to detect interest points in music. Several musical works with different performances are analyzed in the paper, and their structure and interest points are displayed and discussed. Extensions of this approach towards a general framework of characterizing machine listening experience are suggested.
doi:10.1109/tasl.2007.912378 fatcat:lgqk4y7xwvgwhlzmlpmj4zh3am
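
The abstract defines information rate as the mutual information between the past and the present of a signal. As a rough illustration only, and not the estimator used in the paper, the sketch below computes that quantity for a one-dimensional sequence under an assumed linear-Gaussian autoregressive model, where the mutual information reduces to half the log ratio of the signal variance to the prediction-residual variance. The function name `gaussian_information_rate` and the `order` parameter are hypothetical, chosen for this example.

```python
import numpy as np

def gaussian_information_rate(x, order=2):
    """Estimate I(past; present) in nats for a 1-D sequence.

    Assumption (not from the paper): the sequence is modeled as a
    linear-Gaussian AR(order) process, so the mutual information is
    0.5 * log(var(present) / var(prediction residual)).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Column k holds the sample (k + 1) steps before each target sample.
    past = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    present = x[order:]
    # Least-squares linear predictor of the present from the past.
    coeffs, *_ = np.linalg.lstsq(past, present, rcond=None)
    residual = present - past @ coeffs
    return 0.5 * np.log(present.var() / residual.var())

# Synthetic data for illustration: white noise vs. a predictable AR(1) signal.
rng = np.random.default_rng(0)
noise = rng.standard_normal(20000)
ar = np.zeros_like(noise)
for n in range(1, len(noise)):
    ar[n] = 0.9 * ar[n - 1] + noise[n]

ir_noise = gaussian_information_rate(noise)  # past says little about present
ir_ar = gaussian_information_rate(ar)        # past is informative
print(ir_noise, ir_ar)
```

Under these assumptions, unpredictable noise yields an information rate near zero, while a signal with strong short-term structure yields a clearly positive value. This covers only the local prediction side of the measure; the repetition factor described in the abstract, which compares model likelihoods across blocks, is not sketched here.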