Towards Automatic Music Structural Analysis: Identifying Characteristic Within-Song Excerpts in Popular Music

Bee Suan Ong, Xavier Serra
2005 Zenodo  
Automatic audio content analysis is a general research area in which algorithms are developed to allow computer systems to understand the content of digital audio signals for further exploitation. The main focus is on practical applications for audio file management, such as automatic labeling, efficient browsing, or retrieving relevant files from a large database with little effort. Automatic music structural analysis is a specific subset of audio content analysis in which the domain of audio content is restricted to semantically meaningful descriptions of audio in a musical context. Its main task is to discover the structure of music by analyzing audio signals, in order to facilitate better handling of the rapidly expanding amounts of audio data available in digital collections. In this research work, we focus our investigation on two areas that are part of audio-based music structural analysis. First, we propose a unique framework and method for temporal audio segmentation at the semantic level. The system aims to detect structural changes in music, providing a way to separate the different "sections" of a piece according to their structural titles (i.e., intro, verse, chorus, bridge, etc.). We present a two-phase music segmentation system together with a combined set of low-level audio descriptors to be extracted from the music audio signals. Contrary to existing approaches, we consider the applicability of image processing methods in audio content analysis. A database of 54 audio files (songs by The Beatles) is used to evaluate the proposed approach on a mainstream popular music collection. The experimental results demonstrate that our proposed algorithm achieves 71% accuracy and 79% reliability in a practical application for identifying structural boundaries in music audio signals. Second, we present our proposed framework and approach for the identification of representative excerpts from music audio signals. The system a [...]
doi:10.5281/zenodo.3739314 fatcat:ptqosukiyvaxvdxr7sncrbtuhq