Automatic Identification of the Sung Language in Popular Music Recordings

Wei-Ho Tsai, Hsin-Min Wang
2007 Journal of New Music Research  
As part of the research into content-based music information retrieval (MIR), this paper presents a preliminary attempt to automatically identify the language sung in popular music recordings. It is assumed that each language has its own set of constraints that specify the sequence of basic linguistic events when lyrics are sung. Thus, the acoustic structure of individual languages may be characterized by statistically modeling those constraints. To achieve this, the proposed method employs vector clustering to convert a singing signal from its spectrum-based feature representation into a sequence of smaller basic phonological units. The dynamic characteristics of the sequence are then analyzed using bigram language models. As vector clustering is performed in an unsupervised manner, the resulting system does not need sophisticated linguistic knowledge; therefore, it is easily portable to new language sets. In addition, to eliminate interference from background music, we leverage the statistical estimation of the background musical accompaniment of a song so that the vector clustering truly reflects the solo singing voices in the accompanied signals.
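The two-stage idea in the abstract (map feature vectors to discrete units, then model unit sequences with per-language bigrams) can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook is assumed to be given rather than learned by unsupervised clustering, add-one smoothing is an arbitrary choice, the accompaniment compensation step is omitted, and all names (`quantize`, `train_bigram`, `identify`, `lang_a`, `lang_b`) and toy data are hypothetical.

```python
import math

def quantize(frames, codebook):
    """Map each feature vector to the index of its nearest codeword
    (squared Euclidean distance). The codebook is assumed to exist already."""
    def nearest(v):
        return min(range(len(codebook)),
                   key=lambda k: sum((a - b) ** 2 for a, b in zip(v, codebook[k])))
    return [nearest(v) for v in frames]

def train_bigram(sequences, vocab_size):
    """Estimate log P(next | prev) from token sequences.
    Add-one smoothing is an assumption made for this sketch."""
    counts = [[1] * vocab_size for _ in range(vocab_size)]
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return [[math.log(c / sum(row)) for c in row] for row in counts]

def score(seq, logp):
    """Sum of bigram log-probabilities along one token sequence."""
    return sum(logp[p][n] for p, n in zip(seq, seq[1:]))

def identify(seq, models):
    """Return the language whose bigram model scores the sequence highest."""
    return max(models, key=lambda lang: score(seq, models[lang]))

# Hypothetical toy data: a 3-codeword codebook and one training
# token sequence per language.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
models = {
    "lang_a": train_bigram([[0, 1, 0, 1, 0, 1]], vocab_size=3),
    "lang_b": train_bigram([[0, 0, 2, 2, 0, 0, 2, 2]], vocab_size=3),
}

# Unseen "singing" frames are tokenized, then scored against each model.
frames = [(0.1, -0.1), (0.9, 1.1), (0.0, 0.2), (1.2, 0.8)]
tokens = quantize(frames, codebook)
predicted = identify(tokens, models)
```

Here the frames alternate between the first two codewords, a pattern that only the `lang_a` training sequence contains, so that model assigns the higher log-likelihood. A real system would work with spectrum-based features and far larger codebooks and training sets.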
doi:10.1080/09298210701755206 fatcat:2ojmlqid3bcxjcaweofnjbrunq