Multidimensional humming transcription using a statistical approach for query by humming systems

Hsuan-Huei Shih, S.S. Narayanan, C.-C. Jay Kuo
2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).  
A new statistical pattern recognition approach to human humming transcription is proposed in this research. A music note has two important attributes: pitch and duration. The proposed algorithm generates multidimensional humming transcriptions, which contain both pitch and duration information. Query by humming provides a natural means for content-based retrieval from music databases, and this research provides a robust front end for such an application. The segment of a note in the humming waveform is modeled by a hidden Markov model (HMM), while the pitch of the note is modeled by a pitch model using a Gaussian mixture model. Preliminary real-time recognition experiments are carried out with models trained on data obtained from eight human subjects, and an overall correct recognition rate of around 80% is demonstrated.
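
As a rough illustration of the pitch-model idea mentioned in the abstract, the sketch below scores a detected pitch against one Gaussian mixture model per note and picks the most likely note. It is not the authors' implementation: the abstract gives no features or parameters, so the MIDI-semitone scale, the two-component mixtures, their weights and spreads, and the function names (`hz_to_midi`, `gmm_log_likelihood`, `make_toy_pitch_models`, `classify_pitch`) are all assumptions made for this example. The HMM-based note segmentation is not covered here.

```python
import math


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def gmm_log_likelihood(x, components):
    """Log-likelihood of scalar x under a 1-D Gaussian mixture.

    `components` is a list of (weight, mean, std) tuples whose weights sum to 1.
    """
    log_terms = []
    for weight, mean, std in components:
        log_pdf = (-0.5 * math.log(2.0 * math.pi * std ** 2)
                   - (x - mean) ** 2 / (2.0 * std ** 2))
        log_terms.append(math.log(weight) + log_pdf)
    # Log-sum-exp for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def make_toy_pitch_models(low=60, high=72):
    """Build hypothetical per-note mixtures (assumed values, not trained).

    Each note gets a narrow component at its nominal pitch and a wider
    component slightly flat of it, loosely mimicking imprecise humming.
    """
    return {
        note: [(0.7, float(note), 0.3), (0.3, note - 0.2, 0.6)]
        for note in range(low, high + 1)
    }


def classify_pitch(freq_hz, models):
    """Return the note whose mixture gives the highest log-likelihood."""
    x = hz_to_midi(freq_hz)
    return max(models, key=lambda note: gmm_log_likelihood(x, models[note]))


if __name__ == "__main__":
    models = make_toy_pitch_models()
    for freq in (261.63, 440.0):
        print(freq, "Hz ->", classify_pitch(freq, models))
```

In a trained system the mixture parameters would be estimated from labeled humming data rather than fixed by hand, and the pitch decision would be made per note segment rather than per isolated frequency value.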
doi:10.1109/icassp.2003.1200026 dblp:conf/icassp/ShihNK03 fatcat:cuy42mj35vewtiyktthgkg7bxe