Temporal Pyramid Pooling Convolutional Neural Network for Cover Song Identification

Zhesong Yu, Xiaoshuo Xu, Xiaoou Chen, Deshun Yang
2019 Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence  
Cover song identification is an important problem in the field of Music Information Retrieval. Most existing methods rely on hand-crafted features and sequence alignment methods, and further breakthrough is hard to achieve. In this paper, Convolutional Neural Networks (CNNs) are used for representation learning toward this task. We show that they could be naturally adapted to deal with key transposition in cover songs. Additionally, Temporal Pyramid Pooling is utilized to extract information on different scales and transform songs with different lengths into fixed-dimensional representations. Furthermore, a training scheme is designed to enhance the robustness of our model. Extensive experiments demonstrate that combined with these techniques, our approach is robust against musical variations existing in cover songs and outperforms state-of-the-art methods on several datasets with low time complexity.
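
The abstract's claim that Temporal Pyramid Pooling maps songs of different lengths to fixed-dimensional representations can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `temporal_pyramid_pool`, the pyramid levels `(1, 2, 4)`, the use of max pooling, and the `(channels, time)` layout are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np


def temporal_pyramid_pool(features, levels=(1, 2, 4)):
    """Pool a (channels, time) feature map into a fixed-length vector.

    Hypothetical helper: the levels and the max operator are assumptions,
    not details taken from the paper.
    """
    channels, n_frames = features.shape
    if n_frames < 1:
        raise ValueError("feature map must contain at least one frame")

    pooled = []
    for n_bins in levels:
        # Split the time axis into n_bins roughly equal segments.
        edges = np.linspace(0, n_frames, n_bins + 1)
        for i in range(n_bins):
            start = int(np.floor(edges[i]))
            # Guarantee every segment covers at least one frame.
            end = max(int(np.ceil(edges[i + 1])), start + 1)
            # Max over time inside the segment gives one value per channel.
            pooled.append(features[:, start:end].max(axis=1))

    # Output length is channels * sum(levels), independent of n_frames.
    return np.concatenate(pooled)


# Two "songs" of different lengths produce vectors of the same size.
rng = np.random.default_rng(0)
short_song = rng.standard_normal((8, 50))
long_song = rng.standard_normal((8, 333))
vec_short = temporal_pyramid_pool(short_song)
vec_long = temporal_pyramid_pool(long_song)
```

Under these assumptions both vectors have length 8 × (1 + 2 + 4) = 56, so they can be compared directly with a distance measure regardless of the original song durations. The coarsest level summarizes the whole song, while finer levels retain rough temporal position.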
doi:10.24963/ijcai.2019/673 dblp:conf/ijcai/YuXCY19 fatcat:25yzxoepj5fapc7ory3h6ajo7y