Understanding Internal Semantics Of Deep Learning Models For Electronic Music

Minz Sanghee Won
2017 Zenodo  
Since deep learning showed outstanding performance in computer vision, Music Information Retrieval (MIR) researchers have also started to adopt these successful models in their own research area. Unfortunately, many publications still simply apply deep learning algorithms to new problems or datasets without understanding their models. For a more sophisticated model design process, interpreting the architecture and the mechanism of hidden layers has become more important, and this has resulted in multiple publications proposing methods for investigating the information learnt in hidden layers: visualization, auralization, and playlist generation. However, because the proposed methods are very time-consuming, hidden layers still remain a black box. In this paper, I propose two ideas for investigating hidden layers more efficiently: ranking tags and deriving filter importances. With conventional approaches and the proposed methods, I investigate the latent semantics learnt in the hidden layers of deep learning models, particularly Convolutional Neural Networks (CNNs). A prototype experiment was carried out with the Ballroom dataset, and the main experiment was done with the Beatport dataset, which consists of 15k pieces of electronic music. The experimental results report the latent semantics of learnt kernels from pre-trained CNNs for electronic music genre classification, a task that has not been widely explored in deep learning research.
doi:10.5281/zenodo.1100967 fatcat:fewtmbdobbepdbslgt2phdie5y