Music Source Separation Using Stacked Hourglass Networks

Sungheon Park, Taehoon Kim, Kyogu Lee, Nojun Kwak
2018 Zenodo  
In this paper, we propose a simple yet effective method for multiple music source separation using convolutional neural networks. The stacked hourglass network, which was originally designed for human pose estimation in natural images, is applied to a music source separation task. The network learns features from a spectrogram image across multiple scales and generates a mask for each music source. The estimated masks are refined as they pass through the stacked hourglass modules. The proposed framework is able to separate multiple music sources using a single network. Experimental results on the MIR-1K and DSD100 datasets show that the proposed method achieves results comparable to state-of-the-art methods in multiple music source separation and singing voice separation tasks.
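The mask-based separation described in the abstract can be illustrated with a minimal NumPy sketch. It shows only the final step, where one mask per source is applied to the mixture spectrogram. The function name `apply_masks`, the toy array sizes, the random stand-in for network outputs, and the softmax normalization across sources are all assumptions made for illustration; they are not taken from the paper, and the hourglass network itself is not modeled here.

```python
import numpy as np

def apply_masks(mixture_mag, masks):
    """Multiply a mixture magnitude spectrogram by one mask per source.

    mixture_mag: array of shape (freq, time)
    masks:       array of shape (n_sources, freq, time)
    returns:     array of shape (n_sources, freq, time)
    """
    # Broadcasting applies every source mask to the same mixture.
    return masks * mixture_mag[np.newaxis, :, :]

rng = np.random.default_rng(0)

# Toy magnitude spectrogram: 4 frequency bins, 6 time frames (hypothetical sizes).
mixture_mag = rng.random((4, 6))

# Stand-in for the network's raw outputs for 2 sources (hypothetical values).
logits = rng.standard_normal((2, 4, 6))

# Assumption: normalize with a softmax across sources so the masks sum to 1
# in every time-frequency bin. The paper may constrain its masks differently.
exp_logits = np.exp(logits)
masks = exp_logits / exp_logits.sum(axis=0, keepdims=True)

estimates = apply_masks(mixture_mag, masks)
```

With masks normalized this way, the per-source estimates add back up to the mixture magnitude, which is a convenient sanity check for a toy setup like this one.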
doi:10.5281/zenodo.1492404 fatcat:dosh3dtjdnforatu2rs6svwize