Investigating Kernel Shapes and Skip Connections for Deep Learning-Based Harmonic-Percussive Separation

Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlback
2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
In this paper we propose an efficient deep learning encoder-decoder network for performing Harmonic-Percussive Source Separation (HPSS). We show that a dense arrangement of skip connections between the model layers greatly reduces the number of trainable parameters. We also explore different kernel sizes for the 2D filters of the convolutional layers, with the objective of allowing the network to learn the different time-frequency patterns associated with percussive and harmonic sources more efficiently. The model is trained and evaluated on the training and test sets of the MUSDB18 dataset. Results show that the proposed deep network learns high-level features automatically and maintains HPSS performance at a state-of-the-art level while reducing the number of parameters and the training time.
doi:10.1109/waspaa.2019.8937079 dblp:conf/waspaa/LordeloBDA19 fatcat:2ormcydurvcl3ohif56dgpvvda
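The two ideas in the abstract, dense skip connections and non-square kernels, can be illustrated with simple parameter arithmetic. The sketch below is not the authors' architecture: the layer counts, channel widths, growth rate, and kernel sizes are hypothetical values chosen only for illustration, and the function names are made up for this example. It assumes a spectrogram laid out as (frequency, time), where harmonic content tends to form ridges along the time axis and percussive content along the frequency axis.

```python
# Illustrative parameter counting only; all sizes below are assumptions,
# not values taken from the paper.

def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Number of trainable parameters in a standard 2D convolution."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def plain_stack_params(num_layers, kernel, in_channels, width):
    """Plain stack: every layer maps to `width` output channels."""
    total, channels = 0, in_channels
    for _ in range(num_layers):
        total += conv2d_params(kernel[0], kernel[1], channels, width)
        channels = width
    return total


def dense_block_params(num_layers, kernel, in_channels, growth):
    """Densely connected block: each layer sees the concatenation of all
    earlier feature maps but only adds `growth` new channels."""
    total, channels = 0, in_channels
    for _ in range(num_layers):
        total += conv2d_params(kernel[0], kernel[1], channels, growth)
        channels += growth  # skip connection by concatenation
    return total


if __name__ == "__main__":
    # Four 3x3 layers on a single-channel input (hypothetical sizes).
    print("plain stack:", plain_stack_params(4, (3, 3), 1, 64))  # 111424
    print("dense block:", dense_block_params(4, (3, 3), 1, 16))  # 14464

    # Kernel shape as (frequency, time). A (1, 9) kernel spans time,
    # a (9, 1) kernel spans frequency; both cost the same as a 3x3 kernel.
    for shape in [(3, 3), (1, 9), (9, 1)]:
        print(shape, conv2d_params(shape[0], shape[1], 16, 16))  # 2320 each
```

With these assumed sizes, the dense block ends with 65 feature channels available to the next stage (1 input plus 4 x 16 new ones), comparable to the 64 of the plain stack, at roughly one eighth of the parameter count. The kernel comparison shows that changing the shape of a 9-tap filter redistributes its receptive field between time and frequency without changing the number of parameters.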