Vocal Pitch Extraction in Polyphonic Music Using Convolutional Residual Network
2019
Interspeech 2019
Pitch extraction, also known as fundamental frequency estimation, is a long-standing task in audio signal processing. Vocal pitch extraction in polyphonic music is especially challenging because of the accompaniment. So far, most deep learning approaches take the log mel spectrogram as input, which neglects phase information. Shallow networks have also been applied directly to the waveform, but they may not handle contaminated vocal data well. In this paper, a deep convolutional
doi:10.21437/interspeech.2019-2286
dblp:conf/interspeech/DongWL19
fatcat:jdda2ksquben3oyi4omqgf4f7m