1,045 Hits in 8.8 sec

Exploring Data Augmentation For Improved Singing Voice Detection With Neural Networks

Jan Schlüter, Thomas Grill
2015 Zenodo  
Last but not least, we thank Bernhard Lehner for fruitful discussions on singing voice detection.  ...  We also gratefully acknowledge the support of NVIDIA Corporation with the donation of a Tesla K40 GPU used for this research.  ...  "Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks", 16th International Society for Music Information Retrieval Conference, 2015. music information retrieval (MIR) -  ... 
doi:10.5281/zenodo.1417745 fatcat:qahjap4ypncxtlucfhbej36fs4

Melody Extraction On Vocal Segments Using Multi-Column Deep Neural Networks

Sangeun Kum, Changheun Oh, Juhan Nam
2016 Zenodo  
Pitch shifting proved to be an effective method of data augmentation for singing voice detection [15]. We will show that it works for singing melody extraction as well.  ...  Singing Voice Detection The MCDNN is trained with only voiced frames for pitch classification. Therefore, a separate singing voice detection step is necessary for the test phase.  ... 
doi:10.5281/zenodo.1414788 fatcat:7a65jbrknndivh6pnrt5aewvye

Zero-Mean Convolutions for Level-Invariant Singing Voice Detection

Jan Schlüter, Bernhard Lehner
2018 Zenodo  
As recently shown, such detectors have an important weakness: Since singing voice is correlated with sound level in training data, classifiers learn to become sensitive to input magnitude, and give different  ...  State-of-the-art singing voice detectors are based on classifiers trained on annotated examples.  ...  We also gratefully acknowledge the support of NVIDIA Corporation with the donation of two Tesla K40 GPUs and a Titan Xp GPU used for this research.  ... 
doi:10.5281/zenodo.1492412 fatcat:6cxxhrtukfgo5cf66adhs6r7jq

Transfer Learning for Improving Singing-Voice Detection in Polyphonic Instrumental Music

Yuanbo Hou, Frank K. Soong, Jian Luan, Shengchen Li
2020 Interspeech 2020  
By transferring the related knowledge to make up for the lack of well-labeled training data in S-VD, the proposed data augmentation method by transfer learning can improve S-VD performance with an F-score  ...  Hence, we propose a data augmentation method for S-VD by transfer learning.  ...  So a convolutional recurrent neural network (CRNN) is trained with a small set of data collected in the target task to detect the vocal frames.  ... 
doi:10.21437/interspeech.2020-1806 dblp:conf/interspeech/HouSLL20 fatcat:5matlrxh55bgrks2a6k3qs3n5u

Transfer Learning for Improving Singing-voice Detection in Polyphonic Instrumental Music [article]

Yuanbo Hou, Frank K. Soong, Jian Luan, Shengchen Li
2020 arXiv   pre-print
By transferring the related knowledge to make up for the lack of well-labeled training data in S-VD, the proposed data augmentation method by transfer learning can improve S-VD performance with an F-score  ...  Hence, we propose a data augmentation method for S-VD by transfer learning.  ...  So a convolutional recurrent neural network (CRNN) is trained with a small set of data collected in the target task to detect the vocal frames.  ... 
arXiv:2008.04658v1 fatcat:2dfxecqo6jadxovo4r5w6g6lhm

Melody Extraction from Polyphonic Music by Deep Learning Approaches: A Review [article]

Gurunath Reddy M, K. Sreenivasa Rao, Partha Pratim Das
2022 arXiv   pre-print
The available deep models have been categorized based on the type of neural network used and the output representation they use for predicting melody.  ...  The possible future directions to explore/improve the melody extraction methods are also presented in the paper.  ...  Authors of [34] proposed the deep CNN-based parallel networks for singing pitch extraction and singing voice detection.  ... 
arXiv:2202.01078v1 fatcat:ptmc2gl455ezrburr7lvbmpxqq

SVSGAN: Singing Voice Separation via Generative Adversarial Network [article]

Zhe-Cheng Fan, Yen-Lin Lai, Jyh-Shing Roger Jang
2017 arXiv   pre-print
In this paper, we propose a novel framework for singing voice separation using the generative adversarial network (GAN) with a time-frequency masking function.  ...  Experimental results on three datasets (MIR-1K, iKala and DSD100) show that performance can be improved by the proposed framework consisting of conventional networks.  ...  Post-processing with a Wiener filter at the output of neural networks and data augmentation [19] have been proposed to separate vocals and instruments.  ... 
arXiv:1710.11428v2 fatcat:75aswzvymje6pk2cmbhfu7pvo4

Semi-supervised learning using teacher-student models for vocal melody extraction

Sangeun Kum, Jing-Hua Lin, Li Su, Juhan Nam
2020 Zenodo  
We examine three setups of teacher-student models with different data augmentation schemes and loss functions.  ...  Finally, we show that the proposed SSL method allows a simple convolutional recurrent neural network model to achieve performance comparable to state-of-the-arts.  ...  However, it was not a self-training setting where the teacher model is repeatedly replaced with an improved student model. Schlüter explored the self-training for singing voice detection [17].  ... 
doi:10.5281/zenodo.4245374 fatcat:bsxj3gbl4nfxnemfyx2reozed4

Semi-supervised learning using teacher-student models for vocal melody extraction [article]

Sangeun Kum, Jing-Hua Lin, Li Su, Juhan Nam
2020 arXiv   pre-print
We examine three setups of teacher-student models with different data augmentation schemes and loss functions.  ...  Finally, we show that the proposed SSL method enables a baseline convolutional recurrent neural network model to achieve performance comparable to state-of-the-arts.  ...  However, it was not a self-training setting where the teacher model is repeatedly replaced with an improved student model. Schlüter explored the self-training for singing voice detection [17].  ... 
arXiv:2008.06358v1 fatcat:ipl4gguzcfhr7aifchpfilx7xu

Addressing the confounds of accompaniments in singer identification [article]

Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang
2020 arXiv   pre-print
Evaluation results on a benchmark dataset called the artist20 show that this data augmentation method greatly improves the accuracy of singer identification.  ...  We then investigate two means to train a singer identification model: by learning from the separated vocal only, or from an augmented set of data where we "shuffle-and-remix" the separated vocal tracks  ...  It has also been shown beneficial for MIR tasks such as singing voice detection and source separation [20] [21] [22] (but not yet for SID).  ... 
arXiv:2002.06817v1 fatcat:aleufy5aaffu3fyj7oguhraotm

Revisiting Singing Voice Detection: A quantitative review and the future outlook

Kyungyun Lee, Keunwoo Choi, Juhan Nam
2018 Zenodo  
Although several proposed algorithms have shown high performances, we argue that there is still room for improving the singing voice detection system.  ...  In order to identify the area of improvement, we first perform an error analysis on three recent singing voice detection systems.  ...  a convolutional neural network (CNN) [27] and a recurrent neural network (RNN) [11] .  ... 
doi:10.5281/zenodo.1492462 fatcat:53h3tjfovjdw3bsentsgg7hmpu

Revisiting Singing Voice Detection: a Quantitative Review and the Future Outlook [article]

Kyungyun Lee, Keunwoo Choi, Juhan Nam
2018 arXiv   pre-print
In order to identify the area of improvement, we first perform an error analysis on three recent singing voice detection systems.  ...  Although several proposed algorithms have shown high performances, we argue that there still is a room to improve to build a more robust singing voice detection system.  ...  ACKNOWLEDGEMENTS We thank Bernhard Lehner and Simon Leglaive for active discussion and code, Jeongsoo Park for sharing Ono's code.  ... 
arXiv:1806.01180v1 fatcat:5vwakdflyzde5fvdayilmz66lm

Knowledge Distillation for Singing Voice Detection [article]

Soumava Paul, Gurunath Reddy M, K Sreenivasa Rao, Partha Pratim Das
2021 arXiv   pre-print
Currently, two deep neural network-based methods, one based on CNN and the other on RNN, exist in literature that learn optimized features for the voice detection (VD) task and achieve state-of-the-art  ...  Singing Voice Detection (SVD) has been an active area of research in music information retrieval (MIR).  ...  Early voice detection approaches [4, 5, 6] usually relied on complex hand-engineered audio features, which have now gone out of favour with the advent of deep neural networks.  ... 
arXiv:2011.04297v2 fatcat:v5kxtkpwz5hbfmfo7pit6dqefi

Learning To Pinpoint Singing Voice From Weakly Labeled Examples

Jan Schlüter
2016 Zenodo  
DISCUSSION We have explored how to train CNNs for singing voice detection on coarsely annotated training data and still obtain temporally accurate predictions, closely matching performance of a network  ...  of singing voice with sub-second granularity.  ... 
doi:10.5281/zenodo.1417650 fatcat:25p7nki3czhwvm35fai4aueetq

Multiple F0 estimation in vocal ensembles using convolutional neural networks

Helena Cuesta, Brian McFee, Emilia Gomez
2020 Zenodo  
data configurations, including recordings with additional reverb.  ...  This paper addresses the extraction of multiple F0 values from polyphonic and a cappella vocal performances using convolutional neural networks (CNNs).  ...  The authors would like to thank Rodrigo Schramm and Emmanouil Benetos for sharing the BSQ and BC datasets for this research.  ... 
doi:10.5281/zenodo.4245434 fatcat:b2wpxk4e2vdktks43cpmfh5pvm