Monaural Score-Informed Source Separation For Classical Music Using Convolutional Neural Networks
2017
Zenodo
In this paper we introduce a monaural score-informed source separation framework for Western classical music using convolutional neural networks (CNN). ...
"Monaural score-informed source separation for classical music using convolutional neural networks", 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017. ...
doi:10.5281/zenodo.1416498
fatcat:ik4enccj6netzde76dbfqhg62a
Generating Data To Train Convolutional Neural Networks For Classical Music Source Separation
2017
Proceedings of the SMC Conferences
Acknowledgments: The TITAN X used for this research was donated by ... Proceedings of the 14th Sound and Music Computing Conference, July 5-8, Espoo, Finland, SMC2017-232 ...
We propose a timbre-informed and score-constrained system to train neural networks for monaural source separation of classical music mixtures. ...
Outlook: We proposed a method to generate training data for timbre-informed source separation methods using neural networks, in the context of classical music. ...
doi:10.5281/zenodo.1401922
fatcat:z4bq6dksynhodglcsybhby34iu
End-to-End Sound Source Separation Conditioned On Instrument Labels
[article]
2019
arXiv
pre-print
Can we perform an end-to-end music source separation with a variable number of sources using a deep learning model? ...
This approach leads to other types of conditioning such as audio-visual source separation and score-informed source separation. ...
We would like to thank Terry Um and Eric Jang for their support during the camp, and Mar- ...
arXiv:1811.01850v2
fatcat:thwvytvuzbg7hchrrxu4tgayte
End-to-end Sound Source Separation Conditioned on Instrument Labels
2019
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Can we perform an end-to-end sound source separation (SSS) with a variable number of sources using a deep learning model? ...
This paper presents an extension of the Wave-U-Net [1] model which allows end-to-end monaural source separation with a non-fixed number of sources. ...
ACKNOWLEDGEMENTS We would like to thank Terry Um and Eric Jang for their support during the camp, and the participants of Jeju Deep Learning Camp 2018 for useful discussions. ...
doi:10.1109/icassp.2019.8683800
dblp:conf/icassp/SlizovskaiaKHG19
fatcat:ampbcvbyt5hvdm64bugonz2g3i
Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
[article]
2018
arXiv
pre-print
Specifically, our loss function adopts the Wasserstein distance which directly measures the distribution distance between the separated sources and the real sources for each individual source. ...
We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. ...
(Huang et al. 2015) adopt deep recurrent neural networks to separate monaural sources. ...
arXiv:1711.04121v3
fatcat:onszkgjofjfa3l5qsespl2f2mm
Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
2018
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Specifically, our loss function adopts the Wasserstein distance which directly measures the distribution distance between the separated sources and the real sources for each individual source. ...
Separating audio mixtures into individual instrument tracks has been a standing challenge. We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. ...
., 2015] adopt deep recurrent neural networks to separate monaural sources. ...
doi:10.24963/ijcai.2018/636
dblp:conf/ijcai/ZhangYZ18a
fatcat:br6itktldzfclmjte4ktsfamqu
Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation
[article]
2021
arXiv
pre-print
A promising approach for speech dereverberation is based on supervised learning, where a deep neural network (DNN) is trained to predict the direct sound from noisy-reverberant speech. ...
i.e., that reverberation results from a linear convolution between a room impulse response (RIR) and a dry source signal. ...
Lu, "Complex Spectrogram Enhancement by Convolutional Neural Network with Multi-Metrics" ... "...quence Modeling for Time-Domain Single-Channel Speech Separation," in Proc. ...
arXiv:2108.07376v2
fatcat:3obh4vconnfqrmez5fzp37qc6y
The Effects of Noisy Labels on Deep Convolutional Neural Networks for Music Tagging
[article]
2017
arXiv
pre-print
Using a trained network, we compute label vector similarities, which are compared to ground-truth similarity. The results highlight several important aspects of music tagging and neural networks. ...
Deep neural networks (DNN) have been successfully applied to music classification including music tagging. ...
ACKNOWLEDGEMENTS This work is supported by EPSRC project (EP/L019981/1) 'Fusing Semantic and Audio Technologies for Intelligent Music Production and Consumption' and the European Commission H2020 research ...
arXiv:1706.02361v3
fatcat:aio2wo22r5hkfmdjeykj2dymtq
Evolving Multi-Resolution Pooling CNN for Monaural Singing Voice Separation
[article]
2020
arXiv
pre-print
Monaural Singing Voice Separation (MSVS) is a challenging task and has been studied for decades. Deep neural networks (DNNs) are the current state-of-the-art methods for MSVS. ...
Specifically, we propose a new multi-resolution Convolutional Neural Network (CNN) framework for MSVS namely Multi-Resolution Pooling CNN (MRP-CNN), which uses various-size pooling operators to extract ...
Monaural singing voice separation (MSVS), as an important research branch of music source separation (MSS), aims to separate the singing voice and the background music accompaniment from a single-channel ...
arXiv:2008.00816v1
fatcat:epkpzgcgtfeo7gwdgb5sotsqpm
Drum-Aware Ensemble Architecture for Improved Joint Musical Beat and Downbeat Tracking
[article]
2021
arXiv
pre-print
This paper presents a novel system architecture that integrates blind source separation with joint beat and downbeat tracking in musical audio signals. ...
The source separation module segregates the percussive and non-percussive components of the input signal, over which beat and downbeat tracking are performed separately and then the results are aggregated ...
Research on blind monaural source separation, which is concerned with segregating the sound sources involved in a monaural audio recording, has also seen remarkable progress in recent years thanks to deep ...
arXiv:2106.08685v1
fatcat:q7j7cl3vkbeo5fnyttqrogodi4
Monoaural Audio Source Separation Using Variational Autoencoders
2018
Interspeech 2018
Traditionally, discriminative training for source separation is proposed using deep neural networks or non-negative matrix factorization. ...
We introduce a monaural audio source separation framework using a latent generative model. ...
Source separation is a classic problem and has wide applications in automatic speech recognition, biomedical imaging, and music editing. ...
doi:10.21437/interspeech.2018-1140
dblp:conf/interspeech/PandeyKN18
fatcat:5js7izrmcjdvbcoxo5px7o2x5i
Informing Piano Multi-Pitch Estimation with Inferred Local Polyphony Based on Convolutional Neural Networks
2021
Electronics
To that aim, we propose a method for local polyphony estimation (LPE), which is based on convolutional neural networks (CNNs) trained in a supervised fashion to explicitly predict the degree of polyphony ...
In this work, we propose considering local polyphony information for multi-pitch estimation (MPE) in piano music recordings. ...
Furthermore, LPE could be useful in training supervised music source separation approaches by introducing constraints, depending on the degree of polyphony, and thereby reinforcing neural networks for ...
doi:10.3390/electronics10070851
fatcat:y5bf3eamw5cvdewohgtosdofj4
Self-Supervised Generation of Spatial Audio for 360 Video
[article]
2018
arXiv
pre-print
Our system consists of end-to-end trainable neural networks that separate individual sound sources and localize them on the viewing sphere, conditioned on multi-modal analysis of audio and 360 video frames ...
Using our approach, we show that it is possible to infer the spatial location of sound sources based only on 360 video and a mono audio track. ...
For example, [18] proposes a recurrent neural network for monaural separation of two speakers, [1, 12, 11] seek to isolate sound sources by leveraging synchronized visual information in addition to ...
arXiv:1809.02587v1
fatcat:2at6tsjuujaede254ueasdw3fu
Spatial Audio Scene Characterization (SASC): Automatic Localization of Front-, Back-, Up-, and Down-Positioned Music Ensembles in Binaural Recordings
2022
Applied Sciences
This paper demonstrates that the convolutional neural network (CNN) can be used to automatically localize music ensembles panned to the front, back, up, or down positions. ...
The network was developed using the repository of the binaural excerpts obtained by the convolution of multi-track music recordings with the selected sets of head-related transfer functions (HRTFs). ...
The VOICEBOX toolbox [40] was used in MATLAB to calculate the spectrograms.
Convolutional Neural Network — Network Topology: The well-proven AlexNet topology [41] was adopted for this work. ...
doi:10.3390/app12031569
fatcat:ve4xpfmfhfcufd4onpsbtswfp4
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks
2019
IEEE Journal on Selected Topics in Signal Processing
In this paper, we propose a convolutional recurrent neural network for joint sound event localization and detection (SELD) of multiple overlapping sound events in three-dimensional (3D) space. ...
The proposed method uses separately the phase and magnitude component of the spectrogram calculated on each audio channel as the feature, thereby avoiding any method- and array-specific feature extraction ...
Some of the classifiers include Gaussian mixture model (GMM)-hidden Markov model (HMM) [27], fully connected (FC) neural networks [28], recurrent neural networks (RNN) [29] [30] [31] [32], and convolutional ...
doi:10.1109/jstsp.2018.2885636
fatcat:rlips2i22ndv7mi4a4k726vhr4