A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2018; you can also visit the original URL.
The file type is application/pdf.
Monoaural Audio Source Separation Using Deep Convolutional Neural Networks
[chapter]
2017
Lecture Notes in Computer Science
In this paper we introduce a low-latency monaural source separation framework using a Convolutional Neural Network (CNN). ...
We use a CNN to estimate time-frequency soft masks which are applied for source separation. ...
Acknowledgments: The TITANX used for this research was donated by the NVIDIA Corporation. ...
doi:10.1007/978-3-319-53547-0_25
fatcat:k254yklw35f5phskmdu4avaniu
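The soft-mask idea in the snippet above can be sketched in a few lines (a minimal illustration only, assuming magnitude-spectrogram masks; the function and variable names are hypothetical, not from the paper):

```python
import numpy as np

def apply_soft_masks(mixture_mag, masks):
    """Apply per-source soft masks to a mixture magnitude spectrogram.

    mixture_mag: (freq, time) magnitude spectrogram of the mixture
    masks: (n_sources, freq, time) soft masks in [0, 1] that sum to 1
           across sources (e.g. produced by a network's softmax output)
    """
    return masks * mixture_mag[np.newaxis, :, :]  # (n_sources, freq, time)

# Toy example with two sources and normalized random masks
rng = np.random.default_rng(0)
mixture = rng.random((513, 100))
raw = rng.random((2, 513, 100))
masks = raw / raw.sum(axis=0, keepdims=True)  # masks sum to 1 per bin
estimates = apply_soft_masks(mixture, masks)
```

Because the masks sum to one in every time-frequency bin, the source estimates add back up to the mixture by construction.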
WildMix Dataset and Spectro-Temporal Transformer Model for Monoaural Audio Source Separation
[article]
2019
arXiv
pre-print
Monoaural audio source separation is a challenging research area in machine learning. ...
In this paper, we first introduce a challenging new dataset for monoaural source separation called WildMix. ...
DNN is a baseline that uses a fully connected deep neural network for separating the sources within a mixture (Grais, Sen, and Erdogan 2014). ...
arXiv:1911.09783v1
fatcat:b2qboubulrd7rnklsxolhpmdma
Music Source Separation Using Stacked Hourglass Networks
[article]
2018
arXiv
pre-print
In this paper, we propose a simple yet effective method for multiple music source separation using convolutional neural networks. ...
The proposed framework is able to separate multiple music sources using a single network. ...
Figure 2. Overall music source separation framework proposed in this paper. ...
arXiv:1805.08559v2
fatcat:rm3izf5p2rcabauhl7oowxspre
Music Source Separation Using Stacked Hourglass Networks
2018
Zenodo
In this paper, we propose a simple yet effective method for multiple music source separation using convolutional neural networks. ...
The proposed framework is able to separate multiple music sources using a single network. ...
Figure 2. Overall music source separation framework proposed in this paper. ...
doi:10.5281/zenodo.1492404
fatcat:dosh3dtjdnforatu2rs6svwize
Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
[article]
2018
arXiv
pre-print
We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. ...
Separating audio mixtures into individual instrument tracks has been a long-standing challenge. ...
(Grais and Plumbley 2017) develop a deep convolutional denoising auto-encoder for monoaural audio source separation, which outperforms deep feedforward neural networks. ...
arXiv:1711.04121v3
fatcat:onszkgjofjfa3l5qsespl2f2mm
On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training
[article]
2022
arXiv
pre-print
In this paper, we explore an improved framework to train a monoaural neural enhancement model for robust speech recognition. ...
It is found that unpaired clean speech is crucial for improving the quality of speech separated from real noisy speech. ...
Neural enhancement network The dense U-Net temporal convolutional network proposed by Wang et al. [2] is used for the monoaural speech enhancement. ...
arXiv:2205.01751v1
fatcat:ekpcd3bxxff23fvpvx2mym34fa
Monoaural Audio Source Separation Using Variational Autoencoders
2018
Interspeech 2018
Traditionally, discriminative training for source separation has been proposed using deep neural networks or non-negative matrix factorization. ...
We introduce a monaural audio source separation framework using a latent generative model. ...
Recently, deep (stacked) fully convolutional DAEs (CDAEs) have been used for audio single channel source separation (SCSS) [14]. ...
doi:10.21437/interspeech.2018-1140
dblp:conf/interspeech/PandeyKN18
fatcat:5js7izrmcjdvbcoxo5px7o2x5i
Investigating Kernel Shapes and Skip Connections for Deep Learning-Based Harmonic-Percussive Separation
2019
2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
In this paper we propose an efficient deep learning encoder-decoder network for performing Harmonic-Percussive Source Separation (HPSS). ...
The training and evaluation of the separation has been done using the training and test sets of the MUSDB18 dataset. ...
instrumental sources (drums, vocals, bass and other) is based on Deep Neural Networks (DNNs). ...
doi:10.1109/waspaa.2019.8937079
dblp:conf/waspaa/LordeloBDA19
fatcat:2ormcydurvcl3ohif56dgpvvda
Investigating kernel shapes and skip connections for deep learning-based harmonic-percussive separation
[article]
2019
arXiv
pre-print
In this paper we propose an efficient deep learning encoder-decoder network for performing Harmonic-Percussive Source Separation (HPSS). ...
The training and evaluation of the separation has been done using the training and test sets of the MUSDB18 dataset. ...
instrumental sources (drums, vocals, bass and other) is based on Deep Neural Networks (DNNs). ...
arXiv:1905.01899v2
fatcat:p352pl5vzzderexkq5n6nu4tv4
Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
2018
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Separating audio mixtures into individual instrument tracks has been a standing challenge. We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. ...
Specifically, our loss function adopts the Wasserstein distance which directly measures the distribution distance between the separated sources and the real sources for each individual source. ...
[Chandna et al., 2017] demonstrate the superiority of convolutional neural network (CNN) in monoaural audio source separation. ...
doi:10.24963/ijcai.2018/636
dblp:conf/ijcai/ZhangYZ18a
fatcat:br6itktldzfclmjte4ktsfamqu
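The loss described above compares the distributions of separated and real sources with the Wasserstein distance. The core idea can be illustrated with SciPy's one-dimensional Wasserstein distance (an illustration only; the paper's adversarial training setup and energy-preservation term are not reproduced here, and the sample data is synthetic):

```python
import numpy as np
from scipy.stats import wasserstein_distance

# Treat samples of a separated source and the true source as empirical
# 1-D distributions and measure how far apart they are (toy example).
rng = np.random.default_rng(0)
true_source = rng.normal(0.0, 1.0, size=1000)
good_estimate = true_source + rng.normal(0.0, 0.05, size=1000)  # close
bad_estimate = rng.normal(2.0, 1.0, size=1000)                  # shifted

d_good = wasserstein_distance(true_source, good_estimate)
d_bad = wasserstein_distance(true_source, bad_estimate)
```

A separation that matches the true source distribution yields a smaller distance, which is what makes the metric usable as a training signal.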
Quality Enhancement of Overdub Singing Voice Recordings
2021
Zenodo
In this work, two neural network architectures for speech denoising – namely FullSubNet and Wave-U-Net – were trained and evaluated specifically on denoising of user singing voice recordings. ...
Wave-U-Net As a derivative of 2D U-Net source separation architectures as in [29] , Wave-U-Net [21] adapts the concept of deep convolutional networks for time-domain processing of audio signals. ...
In this, conventional approaches to AEC like adaptive filtering are being replaced by deep neural networks or improved with the use of artificial intelligence [17] . ...
doi:10.5281/zenodo.5553906
fatcat:elv437mgfvc63j6ktgxgdgiahq
Sound Event Detection Using Spatial Features and Convolutional Recurrent Neural Network
[article]
2017
arXiv
pre-print
We extend the convolutional recurrent neural network to handle more than one type of these multichannel features by learning from each of them separately in the initial stages. ...
This paper proposes to use low-level spatial features extracted from multichannel audio for sound event detection. ...
...nel audio source separation with deep neural networks," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2016. ... [21] "Detection and classification of acoustic scenes ..."
arXiv:1706.02291v1
fatcat:wzwmiskifvdrddlp5e4tuxpnxm
SLNSpeech: solving extended speech separation problem by the help of sign language
[article]
2020
arXiv
pre-print
Specifically, we use a 3D residual convolutional network to extract sign language features and a pretrained VGGNet model to extract visual features. ...
Then, we design a general deep learning network for the self-supervised learning of three modalities, particularly, using sign language embeddings together with audio or audio-visual information for better ...
[27] used a convolutional neural network (CNN) to predict time-frequency masking for speech separation and Fu et al. ...
arXiv:2007.10629v1
fatcat:umggllkdofcjtc5ub6ugc2wpkq
Learning to Separate Object Sounds by Watching Unlabeled Video
[article]
2018
arXiv
pre-print
We show how the recovered disentangled bases can be used to guide audio source separation to obtain better-separated, object-level sounds. ...
Our work is the first to learn audio source separation from large-scale "in the wild" videos containing multiple audio sources per video. ...
In contrast, our goal is to separate multiple audio sources from a monoaural signal by leveraging learned audio-visual associations. ...
arXiv:1804.01665v2
fatcat:epqocpix6fhr5hwcamphkwaqsy
CNN-Based Acoustic Scene Classification System
2021
Electronics
Second, using the same feature, depthwise separable convolution was applied to the Convolutional layer to develop a low-complexity model. ...
One is that the audio recorded using different recording devices should be classified in general, and the other is that the model used should have low-complexity. ...
The audio is provided in a binaural 48 kHz 24-bit format. Convolutional neural networks (CNNs) are deep neural networks that are commonly used for visual image analysis. ...
doi:10.3390/electronics10040371
fatcat:pgzg4pac4vgzngvmen4qmv6pkq
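The low-complexity claim in the entry above rests on depthwise separable convolution, whose parameter saving over a standard convolution is easy to check directly (a generic sketch with illustrative channel sizes, not the paper's exact model):

```python
def conv_params(in_ch, out_ch, k):
    """Parameter count of a standard k x k convolution (no bias)."""
    return in_ch * out_ch * k * k

def depthwise_separable_params(in_ch, out_ch, k):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution (no bias)."""
    return in_ch * k * k + in_ch * out_ch

# Example layer: 64 input channels, 128 output channels, 3 x 3 kernel
std = conv_params(64, 128, 3)                 # 64 * 128 * 9 = 73728
sep = depthwise_separable_params(64, 128, 3)  # 576 + 8192   = 8768
```

For this layer the separable variant needs roughly 8.4x fewer parameters, which is the trade-off behind the low-complexity model.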
Showing results 1 — 15 out of 40 results