LaFurca: Iterative Refined Speech Separation Based on Context-Aware Dual-Path Parallel Bi-LSTM
[article]
2020
arXiv
pre-print
In this paper, we propose several improvements of dual-path BiLSTM based network for end-to-end approach to monaural speech separation. ...
Deep neural network with dual-path bi-directional long short-term memory (BiLSTM) block has been proved to be very effective in sequence modeling, especially in speech separation, e.g. DPRNN-TasNet. ...
The remainder of this paper is organized as follows: section 2 introduces end-to-end monaural speech separation based on deep neural networks with dual-path BiLSTM blocks. ...
arXiv:2001.08998v4
fatcat:2e36uxzugjgcdo47jsg4qe2ptm
Speech Separation Using Convolutional Neural Network and Attention Mechanism
2020
Discrete Dynamics in Nature and Society
This paper proposes a speech separation model based on convolutional neural networks and attention mechanism. ...
Compared to the typical speech separation model DRNN-2 + discrim, this method achieves 0.27 dB GNSDR gain and 0.51 dB GSIR gain, which illustrates that the speech separation model proposed in this paper ...
methods such as model-based methods and speech enhancement methods. (2) Newer methods using DNNs (Deep Neural Networks). ...
doi:10.1155/2020/2196893
doaj:c5b71d75a44b42daaa0de1388bf01d6b
fatcat:7nm7niv3trfcjpxfm5x4huwvoy
Multi-Microphone Complex Spectral Mapping for Speech Dereverberation
[article]
2020
arXiv
pre-print
This study proposes a multi-microphone complex spectral mapping approach for speech dereverberation on a fixed array geometry. ...
Experimental results on multi-channel speech dereverberation demonstrate the effectiveness of the proposed approach. ...
on monaural dereverberation. ...
arXiv:2003.01861v1
fatcat:7yaudakexrddzcyq5gpbm6flka
A Two-Stage Phase-Aware Approach for Monaural Multi-Talker Speech Separation
2020
IEICE transactions on information and systems
Recently, deep neural networks have dramatically improved the speech separation performance. ...
The study implements the MISI algorithm based on the mask and gives that the ideal amplitude mask (IAM) is the optimal mask for the mask-based MISI phase recovery, which brings less phase distortion. ...
In recent years, neural network-based speech separation has attracted increasing attention. ...
doi:10.1587/transinf.2019edp7259
fatcat:tdksupmtszh2hfm7qs5zhrce2a
Integrating Spectral and Spatial Features for Multi-Channel Speaker Separation
2018
Interspeech 2018
Strong separation performance has been observed on a spatialized reverberant version of the wsj0-2mix corpus. ...
This paper tightly integrates spectral and spatial information for deep learning based multi-channel speaker separation. ...
We introduce two types of directional features, one based on compensating IPDs and the other based on beamforming. ...
doi:10.21437/interspeech.2018-1940
dblp:conf/interspeech/WangW18
fatcat:lataq7hgebdhzbwhitx7oabdzm
Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models
[article]
2016
arXiv
pre-print
The proposed method of the IBM team consists of an intermediate speech separation step followed by single-talker speech recognition. ...
This paper reconsiders the task of this challenge based on gain adapted factorial speech processing models. ...
The method presented in this paper is a model based approach based on factorial speech processing models for recognizing monaural mixed-speech signals which is applied for the "Monaural speech separation ...
arXiv:1610.01367v1
fatcat:rlcka7fkrzafbjk2bfhppewm6i
Deep Bayesian Unsupervised Source Separation Based on a Complex Gaussian Mixture Model
[article]
2019
arXiv
pre-print
In addition, the pre-trained network can be used not only for conducting monaural separation but also for efficiently initializing a multichannel separation algorithm. ...
The proposed method uses a cost function based on a spatial model called a complex Gaussian mixture model (cGMM). ...
[14] trained a monaural separation network by using source signals estimated by applying K-means clustering on interchannel phase differences (IPDs) between two microphones. ...
arXiv:1908.11307v1
fatcat:34gjxbsexbhynie7whhoqzpmqu
Knowledge Distillation for End-to-End Monaural Multi-Talker ASR System
2019
Interspeech 2019
The proposed methods are evaluated on two-speaker mixed speech generated from the WSJ0 corpus, which is commonly used for this task recently. ...
End-to-end models for monaural multi-speaker automatic speech recognition (ASR) have become an important and interesting approach when dealing with the multi-talker mixed speech under cocktail party scenario ...
Experiments have been carried out on the PI supercomputer at Shanghai Jiao Tong University. ...
doi:10.21437/interspeech.2019-3192
dblp:conf/interspeech/ZhangCQ19
fatcat:yfdpkyqngff4hfvukclj5blf44
Feature Joint-State Posterior Estimation in Factorial Speech Processing Models using Deep Neural Networks
[article]
2017
arXiv
pre-print
The experiments compare the proposed network decoding results to those of the vector Taylor series method and show 2.3% absolute performance improvement in the monaural speech separation and recognition ...
This paper proposes a new method for calculating joint-state posteriors of mixed-audio features using deep neural networks to be used in factorial speech processing models. ...
Based on this assumption, we propose the following three steps for training a deep neural network for extracting joint-state posteriors: the generative phase, initializing joint-state layer weights, and ...
arXiv:1707.02661v1
fatcat:fbdytm5mkbepfn2gxbu56jxlle
Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation
[article]
2020
arXiv
pre-print
Despite the recent advances in deep learning based close-talk speech separation, the applications to real-world are still an open issue. ...
Target speech separation refers to extracting the target speaker's speech from mixed signals. ...
learning-based MSS methods, including Freq-BLSTM based speech separation methods, multi-channel deep clustering (DC) [40] and neural spatial filter [75]. ...
arXiv:2001.00391v1
fatcat:bb33mmziofhfzd673ytisr4dwy
Iterative Deep Neural Networks for Speaker-Independent Binaural Blind Speech Separation
2018
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
In this paper, we propose an iterative deep neural network (DNN)-based binaural source separation scheme, for recovering two concurrent speech signals in a room environment. ...
Index Terms: deep neural network, binaural blind speech separation, spectral and spatial, iterative DNN ...
Introduction: Deep neural networks (DNN) [1] have recently been exploited in the field of blind source separation [2], e.g., to extract target speech corrupted by background noise [3] [4] [5] [6] [ ...
doi:10.1109/icassp.2018.8462603
dblp:conf/icassp/Liu0JWC18
fatcat:gjlmw2uwajbq3g4o2wre5fsedi
Binaural speaker identification using the equalization-cancelation technique
2020
EURASIP Journal on Audio, Speech, and Music Processing
Simulation results show the superiority of the proposed method in all experimental conditions. ...
The equalization-cancelation algorithm is employed to enhance the input test speech and alleviate the detrimental effects of noise and reverberation in the speaker identification system. ...
As one of the binaural speech segregation methods, the mask is estimated by employing a deep neural network (DNN) classification method [47, 48].
doi:10.1186/s13636-020-00188-y
fatcat:uo65ddsjdbebzm5luwcr6ypt3q
Table of Contents
2021
IEEE/ACM Transactions on Audio Speech and Language Processing
Mesgarani Time-Domain Audio Source Separation With Neural Networks Based on Multiresolution Analysis ... Saruwatari Conditioned Source Separation for Musical Instrument Performances ...
... Wang Monaural Speech Separation Using Speaker Embedding From Preliminary Separation ... J. ...
Speech Enhancement and Separation ...
doi:10.1109/taslp.2021.3137066
fatcat:ocit27xwlbagtjdyc652yws4xa
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
[article]
2021
arXiv
pre-print
Assuming a fixed array geometry between training and testing, we train deep neural networks (DNN) to predict the real and imaginary (RI) components of target speech at a reference microphone from the RI ...
Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry. ...
using a convolutional encoder-decoder neural network (see Figure 4). ...
arXiv:2010.01703v2
fatcat:huvvxizr2jhjlhugtwk4kr7kze
Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement
[article]
2020
arXiv
pre-print
This work introduces sequential neural beamforming, which alternates between neural network based spectral separation and beamforming based spatial separation. ...
Our neural networks for separation use an advanced convolutional architecture trained with a novel stabilized signal-to-noise ratio loss function. ...
arXiv:1911.07953v3
fatcat:ruylaknm6jftzamouvkfx4akza