Deep attractor network for single-microphone speaker separation

Zhuo Chen, Yi Luo, Nima Mesgarani
2017 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
We propose a novel deep learning framework for single channel speech separation by creating attractor points in high dimensional embedding space of the acoustic signals which pull together the time-frequency  ...  Despite the overwhelming success of deep learning in various speech processing tasks, the problem of separating simultaneous speakers in a mixture remains challenging.  ...  John Hershey and Jonathan Le Roux of Mitsubishi Electric Research Lab for constructive discussions.  ... 
doi:10.1109/icassp.2017.7952155 pmid:29430212 pmcid:PMC5805382 fatcat:u5k3uzdizjdmrmfexz3nbld5mi

Cracking the cocktail party problem by multi-beam deep attractor network [article]

Zhuo Chen, Jinyu Li, Xiong Xiao, Takuya Yoshioka, Huaming Wang, Zhenghao Wang, Yifan Gong
2018 arXiv   pre-print
Then each beamformed signal is fed into a single-channel anchored deep attractor network to generate separated signals.  ...  While recent progresses in neural network approaches to single-channel speech separation, or more generally the cocktail party problem, achieved significant improvement, their performance for complex mixtures  ...  deep attractor network.  ... 
arXiv:1803.10924v1 fatcat:tfijy4ujn5cjxjcuga73mhcvci

Cracking the cocktail party problem by multi-beam deep attractor network

Zhuo Chen, Jinyu Li, Xiong Xiao, Takuya Yoshioka, Huaming Wang, Zhenghao Wang, Yifan Gong
2017 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)  
Then each beamformed signal is fed into a single-channel anchored deep attractor network to generate separated signals.  ...  While recent progresses in neural network approaches to single-channel speech separation, or more generally the cocktail party problem, achieved significant improvement, their performance for complex mixtures  ...  deep attractor network.  ... 
doi:10.1109/asru.2017.8268969 dblp:conf/asru/ChenLXYWWG17 fatcat:wvygddpmu5cabgcdtk2eh3psjm

Speaker Identification using Machine Learning

2019 VOLUME-8 ISSUE-10, AUGUST 2019, REGULAR ISSUE  
Whatever the modern achievement of deep learning for several terminology processing tasks, single-microphone, speaker-independent speech separation remains difficult for just two main things.  ...  a speaker using a blend of speakers together with the aid of neural networks employing deep learning.  ...  RELATED WORK Deep Attractor Network for unmarried Microphone Speaker Separation A publication deep learning frame for only channel speech adjustment by producing attractor points in high dimensional embed  ... 
doi:10.35940/ijitee.l3179.119119 fatcat:mkcuxoc7qbgnxibufqxhqztqqu

Speaker-Independent Speech Separation With Deep Attractor Network

Yi Luo, Zhuo Chen, Nima Mesgarani
2018 IEEE/ACM Transactions on Audio Speech and Language Processing  
Despite the recent success of deep learning for many speech processing tasks, single-microphone, speaker-independent speech separation remains challenging for two main reasons.  ...  We propose a novel deep learning framework for speech separation that addresses both of these issues.  ...  Dong Yu of Tencent AI Lab for constructive discussions. Yi Luo and Zhuo Chen contributed equally to this work.  ... 
doi:10.1109/taslp.2018.2795749 fatcat:kyznh4g3orgudiwfepgv7v6u3q

Integrating Spectral and Spatial Features for Multi-Channel Speaker Separation

Zhong-Qiu Wang, DeLiang Wang
2018 Interspeech 2018  
This paper tightly integrates spectral and spatial information for deep learning based multi-channel speaker separation.  ...  The key idea is to localize individual speakers so that an enhancement network can be used to separate the speaker from an estimated direction and with specific spectral characteristics.  ...  Recent studies [18] , [19] apply single-channel deep clustering on each microphone signal to derive a T-F masking based beamformer for each source for separation.  ... 
doi:10.21437/interspeech.2018-1940 dblp:conf/interspeech/WangW18 fatcat:lataq7hgebdhzbwhitx7oabdzm

Exploring the time-domain deep attractor network with two-stream architectures in a reverberant environment [article]

Hangting Chen, Pengyuan Zhang
2021 arXiv   pre-print
Deep attractor networks (DANs) perform speech separation with discriminative embeddings and speaker attractors.  ...  and separation tasks under the condition of a variable number of speakers.  ...  Speaker-independent speech separation with deep attractor network. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 26, 787-796. Luo, Y., & Mesgarani, N. (2018). Tasnet  ... 
arXiv:2007.00272v4 fatcat:3teqp2wc5zdefgt5z63bteifmi

Efficient Integration of Multi-channel Information for Speaker-independent Speech Separation [article]

Yuichiro Koyama, Oluwafemi Azeez, Bhiksha Raj
2020 arXiv   pre-print
Although deep-learning-based methods have markedly improved the performance of speech separation over the past few years, it remains an open question how to integrate multi-channel signals for speech separation  ...  We propose two methods, namely, early-fusion and late-fusion methods, to integrate multi-channel information based on the time-domain audio separation network, which has been proven effective in single-channel  ...  scene analysis (CASA)-based approaches [16] , and the deep attractor network [17] , have achieved a high level of success.  ... 
arXiv:2005.11612v2 fatcat:u2pb2daeuvd63dail4n5br2mru

Monaural Audio Speaker Separation with Source Contrastive Estimation [article]

Cory Stephenson, Patrick Callier, Abhinav Ganesh, Karl Ni
2017 arXiv   pre-print
We propose an algorithm to separate simultaneously speaking persons from each other, the "cocktail party problem", using a single microphone.  ...  Our approach involves a deep recurrent neural networks regression to a vector space that is descriptive of independent speakers.  ...  DC is related to another approach, deep attractor networks (DA) [12] .  ... 
arXiv:1705.04662v1 fatcat:xb5au2ofknambjmp5kxrkbkhne

Integration of neural networks and probabilistic spatial models for acoustic blind source separation

Lukas Drude, Reinhold Haeb-Umbach
2019 IEEE Journal on Selected Topics in Signal Processing  
We formulate a generic framework for blind source separation (BSS), which allows integrating data-driven spectrotemporal methods, such as deep clustering and deep attractor networks, with physically motivated  ...  student neural network.  ...  Deep clustering DC is a technique which aims to blindly separate unseen speakers in a single-channel mixture.  ... 
doi:10.1109/jstsp.2019.2912565 fatcat:brneboukgneg3npnuqx4phgsom

Recognizing Multi-talker Speech with Permutation Invariant Training [article]

Dong Yu, Xuankai Chang, Yanmin Qian
2017 arXiv   pre-print
In this paper, we propose a novel technique for direct recognition of multiple speech streams given the single channel of mixed speech, without first separating them.  ...  PIT-ASR forces all the frames of the same speaker to be aligned with the same output layer. This strategy elegantly solves the label permutation problem and speaker tracing problem in one shot.  ...  ., speech separation and recognition are two separate components. Chen et al. [26] proposed a similar technique called deep attractor network (DANet).  ... 
arXiv:1704.01985v4 fatcat:2h4y2kkosbf6jaymago7lhm6mi

Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition [article]

Sunit Sivasankaran, Emmanuel Vincent, Dominique Fohr
2019 arXiv   pre-print
speaker using a neural network.  ...  Given the speaker location information, speech separation is performed in three stages.  ...  Single-channel approaches include clustering-based methods such as deep clustering [3] and deep attractor networks [4] where a neural network is trained to cluster together the time-frequency bins  ... 
arXiv:1910.11114v1 fatcat:hguauilxtzbqboobfksmxoidne

Efficient Integration of Fixed Beamformers and Speech Separation Networks for Multi-Channel Far-Field Speech Separation

Zhuo Chen, Takuya Yoshioka, Xiong Xiao, Jinyu Li, Michael L. Seltzer, Yifan Gong
2018 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
The beam prediction network takes in the beamformed audio signals and estimates the best beam for each speaker constituting the input mixture.  ...  Speech separation research has significantly progressed in recent years thanks to the rapid advances in deep learning technology.  ...  Deep clustering (DC) [2, 3] and deep attractor networks [4] are two representative embedding-based methods.  ... 
doi:10.1109/icassp.2018.8461930 dblp:conf/icassp/ChenYXLSG18 fatcat:tyvbcjdupzb7vn4hpqhwk4g7me

Speaker-independent auditory attention decoding without access to clean speech sources

Cong Han, James O'Sullivan, Yi Luo, Jose Herrero, Ashesh D. Mehta, Nima Mesgarani
2019 Science Advances  
We utilize a novel speech separation algorithm to automatically separate speakers in mixed audio, with no need for the speakers to have prior training.  ...  Our results show that auditory attention decoding with automatically separated speakers is as accurate and fast as using clean speech sounds.  ...  One such approach is the deep attractor network [DAN; (10, 11) ].  ... 
doi:10.1126/sciadv.aav6134 pmid:31106271 pmcid:PMC6520028 fatcat:q75aswckhzduraem7qjoncbawi

Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information

Rongzhi Gu, Lianwu Chen, Shi-Xiong Zhang, Jimeng Zheng, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu
2019 Interspeech 2019  
direction, for target speaker separation.  ...  The recent exploration of deep learning for supervised speech separation has significantly accelerated the progress on the multi-talker speech separation problem.  ...  We also trained a single target speaker network to separate the speaker of interest.  ... 
doi:10.21437/interspeech.2019-2266 dblp:conf/interspeech/GuCZZXYSZ019 fatcat:ebrxte7o2fhvzdoybevt57dpvm