
2020 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 28

2020 IEEE/ACM Transactions on Audio Speech and Language Processing  
..., +, TASLP 2020 2461-2475. Microphone Array Wiener Post Filtering Using Monotone Operator Splitting.  ...  ..., +, TASLP 2020 1065-1078. Microphone Array Wiener Post Filtering Using Monotone Operator Splitting.  ...  Target tracking: Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments. Zhang, Q., +, TASLP 2020 1183-1197.  ...
doi:10.1109/taslp.2021.3055391 fatcat:7vmstynfqvaprgz6qy3ekinkt4

Table of Contents

2020 IEEE/ACM Transactions on Audio Speech and Language Processing  
Mitianoudis 2025. Microphone Array Wiener Post Filtering Using Monotone Operator Splitting ...  ...  Sarti 1755. Speech Enhancement Using Masking for Binaural Reproduction of Ambisonics Signals ... M. Lugasi and B.  ...
doi:10.1109/taslp.2020.3046148 fatcat:hirdphjf6zeqdjzwnwlwlamtb4

Supervised Speech Separation Based on Deep Learning: An Overview [article]

DeLiang Wang, Jitong Chen
2018 arXiv   pre-print
Then we discuss three main components of supervised separation: learning machines, training targets, and acoustic features.  ...  A more recent approach formulates speech separation as a supervised learning problem, where the discriminative patterns of speech, speakers, and background noise are learned from training data.  ...  We thank Masood Delfarah for help in manuscript preparation and Jun Du, Yu Tsao, Yuxuan Wang, Yong Xu, and Xueliang Zhang for helpful comments on an earlier version.  ... 
arXiv:1708.07524v2 fatcat:bvaa2yuppffppnta2lfpkk4v4m
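
A note on the "training targets" component mentioned in the snippet above: a common target in this line of work is a time-frequency mask computed from parallel clean and noise signals. The following is a minimal illustrative sketch (not code from the paper) of the ideal ratio mask, assuming simulated clean-speech and noise STFT magnitudes are available:

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5):
    """Ideal ratio mask (IRM) as a supervised training target.

    speech_mag, noise_mag: STFT magnitude arrays of shape (freq, time)
    from a parallel clean-speech / noise pair (requires simulated mixtures).
    beta=0.5 gives the square-root form commonly used in the literature.
    """
    speech_pow = speech_mag ** 2
    noise_pow = noise_mag ** 2
    return (speech_pow / (speech_pow + noise_pow + 1e-12)) ** beta

# A trained network predicts this mask from noisy features; enhancement
# then applies the predicted mask to the noisy STFT:
#   enhanced_stft = predicted_mask * noisy_stft
```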

Distant speech separation using predicted time–frequency masks from spatial features

Pasi Pertilä, Joonas Nikunen
2015 Speech Communication  
Microphone arrays have long been studied for processing of distant speech. This work uses a feed-forward neural network for mapping a microphone array's spatial features into a T-F mask.  ...  A Wiener filter is used as the desired mask for training the neural network using speech examples in a simulated setting.  ...  Acknowledgments The corresponding author wishes to thank Finnish Academy project no. 138803 for its role in funding the research.  ... 
doi:10.1016/j.specom.2015.01.006 fatcat:uugqlfwwezaxvpmvmpr5guksbm
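
The abstract outlines the pipeline: spatial features from the microphone array are mapped by a feed-forward network to a T-F mask, with an oracle Wiener filter as the training target. A hypothetical PyTorch sketch of that mapping (layer sizes and feature dimensionality are illustrative assumptions, not the authors' configuration):

```python
import torch
import torch.nn as nn

# Hypothetical dimensionality: one spatial-feature vector per T-F bin
# (e.g., inter-channel phase/level differences), one mask value out.
N_SPATIAL_FEATURES = 6

mask_net = nn.Sequential(
    nn.Linear(N_SPATIAL_FEATURES, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
    nn.Sigmoid(),          # mask values constrained to [0, 1]
)

def wiener_mask_target(speech_pow, noise_pow):
    """Oracle Wiener filter |S|^2 / (|S|^2 + |N|^2), usable as the
    desired mask when training on simulated data."""
    return speech_pow / (speech_pow + noise_pow + 1e-12)

# Training step (sketch): predicted mask vs. Wiener target, MSE loss.
loss_fn = nn.MSELoss()
features = torch.randn(1024, N_SPATIAL_FEATURES)   # dummy batch of T-F bins
target = torch.rand(1024, 1)                       # stands in for the Wiener target
loss = loss_fn(mask_net(features), target)
loss.backward()
```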

Table of Contents

2021 IEEE/ACM Transactions on Audio Speech and Language Processing  
Qu. Binaural Auralization of Microphone Array Room Impulse Responses Using Causal Wiener Filtering ...  ...  Beliakov. Signal Enhancement and Restoration: Multi-Metric Optimization Using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement ...  ...  Speech Enhancement and Separation  ...
doi:10.1109/taslp.2021.3137066 fatcat:ocit27xwlbagtjdyc652yws4xa

ODAS: Open embeddeD Audition System [article]

François Grondin, Dominic Létourneau, Cédric Godin, Jean-Samuel Lauzon, Jonathan Vincent, Simon Michaud, Samuel Faucher, François Michaud
2022 arXiv   pre-print
Artificial audition aims at providing hearing capabilities to machines, computers and robots.  ...  It presents key features of ODAS, along with cases illustrating its uses in different robots and artificial audition applications.  ...  FM supervised and led the team. FUNDING This work was supported by FRQNT -Fonds recherche Québec Nature et Technologie.  ... 
arXiv:2103.03954v2 fatcat:23uao44psrckzf62z7nwambmha

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments [article]

Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, Björn Schuller
2018 arXiv   pre-print
In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech with the aim of providing guidelines for  ...  Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge.  ...  The beamformer output is often further enhanced by a microphone array post-filter [119], [120].  ... 
arXiv:1705.10874v3 fatcat:evdhqnj7eraa5jiolakuf4mf3e
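
On the remark that a beamformer's output is often further enhanced by a microphone array post-filter: the sketch below illustrates the two-stage idea with a plain delay-and-sum beamformer followed by a simple Wiener-style gain. It is a generic illustration under stated assumptions, not the specific post-filters of refs. [119], [120]:

```python
import numpy as np

def delay_and_sum(mic_stft, steering):
    """mic_stft: (mics, freq, time) complex STFT; steering: (mics, freq)
    phase-alignment vector toward the target direction."""
    aligned = mic_stft * np.conj(steering)[:, :, None]
    return aligned.mean(axis=0)                      # (freq, time)

def wiener_post_filter(beam_stft, noise_psd):
    """Apply a simple Wiener gain using an estimate of the residual
    noise power spectral density at the beamformer output."""
    sig_pow = np.abs(beam_stft) ** 2
    gain = np.maximum(sig_pow - noise_psd[:, None], 0.0) / (sig_pow + 1e-12)
    return gain * beam_stft
```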

Table of Contents [EDICS]

2020 IEEE/ACM Transactions on Audio Speech and Language Processing  
Ono 503. Microphone Array Wiener Post Filtering Using Monotone Operator Splitting ...  ...  Saruwatari 1948. End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features ...  ...
doi:10.1109/taslp.2020.3046150 fatcat:easrxuwl6zdppejsrf4bskxfw4

Multi-Channel Speech Denoising for Machine Ears [article]

Cong Han, E. Merve Kaya, Kyle Hoefer, Malcolm Slaney, Simon Carlile
2022 arXiv   pre-print
This work describes a speech denoising system for machine ears that aims to improve speech intelligibility and the overall listening experience in noisy environments.  ...  We recorded approximately 100 hours of audio data with reverberation and moderate environmental noise using a pair of microphone arrays placed around each of the two ears and then mixed sound recordings  ...  To reduce the residual noise, the beamforming output can be filtered by the mask used for beamforming [14] or can be processed by a new post enhancement neural network [17] or even more iterations  ... 
arXiv:2202.08793v1 fatcat:mecjvl63o5g2fli5jthtgaqkee
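
The snippet mentions filtering the beamforming output by the same mask used for beamforming. A rough sketch of that option, assuming speech and noise T-F masks have already been estimated (e.g., by a neural network); the MVDR formulation and reference-channel choice below follow one common convention and are not necessarily the paper's:

```python
import numpy as np

def spatial_covariance(mic_stft, mask):
    """Mask-weighted spatial covariance per frequency.
    mic_stft: (mics, freq, time) complex; mask: (freq, time) in [0, 1]."""
    weighted = mic_stft * mask[None, :, :]
    cov = np.einsum('mft,nft->fmn', weighted, np.conj(mic_stft))
    return cov / (mask.sum(axis=-1)[:, None, None] + 1e-12)

def mvdr_then_mask(mic_stft, speech_mask, noise_mask, ref_ch=0):
    """Mask-based MVDR beamforming, then the mask reused as a post-filter."""
    phi_s = spatial_covariance(mic_stft, speech_mask)
    phi_n = spatial_covariance(mic_stft, noise_mask)
    n_mics = mic_stft.shape[0]
    eye = 1e-6 * np.eye(n_mics)
    # MVDR weights: (Phi_n^-1 Phi_s) / trace(Phi_n^-1 Phi_s) applied to e_ref
    num = np.linalg.solve(phi_n + eye, phi_s)                 # (freq, mics, mics)
    w = num[:, :, ref_ch] / (np.trace(num, axis1=1, axis2=2)[:, None] + 1e-12)
    beamformed = np.einsum('fm,mft->ft', np.conj(w), mic_stft)
    return speech_mask * beamformed                           # mask as post-filter
```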

ODAS: Open embeddeD Audition System

François Grondin, Dominic Létourneau, Cédric Godin, Jean-Samuel Lauzon, Jonathan Vincent, Simon Michaud, Samuel Faucher, François Michaud
2022 Frontiers in Robotics and AI  
Artificial audition aims at providing hearing capabilities to machines, computers and robots.  ...  It presents key features of ODAS, along with cases illustrating its uses in different robots and artificial audition applications.  ...  Increasing the microphone array aperture provides better discrimination at low frequencies, which is well-suited for speech; 3) using a directivity model for closed arrays with microphones installed on  ... 
doi:10.3389/frobt.2022.854444 pmid:35634264 pmcid:PMC9131248 fatcat:hlz7mpq3xvcezgbw277ck2fhf4

Auditory System for a Mobile Robot [article]

Jean-Marc Valin
2016 arXiv   pre-print
We demonstrate that it is possible to implement these capabilities using an array of microphones, without trying to imitate the human auditory system.  ...  Separation of simultaneous sound sources is achieved using a variant of the Geometric Source Separation (GSS) algorithm, combined with a multi-source post-filter that further reduces noise, interference  ...  In this case, the Q-learning unsupervised learning algorithm is used instead of supervised learning, which is most commonly used in the field of speech recognition.  ... 
arXiv:1602.06652v1 fatcat:o2df2aufl5hkln37rafn2cpr2q
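
For the GSS-plus-post-filter architecture described above, the second stage attenuates residual stationary noise and leakage from the other separated sources. The following is a loose sketch of a multi-source post-filter gain; the leakage coefficient and the Wiener-style gain rule are simplifying assumptions, not the thesis's exact estimator:

```python
import numpy as np

def multi_source_post_filter(source_stfts, stationary_noise_psd, leakage=0.25):
    """source_stfts: (sources, freq, time) complex outputs of the separation stage.
    For each source, treat stationary noise plus a fraction of the other separated
    sources' power as interference, and apply a Wiener-style gain per T-F bin."""
    power = np.abs(source_stfts) ** 2                        # (S, F, T)
    total = power.sum(axis=0)                                # (F, T)
    out = np.empty_like(source_stfts)
    for s in range(source_stfts.shape[0]):
        interference = leakage * (total - power[s])          # leakage from other sources
        noise = stationary_noise_psd[:, None] + interference
        snr = power[s] / (noise + 1e-12)                     # crude a-priori SNR estimate
        gain = snr / (1.0 + snr)
        out[s] = gain * source_stfts[s]
    return out
```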

Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates [article]

Jiri Malek, Zbynek Koldovsky, Marek Bohac
2019 arXiv   pre-print
This work addresses the problem of block-online processing for multi-channel speech enhancement.  ...  Moreover, word error rate (WER) achieved by a baseline automatic speech recognition system is evaluated, for which the enhancement method serves as a front-end solution.  ...  do not use post-filtering.  ... 
arXiv:1905.03632v3 fatcat:iqbvzk2r7jhxpoxmbozimnbckq

Audiovisual Information Fusion in Human–Computer Interfaces and Intelligent Environments: A Survey

Shankar T. Shivappa, Mohan Manubhai Trivedi, Bhaskar D. Rao
2010 Proceedings of the IEEE  
Microphones and cameras have been extensively used to observe and detect human activity and to facilitate natural modes of interaction between humans and intelligent systems.  ...  In this paper we describe the fusion strategies and the corresponding models used in audiovisual tasks such as speech recognition, tracking, biometrics, affective state recognition and meeting scene analysis  ...  We sincerely thank the reviewers for their valuable advice, which has helped us enhance the content as well as the presentation of the paper.  ... 
doi:10.1109/jproc.2010.2057231 fatcat:lfzgfmn2hjdq7h6o5txva3oapq

Mic2Mic

Akhil Mathur, Anton Isopoussu, Fahim Kawsar, Nadia Berthouze, Nicholas D. Lane
2019 Proceedings of the 18th International Conference on Information Processing in Sensor Networks - IPSN '19  
In this work, we propose Mic2Mic -- a machine-learned system component -- which resides in the inference pipeline of audio models and reduces, in real time, the variability in audio data caused by microphone-specific  ...  With these in mind, we apply the principles of cycle-consistent generative adversarial networks (CycleGANs) to learn Mic2Mic using unlabeled and unpaired data collected from different microphones.  ...  Only recently, there has been an increased focus on using GANs for speech enhancement and speech generation.  ... 
doi:10.1145/3302506.3310398 dblp:conf/ipsn/MathurIKBL19 fatcat:gdr6htyeczaqlirpsto757grly
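
The key ingredient named in the snippet is the cycle-consistency objective, which is what lets Mic2Mic train on unpaired recordings from different microphones. A minimal PyTorch-style sketch of that term (generator and discriminator architectures omitted; all names are illustrative, not from the paper):

```python
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(gen_a2b, gen_b2a, audio_a, audio_b):
    """audio_a, audio_b: unpaired feature batches from microphones A and B.
    Translating A->B->A (and B->A->B) should reconstruct the input, which is
    what allows CycleGAN-style training without paired recordings."""
    rec_a = gen_b2a(gen_a2b(audio_a))
    rec_b = gen_a2b(gen_b2a(audio_b))
    return l1(rec_a, audio_a) + l1(rec_b, audio_b)

# Used alongside the usual adversarial losses, e.g.:
#   total_gen_loss = adv_a2b + adv_b2a + lambda_cyc * cycle_consistency_loss(...)
```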

Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation [article]

Zhong-Qiu Wang and Gordon Wichern and Jonathan Le Roux
2021 arXiv   pre-print
A promising approach for speech dereverberation is based on supervised learning, where a deep neural network (DNN) is trained to predict the direct sound from noisy-reverberant speech.  ...  In this work, we propose to exploit this linear-filter structure within a deep learning based monaural speech dereverberation framework.  ...  with beamforming and post-filtering.  ... 
arXiv:2108.07376v2 fatcat:3obh4vconnfqrmez5fzp37qc6y
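
On the linear-filter structure mentioned in the abstract: given a DNN estimate of the direct sound, a linear filter can be fit per frequency to predict the reverberant mixture, and the predicted reverberation subtracted. The sketch below is a simplified, WPE-like variant of that idea applied to the DNN estimate, not the paper's exact formulation:

```python
import numpy as np

def convolutive_prediction(dnn_est, mixture, taps=20, delay=1):
    """Per-frequency least-squares filter that predicts the reverberant tail
    of the mixture from delayed frames of the DNN estimate of the direct
    sound, then subtracts that prediction.

    dnn_est, mixture: (freq, time) complex STFTs.  The result can be refined
    further by a second network, as in two-stage dereverberation systems."""
    n_freq, n_time = mixture.shape
    out = np.empty_like(mixture)
    for f in range(n_freq):
        # Matrix of delayed copies of the DNN estimate at this frequency.
        A = np.zeros((n_time, taps), dtype=complex)
        for k in range(taps):
            d = delay + k
            A[d:, k] = dnn_est[f, :n_time - d]
        g, *_ = np.linalg.lstsq(A, mixture[f], rcond=None)
        out[f] = mixture[f] - A @ g          # remove predicted reverberation
    return out
```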
Showing results 1 to 15 of 917.