Exemplar-Based Sparse Representations for Noise Robust Automatic Speech Recognition

Jort F. Gemmeke, Tuomas Virtanen, Antti Hurmalainen
2011 IEEE Transactions on Audio, Speech, and Language Processing  
This paper proposes to use exemplar-based sparse representations for noise robust automatic speech recognition.  ...  Index Terms: Exemplar-based, noise robustness, non-negative matrix factorization, sparse representations, speech recognition.  ...  His research interests include content analysis of audio signals, sound source separation, noise-robust automatic speech recognition, and machine learning.  ... 
doi:10.1109/tasl.2011.2112350 fatcat:jqy52cczlzbcvjzcipsv4mpdoy

Noise-Robust Speech Recognition Through Auditory Feature Detection and Spike Sequence Decoding

Phillip B. Schafer, Dezhe Z. Jin
2014 Neural Computation  
Both the spike-based encoding scheme and the template-based decoding offer gains in noise robustness over traditional speech recognition methods.  ...  We present a system for noise-robust isolated word recognition that works by decoding sequences of spikes from a population of simulated auditory feature-detecting neurons.  ...  Jason Wittenbach and Sumithra Surendralal gave useful comments on the manuscript.  ... 
doi:10.1162/neco_a_00557 pmid:24320849 fatcat:5pn44owf3veanc5lt6x2itqi2y

Modelling non-stationary noise with spectral factorisation in automatic speech recognition

Antti Hurmalainen, Jort F. Gemmeke, Tuomas Virtanen
2013 Computer Speech and Language  
We also propose methods for reducing the size of the bases used for speech and noise modelling by 20-40 times for better practical applicability.  ...  Speech recognition systems intended for everyday use must be able to cope with a large variety of noise types and levels, including highly non-stationary multi-source mixtures.  ...  The final goal of the paper is to present the current state-of-the-art in spectral factorisation based, single-stream noise robust ASR through the use of spectrogram dynamics and binaural features.  ... 
doi:10.1016/j.csl.2012.07.008 fatcat:a6azihuwx5eipmln7ar7wivuaa

Exemplar-based speech enhancement for deep neural network based automatic speech recognition

Deepak Baby, Jort F. Gemmeke, Tuomas Virtanen, Hugo Van hamme
2015 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Deep neural network (DNN) based acoustic modelling has been successfully used for a variety of automatic speech recognition (ASR) tasks, thanks to its ability to learn higher-level information using multiple  ...  This paper investigates the recently proposed exemplar-based speech enhancement technique using coupled dictionaries as a pre-processing stage for DNN-based systems.  ...  INTRODUCTION Automatic speech recognition (ASR) in realistic conditions, where the acoustic data is mixed with a variety of noises and channel variations, is still a major research challenge.  ... 
doi:10.1109/icassp.2015.7178819 dblp:conf/icassp/BabyGVh15 fatcat:y5yn6hjxvzdjtntcyb32jnnl7e

Enhancing the complex-valued acoustic spectrograms in modulation domain for creating noise-robust features in speech recognition

Hsin-Ju Hsieh, Berlin Chen, Jeih-weih Hung
2015 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)  
The corresponding results demonstrate that under the clean-condition training setting, our proposed method can achieve performance competitive to or better than many widely used noise robustness methods  ...  , including the well-known advanced front-end (AFE), in speech recognition.  ...  CONCLUSIONS In this paper, we presented a novel use of PCA for enhancing the complex-valued acoustic spectrograms of speech signals in modulation domain for noise-robust speech recognition.  ... 
doi:10.1109/apsipa.2015.7415526 dblp:conf/apsipa/HsiehCH15 fatcat:6o2yyqv7tvbvzpvhoimqelaxpa

Feature enhancement using sparse reference and estimated soft-mask exemplar-pairs for noisy speech recognition

Lee Ngee Tan, Abeer Alwan
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
A feature enhancement technique for noise-robust speech recognition is proposed.  ...  Existing sparse exemplar-based feature enhancement methods use clean speech and pure noise Mel-spectral exemplars, or clean and noisy speech log-Mel-spectral exemplar-pairs, in their dictionaries.  ...  INTRODUCTION Although good speech recognition performance has been achieved for clean speech, automatic speech recognition (ASR) in noise is still a challenging problem.  ... 
doi:10.1109/icassp.2014.6853890 dblp:conf/icassp/TanA14 fatcat:m5zbr3hxkrdmvfbz7rez4wqdy4

Sparse coding of the modulation spectrum for noise-robust automatic speech recognition

Sara Ahmadi, Seyed Mohammad Ahadi, Bert Cranen, Lou Boves
2014 EURASIP Journal on Audio, Speech, and Music Processing  
Most previous research in automatic speech recognition converted this very rich representation into the equivalent of a sequence of short-time power spectra, mainly to simplify the computation of the posterior  ...  The modulation spectrum analyser uses 15 gammatone filters. The Hilbert envelope of the output of these filters is then processed by nine modulation frequency filters, with bandwidths up to 16 Hz.  ...  We would like to thank Hugo Van hamme and Jort Gemmeke for discussing feature extraction and sparse coding issues and Søren Jørgensen for making available the Matlab code for implementing the modulation  ... 
doi:10.1186/s13636-014-0036-3 fatcat:zo3yo5odnfhrpbrn6bluo2ms2u

Acquiring Variable Length Speech Bases For Factorisation-Based Noise Robust Speech Recognition

Antti Hurmalainen, Tuomas Virtanen
2013 Zenodo  
CONCLUSIONS We proposed methods for acquiring variable-length long-context speech bases for noise robust speech separation and recognition.  ...  Conventional automatic speech recognition (ASR) systems typically use frames of approximately 25 ms as their features, and Markovian state transition models which only consider temporal context of one  ... 
doi:10.5281/zenodo.43615 fatcat:pzzxg2ozbvexxbb4muwqvgxs2a

Mask estimation and imputation methods for missing data speech recognition in a multisource reverberant environment

Sami Keronen, Heikki Kallasjoki, Ulpu Remes, Guy J. Brown, Jort F. Gemmeke, Kalle J. Palomäki
2013 Computer Speech and Language  
We present an automatic speech recognition system that uses a missing data approach to compensate for challenging environmental noise containing both additive and convolutive components.  ...  To perform speech recognition using the partially observed data, the missing components are substituted with clean speech estimates computed using both sparse imputation and cluster-based GMM imputation  ...  The mask estimation method is evaluated in a missing data reconstruction-based automatic speech recognition task using the cluster-based imputation and sparse imputation methods.  ... 
doi:10.1016/j.csl.2012.06.005 fatcat:x7uuskic5bgqzkj4ixoml6u27e

Nonlinear filtering of spectrotemporal modulations in speech enhancement

Majid Mirbagheri, Nima Mesgarani, Shihab Shamma
2010 2010 IEEE International Conference on Acoustics, Speech and Signal Processing  
A monaural noise-suppression algorithm is proposed that nonlinearly manipulates the spectrotemporal modulations of speech as represented in a model of auditory cortical processing.  ...  A distinctive aspect of this approach is its consideration of the non-stationary dynamic behavior of speech that is captured using nonlinear filters, thus achieving excellent perceptual quality in the  ...  It can also play an important role in automatic speech recognition systems (ASR) by improving their robustness in noisy environments.  ... 
doi:10.1109/icassp.2010.5494912 dblp:conf/icassp/MirbagheriMS10 fatcat:qcqm7cvs5zc5dkipp5bl6ljpfq

Hybrid input spaces for exemplar-based noise robust speech recognition using coupled dictionaries

Deepak Baby, Hugo Van hamme
2015 2015 23rd European Signal Processing Conference (EUSIPCO)  
Exemplar-based feature enhancement successfully exploits a wide temporal signal context.  ...  When compared to the system which uses Mel features only as input exemplars, these hybrid input spaces are found to yield improved word error rates on the AURORA-2 database especially with unseen noise  ...  useful for noise-robust ASR [3] [4] [5].  ... 
doi:10.1109/eusipco.2015.7362669 dblp:conf/eusipco/Babyh15 fatcat:l2csi2xmnzbvvc4eo5x7d6v5m4

A dynamical pattern recognition model of gamma activity in auditory cortex

M. Zavaglia, R.T. Canolty, T.M. Schofield, A.P. Leff, M. Ursino, R.T. Knight, W.D. Penny
2012 Neural Networks  
At an algorithmic level, recognition is based on the use of Occurrence Time features.  ...  Using a speech digit database we show that for noisy recognition environments, these features rival standard cepstral coefficient features.  ...  First, we consider the algorithmic level and use a speech database to assess the usefulness of OT features as compared to standard features used in Automatic Speech Recognition (ASR) that are based on  ... 
doi:10.1016/j.neunet.2011.12.007 pmid:22327049 pmcid:PMC3314972 fatcat:hqi2tvxcq5cjfivzvwako6dd7q

Noise robust exemplar matching with coupled dictionaries for single-channel speech enhancement

Emre Yilmaz, Deepak Baby, Hugo Van hamme
2015 2015 23rd European Signal Processing Conference (EUSIPCO)  
In this paper, we propose a single-channel speech enhancement system based on the noise robust exemplar matching (N-REM) framework using coupled dictionaries.  ...  N-REM approximates noisy speech segments as a sparse linear combination of speech and noise exemplars that are stored in multiple dictionaries based on their length and associated speech unit.  ...  magnitude spectral features with perceptually motivated modulation spectrogram features.  ... 
doi:10.1109/eusipco.2015.7362508 dblp:conf/eusipco/YilmazBh15 fatcat:wvylu4f64zfw3mqjibm35dr45e
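
The snippet above describes approximating a noisy segment as a sparse linear combination of speech and noise exemplars. A minimal sketch of that idea follows. It assumes a single fixed dictionary and KL-divergence multiplicative updates with an L1 penalty; the entry does not say which solver N-REM uses. The function name `sparse_activations`, its parameters, and the toy dictionary are hypothetical and for illustration only.

```python
# Sketch only: non-negative sparse coding of one feature vector against a
# fixed exemplar dictionary. The solver (KL multiplicative updates with an
# L1 penalty) is an assumption, not taken from the listed paper.

def sparse_activations(y, A, sparsity=0.1, iterations=200, eps=1e-12):
    """Return non-negative weights x such that A*x approximates y.

    y: list of F non-negative floats (one observed feature vector)
    A: list of F rows, each holding K non-negative floats (K exemplars)
    """
    n_feat, n_atoms = len(A), len(A[0])
    x = [1.0] * n_atoms  # strictly positive start, required by the updates
    col_sums = [sum(A[f][k] for f in range(n_feat)) for k in range(n_atoms)]
    for _ in range(iterations):
        # Current reconstruction; eps guards against division by zero.
        approx = [sum(A[f][k] * x[k] for k in range(n_atoms)) + eps
                  for f in range(n_feat)]
        ratio = [y[f] / approx[f] for f in range(n_feat)]
        for k in range(n_atoms):
            numer = sum(A[f][k] * ratio[f] for f in range(n_feat))
            # The L1 penalty enters the denominator and shrinks activations.
            x[k] *= numer / (col_sums[k] + sparsity)
    return x


# Toy dictionary: columns 0 and 1 stand in for speech exemplars,
# column 2 for a noise exemplar (hypothetical values).
dictionary = [
    [1.0, 0.0, 0.2],
    [0.5, 1.0, 0.2],
    [0.0, 0.5, 1.0],
]
observation = [2.2, 1.2, 1.0]  # equals 2 * column 0 + 1 * column 2
weights = sparse_activations(observation, dictionary, sparsity=0.01)
speech_weights, noise_weights = weights[:2], weights[2:]
```

Splitting the weights by column group gives separate speech and noise contributions, which is the quantity an enhancement stage would typically build on.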

Voice conversion and spoofing attack on speaker verification systems

Zhizheng Wu, Haizhou Li
2013 2013 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference  
We design a speaker verification system to automatically accept or reject the claimed identity of a speaker.  ...  The major concern when deploying speaker verification technology is whether a system is robust against spoofing attacks.  ...  Usually, automatic speech recognition is required to extract high level features. Spectro-temporal features involve prosodic, temporal modulation features, etc.  ... 
doi:10.1109/apsipa.2013.6694344 dblp:conf/apsipa/WuL13 fatcat:epl5drbs3bhypjvcatoajhnvxm

A new approach for classification of dolphin whistles

Mahdi Esfahanian, Hanqi Zhuang, Nurgun Erdol
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Most accurate methods to identify dolphin whistles are tedious and not robust, especially in the presence of ocean noise.  ...  The classifier uses the ℓ1-norm to select a match.  ...  For instance, 2-D spectrotemporal filters have been employed on Mel-scale spectrogram patterns for automatic speech recognition using the SRC technique resulting in an impressive decrease of 28% in word  ... 
doi:10.1109/icassp.2014.6854763 dblp:conf/icassp/EsfahanianZE14 fatcat:6vi32qyyaratdgeb7nqsewxnna