A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2016; you can also visit the original URL. The file type is application/pdf.
Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR
2015
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Performance was evaluated in the framework of a hybrid neural network-hidden Markov model (NN-HMM) system on the TIMIT phoneme sequence recognition task. ...
The best accuracy was obtained by high-level combination of two-dimensional cochleogram-spectrogram features using a CNN, achieving up to 8.2% relative phoneme error rate (PER) reduction over single CNN features ...
Then, we construct hybrid DNN-HMM and CNN-HMM systems for the phoneme sequence recognition task. Here the HMM tied triphone states are used as the neural network target classes. ...
doi:10.1109/icassp.2015.7178827
dblp:conf/icassp/TjandraSNTAN15
fatcat:rwpcq7xdwnenlhlf6rtoyaseiy
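The abstract above describes a high-level (late) combination of cochleogram and spectrogram streams before the output layer. As a generic illustration of that idea only (not the authors' implementation; all shapes, kernels, and the number of target classes below are hypothetical), a minimal NumPy sketch of two-stream fusion feeding tied-state posteriors might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, k):
    """Valid 2-D cross-correlation of a single-channel image with one kernel."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def stream_features(x, kernel):
    """One CNN 'stream': convolution, ReLU, then pooling over the time axis."""
    h = np.maximum(conv2d(x, kernel), 0.0)
    return h.mean(axis=1)  # keep the frequency axis

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy time-frequency inputs standing in for a cochleogram and a spectrogram
coch = rng.standard_normal((16, 20))
spec = rng.standard_normal((16, 20))

k1, k2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))

# High-level combination: concatenate the two streams' features
fused = np.concatenate([stream_features(coch, k1), stream_features(spec, k2)])

W = rng.standard_normal((40, fused.size)) * 0.1  # 40 hypothetical tied-state targets
posteriors = softmax(W @ fused)
print(posteriors.shape)
```

The fusion point (after per-stream feature extraction, before the classifier) is what distinguishes this "high-level" combination from simply stacking the two images at the input.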
A Novel Noise Immune, Fuzzy Approach to Speaker Independent, Isolated Word Speech Recognition
2006
2006 World Automation Congress
The task is based on conversion of the speech spectrogram into a linguistic fuzzy description and comparison of this representation with similar linguistic descriptions of words. ...
The method is tested against a widely used speech recognition approach and shows significantly higher robustness to noise. ...
compared with word recognition using phonemes, resulting in greater possible discrimination. ...
doi:10.1109/wac.2006.376025
fatcat:7ucs5lkdbnhxzg3vlk4ckw5zjq
Modelling human speech recognition in challenging noise maskers using machine learning
2020
Acoustical Science and Technology
signal (as is the case for most auditory model-based SRT predictions). ...
in modulated noise, it is shown that the DNN is listening in the dips. ...
The algorithm traces back the activations from the output (phoneme) layer to the input (spectrogram) layer. ...
doi:10.1250/ast.41.94
fatcat:uqf7hko6tjdc3ah4w4mcvh723e
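Tracing activations from the output (phoneme) layer back to the input (spectrogram) layer, as the snippet describes, resembles gradient-based saliency. A minimal sketch of that general idea (not the paper's exact trace-back algorithm; the tiny network and all sizes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny one-hidden-layer network: flattened spectrogram patch -> phoneme scores
W1 = rng.standard_normal((32, 100)) * 0.1
W2 = rng.standard_normal((10, 32)) * 0.1

x = rng.standard_normal(100)      # stand-in for a flattened spectrogram patch

h_pre = W1 @ x
h = np.maximum(h_pre, 0.0)        # ReLU hidden layer
scores = W2 @ h
k = int(np.argmax(scores))        # winning "phoneme" output unit

# Backpropagate the winning score down to the input: d scores[k] / d x
dh = W2[k] * (h_pre > 0)          # gradient through the ReLU
saliency = W1.T @ dh              # relevance of each spectrogram bin
print(saliency.shape)
```

Large-magnitude entries of `saliency` mark the spectrogram bins the network "listened to" for its decision, which is how dip-listening behavior can be made visible.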
A Multistream Feature Framework Based on Bandpass Modulation Filtering for Robust Speech Recognition
2013
IEEE Transactions on Audio, Speech, and Language Processing
We adapt such duality in a multistream framework for robust speaker-independent phoneme recognition. ...
The proposed architecture results in substantial improvements over standard and state-of-the-art feature schemes for phoneme recognition, particularly in the presence of nonstationary noise, reverberation ...
Note that the hybrid HMM/MLP framework overcomes some of the limitations of the standard HMM/GMM systems [32] and achieves better phoneme recognition performance [34] in addition to having advantages ...
doi:10.1109/tasl.2012.2219526
pmid:29928166
pmcid:PMC6005699
fatcat:vejxooppfbamtgekhp4ulpfxqy
Development of Hausa Acoustic Model for Speech Recognition
2022
International Journal of Advanced Computer Science and Applications
In this regard, this research is concerned with the development of the Hausa acoustic model for automatic speech recognition. ...
This is done by creating a word-level phoneme dataset from the Hausa speech corpus database, then implementing a deep learning algorithm for acoustic modeling. ...
To see how successfully the model classified each auditory word in the test set, a confusion matrix was plotted, as displayed in Fig. 8. ...
doi:10.14569/ijacsa.2022.0130559
fatcat:v7hbdf6pi5d5lizfsvsytt6mkq
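A confusion matrix like the one referenced in Fig. 8 can be tallied directly from true and predicted class labels. A generic sketch (the word IDs below are hypothetical, not drawn from the Hausa corpus):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows = true class, columns = predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical word IDs for a 3-word test set
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, 3)
print(cm)

# Diagonal = correct classifications; off-diagonal = confusions
accuracy = np.trace(cm) / cm.sum()
print(round(accuracy, 3))  # → 0.714
```

Reading along a row shows which words a given true word was mistaken for, which is exactly what such a plot is used to inspect.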
On the relevance of auditory-based Gabor features for deep learning in robust speech recognition
2017
Computer Speech and Language
Previous studies support the idea of merging auditory-based Gabor features with deep learning architectures to achieve robust automatic speech recognition, however, the cause behind the gain of such combination ...
To explain the results, a measure of similarity between phoneme classes from DNN activations is proposed and linked to their acoustic properties. ...
Acknowledgment: This work was funded by the DFG (Cluster of Excellence 1077/1 Hearing4All, http://hearing4all.eu, and the SFB/TRR 31 "The Active Auditory System", http://www.sfb-trr31.unioldenburg.de/) ...
doi:10.1016/j.csl.2017.02.006
fatcat:7fdlydj3bja5hlo4ttdm2sf2ta
Recognition of human speech phonemes using a novel fuzzy approach
2007
Applied Soft Computing
To do so, the speech spectrogram is converted into a fuzzy linguistic description and this description is used instead of precise acoustic features. ...
or noise robust recognition. ...
The benchmark system is an HMM-based isolated phoneme recognition system with MFCC features [1, 7, 21, 22]. ...
doi:10.1016/j.asoc.2006.02.007
fatcat:7aju4mua7rd6foolb2raz26suy
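Converting a spectrogram into a fuzzy linguistic description, as the snippet describes, typically means mapping each energy value to memberships in terms like "low", "medium", "high". A sketch of the general idea only (the triangular membership functions, term set, and normalized energy range are assumptions, not the paper's definitions):

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b, zero outside [a, c]."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

def fuzzify(energy):
    """Map normalized energies to memberships in three linguistic terms."""
    return {
        "low":    tri(energy, -0.5, 0.0, 0.5),
        "medium": tri(energy,  0.0, 0.5, 1.0),
        "high":   tri(energy,  0.5, 1.0, 1.5),
    }

# A toy 3x4 normalized spectrogram patch (frequency x time)
patch = np.array([[0.1, 0.4, 0.9, 1.0],
                  [0.0, 0.5, 0.7, 0.8],
                  [0.2, 0.3, 0.6, 0.9]])

# Linguistic label per cell = term with the highest membership
terms = np.array(["low", "medium", "high"])
m = np.stack([fuzzify(patch)[t] for t in terms])  # (3 terms, 3 freq, 4 time)
labels = terms[m.argmax(axis=0)]
print(labels)
```

The resulting grid of linguistic labels, rather than precise acoustic features, is what gets compared against the stored word descriptions.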
Recognizing the message and the messenger: biomimetic spectral analysis for robust speech and speaker recognition
2012
International Journal of Speech Technology
However most speech processing systems, like automatic speech and speaker recognition systems, suffer from a significant drop in performance when speech signals are corrupted with unseen background distortions ...
such as speech and speaker recognition. ...
Parts of this analysis have been presented in (Nemala et al. 2012) . ...
doi:10.1007/s10772-012-9184-y
pmid:26412979
pmcid:PMC4579853
fatcat:3pdgp2tmsrbyleuw34vylnu3ay
Using Teager Energy Cepstrum And Hmm Distancesin Automatic Speech Recognition And Analysis Of Unvoiced Speech
2009
Zenodo
In this study, further analysis of the NAM speech has been made using distance measures between hidden Markov model (HMM) pairs. ...
In this study, the use of silicon NAM (Non-Audible Murmur) microphone in automatic speech recognition is presented. ...
Phoneme recognition experiment To evaluate the performance of NAM microphones and investigate the relationship between the HMM distance measures and the phoneme recognition accuracy, a phoneme recognition ...
doi:10.5281/zenodo.1055941
fatcat:ofistliqpfek5hwe23waxwwid4
Real-Time Speech Visualization System: Kannon - Applying Auditory Characteristics
2005
Zenodo
Publication in the conference proceedings of EUSIPCO, Antalya, Turkey, 2005 ...
The TDNN achieved a higher phoneme recognition rate than the HMM in [5]. ...
SPEECH RECOGNITION: In the previous KanNon system, we built the speech recognition component using the Microsoft Speech API, which is based on hidden Markov models (HMM). ...
doi:10.5281/zenodo.39288
fatcat:yottye4vdfgqzllakh6fhr3kni
Investigation of DNN-HMM and Lattice Free Maximum Mutual Information Approaches for Impaired Speech Recognition
2021
IEEE Access
The recognition accuracy is evaluated and compared using two datasets, namely 20 acoustically similar words and the 50-word Impaired Speech Corpus in Tamil. ...
Impaired speakers have difficulty in pronouncing words which results in partial or incomplete speech contents. ...
A bidirectional Deep Recurrent Neural Network (biRNN) based DNN-HMM is used for phoneme recognition [15] . ...
doi:10.1109/access.2021.3129847
fatcat:t4nx6vf32rdcbpmnduhtlqkp4m
Phoneme recognition using spectral envelope and modulation frequency features
2009
2009 IEEE International Conference on Acoustics, Speech and Signal Processing
These features are combined at the phoneme posterior level and used as features for a hybrid HMM-ANN phoneme recognizer. ...
We present a new feature extraction technique for phoneme recognition that uses short-term spectral envelope and modulation frequency features. ...
In our case, the auditory spectrogram, which is a two-dimensional representation of the input signal, is obtained by stacking the subband temporal envelopes in frequency (similar to the stacking of short-term ...
doi:10.1109/icassp.2009.4960618
dblp:conf/icassp/ThomasGH09
fatcat:ze4m2l5b5bhx3c2mzume5oosiu
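The construction described above, stacking subband temporal envelopes in frequency to form an auditory spectrogram, can be sketched as follows (an illustrative NumPy version; the FFT-mask filterbank, band edges, and frame size are assumptions, not the authors' front end):

```python
import numpy as np

def subband_envelopes(signal, sr, bands, frame=160):
    """Stack smoothed temporal envelopes of FFT-masked subbands into a
    time-frequency image (rows = subbands, columns = frames)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    rows = []
    for lo, hi in bands:
        mask = (freqs >= lo) & (freqs < hi)
        sub = np.fft.irfft(spectrum * mask, n=len(signal))  # bandpass signal
        env = np.abs(sub)                                   # rectified envelope
        n = len(env) // frame
        rows.append(env[: n * frame].reshape(n, frame).mean(axis=1))  # smooth
    return np.vstack(rows)

sr = 8000
t = np.arange(sr) / sr  # one second of toy audio
sig = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)

bands = [(0, 1000), (1000, 3000)]  # hypothetical band edges in Hz
image = subband_envelopes(sig, sr, bands)
print(image.shape)  # (2 subbands, 50 frames)
```

Each row tracks how energy in one band evolves over time, so stacking rows in frequency yields the two-dimensional representation the snippet describes.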
Toward optimizing stream fusion in multistream recognition of speech
2011
Journal of the Acoustical Society of America
Results on phoneme recognition from noisy speech indicate the effectiveness of the proposed method. ...
A multistream phoneme recognition framework is proposed based on forming streams from different spectrotemporal modulations of speech. ...
Figure caption: phoneme recognition system based on the hidden Markov model-artificial neural network (HMM-ANN) paradigm (Bourlard and Morgan, 1994), trained on clean speech using the TIMIT database. ...
doi:10.1121/1.3595744
pmid:21786862
fatcat:axh4yavezjgonbao2qraaot4lu
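A common baseline for fusing per-stream phoneme posteriors is confidence weighting, for example by inverse entropy, so that sharper streams dominate. A minimal sketch of that generic scheme (not necessarily the paper's fusion rule; the streams and values are hypothetical):

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a posterior vector (natural log)."""
    return -np.sum(p * np.log(p + 1e-12))

def fuse(stream_posteriors):
    """Weight each stream's phoneme posteriors by inverse entropy
    (more confident streams get more weight), then renormalize."""
    weights = np.array([1.0 / (entropy(p) + 1e-6) for p in stream_posteriors])
    weights /= weights.sum()
    fused = sum(w * p for w, p in zip(weights, stream_posteriors))
    return fused / fused.sum()

# Two hypothetical streams over 4 phoneme classes: one sharp, one flat
s1 = np.array([0.85, 0.05, 0.05, 0.05])  # confident stream
s2 = np.array([0.25, 0.25, 0.25, 0.25])  # uninformative stream

p = fuse([s1, s2])
print(p.argmax())
```

Because the flat stream carries high entropy, it is down-weighted and the fused posterior stays close to the confident stream's decision.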
A speech recognition method based on the sequential multi-layer perceptrons
1996
Neural Networks
A novel multi-layer perceptron (MLP)-based speech recognition method is proposed in this study. ...
In this method, the dynamic time warping capability of hidden Markov models (HMM) is directly combined with the discriminant-based learning of MLPs so as to employ a sequence of MLPs (SMLP) as ...
Spectrograms and related MLP outputs for these utterances are shown in parts (a) and (b); the Mandarin digit /~// was misrecognized as /r,/: (a) spectrogram; (b) output values of MLP1 corresponding to phonemes ...
doi:10.1016/0893-6080(95)00140-9
fatcat:gcwq3k7j6rbpjf2ql7wjuq67ga
Speaker-independent isolated digit recognition using an AER silicon cochlea
2011
2011 IEEE Biomedical Circuits and Systems Conference (BioCAS)
In fact, it is shown that despite the limited input dynamic range and the un-modelled nonlinearities produced by the hardware cochlea, the discriminative information present in its spike patterns can potentially be sufficient for a task as complex as speaker-independent isolated keyword recognition. ...
95.08%
Low-pass filtered spike trains - SVM: 95.58%
Radon spike counts - SVM: 93.79%
MFCC+Delta - SVM: 96.83%
Auditory spectrogram - SVM: 78.73%
MFCC - HMM: 99.70% ...
doi:10.1109/biocas.2011.6107779
fatcat:n6rmyt4oyfdk5gbqp2dwimqwv4
Showing results 1 — 15 out of 577 results