A copy of this work was available on the public web and has been preserved in the Wayback Machine (captured in 2017); the original URL can also be visited. The file type is application/pdf.
Learning state-dependent stream weights for multi-codebook HMM speech recognition systems
Proceedings of ICASSP '94. IEEE International Conference on Acoustics, Speech and Signal Processing
FUTURE WORK So far we have only performed experiments with streams. ...
G. Wilpon, "Discriminative feature selection for speech recognition", Computer Speech and Language, Vol. 7, pp. 229-246 (1993) [Hua92] Huang, X., All ...
doi:10.1109/icassp.1994.389316
dblp:conf/icassp/RoginaW94
fatcat:iuhyuhwdlbhijpykzcuwdjelzu
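The entry above concerns learning state-dependent stream weights for multi-codebook HMMs. As a rough illustration of the underlying scoring idea only (not the paper's learning procedure), the sketch below combines per-stream emission log-likelihoods with per-state weights; all array names and values are illustrative assumptions.

```python
import numpy as np

def combined_log_likelihood(stream_log_likelihoods, stream_weights):
    """Combine per-stream emission log-likelihoods with state-dependent weights.

    stream_log_likelihoods: (n_states, n_streams) array of log b_{j,s}(o_t),
        e.g. one codebook per stream in a discrete multi-codebook HMM.
    stream_weights: array of the same shape with weights lambda_{j,s};
        a common convention is that the weights of each state sum to one.
    Returns an (n_states,) array with log b_j(o_t) = sum_s lambda_{j,s} * log b_{j,s}(o_t).
    """
    return np.sum(stream_weights * stream_log_likelihoods, axis=1)

# Example: 3 states, 2 streams (e.g. cepstrum and delta-cepstrum codebooks).
ll = np.log(np.array([[0.20, 0.05],
                      [0.10, 0.30],
                      [0.02, 0.40]]))
w = np.array([[0.7, 0.3],
              [0.5, 0.5],
              [0.3, 0.7]])
print(combined_log_likelihood(ll, w))
```

In multi-codebook systems each stream typically corresponds to one codebook (cepstra, deltas, energy, and so on), and the per-state weights control how strongly each stream influences that state's score.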
Training combination strategy of multi-stream fused hidden Markov model for audio-visual affect recognition
2006
Proceedings of the 14th annual ACM international conference on Multimedia - MULTIMEDIA '06
Different from the weighting combination scheme, our approach is able to use a variety of learning methods to obtain a robust multi-stream fusion result. ...
To simulate the human ability to assess affects, an automatic affect recognition system should make use of multi-sensor information. ...
Performance comparison in clean audio condition among uni-stream HMM and multi-stream HMM, IHMM and MFHMM, and weighting and training combination schemes. ...
doi:10.1145/1180639.1180661
dblp:conf/mm/ZengHLFH06
fatcat:2gk5hgkjszht7eknl3twfx4kuy
Multi-stream Confidence Analysis for Audio-Visual Affect Recognition
[chapter]
2005
Lecture Notes in Computer Science
affect recognition. ...
We investigate the use of individual modality confidence measures as a means of estimating weights when combining likelihoods in the audio-visual decision fusion. ...
Lawrence Chen for collecting the valuable data in this paper for audio-visual affect recognition. ...
doi:10.1007/11573548_123
fatcat:se3zhvwbqrcbbega2qhudnk55q
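The snippet above describes using per-modality confidence measures to estimate weights for audio-visual decision fusion. A minimal sketch of that idea, under the simplifying assumption that the confidences are just normalized into fusion weights (the paper's actual estimation may differ); class counts and scores are made up.

```python
import numpy as np

def fuse_decisions(audio_loglik, visual_loglik, audio_conf, visual_conf):
    """Late (decision-level) fusion of audio and visual classifiers.

    audio_loglik, visual_loglik: (n_classes,) per-class log-likelihoods
        from the two modality-specific models.
    audio_conf, visual_conf: scalar confidence measures per modality,
        here simply normalized to sum to one and used as fusion weights.
    Returns the index of the class with the highest fused score.
    """
    total = audio_conf + visual_conf
    wa, wv = audio_conf / total, visual_conf / total
    fused = wa * audio_loglik + wv * visual_loglik
    return int(np.argmax(fused))

# Example with 4 affective classes and a more reliable visual channel.
audio = np.log(np.array([0.4, 0.3, 0.2, 0.1]))
visual = np.log(np.array([0.1, 0.2, 0.3, 0.4]))
print(fuse_decisions(audio, visual, audio_conf=0.2, visual_conf=0.8))
```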
Audio–Visual Affective Expression Recognition Through Multistream Fused HMM
2008
IEEE transactions on multimedia
Information processed by computer systems is limited to either face images or speech signals. ...
Using our Multi-stream Fused Hidden Markov Model (MFHMM), we analyzed coupled audio and visual streams to detect 4 cognitive states (interest, boredom, frustration and puzzlement) and 7 prototypical emotions ...
Multi-stream Fused HMM (MFHMM) For integrating coupled audio and visual features, we propose multi-stream fused HMM (MFHMM) which constructs a new structure linking the multiple component HMMs which is ...
doi:10.1109/tmm.2008.921737
fatcat:5dvcmrvcinfhxcc4yfnkzakoti
Integration of articulatory and spectrum features based on the hybrid HMM/BN modeling framework
2006
Speech Communication
Most of the current state-of-the-art speech recognition systems are based on speech signal parametrizations that crudely model the behavior of the human auditory system. ...
In all experiments involving both speaker-dependent and multi-speaker acoustic models, the HMM/BN system outperformed the baseline HMM system trained on acoustic data only. ...
systems using mobile terminals". ...
doi:10.1016/j.specom.2005.07.003
fatcat:fu4vrlyzmrfpxhr7aizqngi3ey
Discriminative speaker adaptation using articulatory features
2007
Speech Communication
The author would like to thank two anonymous reviewers for their input on an earlier draft of this paper. ...
states of the HMM to be used for the multi-stream system. ...
The problem of combining information from two (synchronous) sources using multi-stream HMMs has been studied in the context of audio-visual speech recognition and multi-band speech recognition, mostly ...
doi:10.1016/j.specom.2007.02.009
fatcat:3tcd3bragja7jl4awlzloilpo4
Hybrid NN/HMM-Based Speech Recognition with a Discriminant Neural Feature Extraction
1997
Neural Information Processing Systems
In this paper, we present a novel hybrid architecture for continuous speech recognition systems. ...
Experimental results show a relative error reduction of about 10% that we achieved on a remarkably good recognition system based on continuous HMMs for the Resource Management 1000-word continuous speech ...
MULTI STREAM SYSTEMS In HMM-based recognition systems the extracted features are often divided into streams that are modeled independently. ...
dblp:conf/nips/WillettR97
fatcat:kt6othuiyzf6jlm4d73c3pa4ku
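The "MULTI STREAM SYSTEMS" excerpt above notes that extracted features are often split into streams that are modeled independently. A small sketch of what independent stream modeling means for a single frame, using diagonal Gaussians as stand-in stream models; the dimensions, slices, and parameters are illustrative assumptions.

```python
import numpy as np

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def stream_log_likelihood(observation, stream_slices, stream_models):
    """Score one frame under independently modeled feature streams.

    observation: 1-D feature vector (e.g. static cepstra and deltas concatenated).
    stream_slices: slices that pick out each stream's dimensions.
    stream_models: one (mean, variance) pair per stream.
    With streams modeled independently, the frame's total log-likelihood
    is simply the sum of the per-stream log-likelihoods.
    """
    return sum(diag_gauss_logpdf(observation[sl], mean, var)
               for sl, (mean, var) in zip(stream_slices, stream_models))

# Example: a 6-dimensional frame split into a static stream and a delta stream.
obs = np.array([0.1, -0.3, 0.2, 0.05, 0.0, -0.1])
slices = [slice(0, 3), slice(3, 6)]
models = [(np.zeros(3), np.ones(3)), (np.zeros(3), 2.0 * np.ones(3))]
print(stream_log_likelihood(obs, slices, models))
```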
Hybrid NN/HMM acoustic modeling techniques for distributed speech recognition
2006
Speech Communication
Distributed speech recognition (DSR) where the recognizer is split up into two parts and connected via a transmission channel offers new perspectives for improving the speech recognition performance in ...
Word-based HMMs and phoneme-based HMMs are trained for distributed and non-distributed recognition using either MFCC or RASTA-PLP features. ...
General system architecture for distributed speech recognition As already mentioned, the need for distributed speech recognition (DSR) is evident. ...
doi:10.1016/j.specom.2006.01.007
fatcat:2n2rj7qc4bgntec4447kk6asy4
Statistical parametric speech synthesis with a novel codebook-based excitation model
2014
International Journal of Intelligent Decision Technologies
During the synthesis stage the codebook is searched for a suitable element in each voiced frame and these are concatenated to create the excitation signal, from which the final synthesized speech is created ...
The decomposition is implemented by speech coders. We apply a novel codebook-based speech coding method to model the excitation of speech. ...
Acknowledgements We would like to thank the listeners for participating in the subjective test. We thank the two anonymous reviewers for the helpful comments and suggestions. ...
doi:10.3233/idt-140197
fatcat:rr7tm5etobhhbldsvqwrnffvia
Voice Conversion
[chapter]
2012
Speech Enhancement, Modeling and Recognition- Algorithms and Applications
HMM modeling of speech HMM-based speech synthesis provides a flexible framework for speech synthesis, where all speech features can be modeled simultaneously within the same multi-stream HMM. ...
Linguistic information has not traditionally been considered in the existing VC systems but is of high interest for example in the field of speech recognition. ...
The chapters cover important fields in speech processing such as speech enhancement, noise cancellation, multi-resolution spectral analysis, voice conversion, speech recognition and emotion recognition ...
doi:10.5772/37334
fatcat:2hgxvblj4rccvasfudopppuiau
Audiovisual Information Fusion in Human–Computer Interfaces and Intelligent Environments: A Survey
2010
Proceedings of the IEEE
In this paper we describe the fusion strategies and the corresponding models used in audiovisual tasks such as speech recognition, tracking, biometrics, affective state recognition and meeting scene analysis ...
intelligent systems. ...
We sincerely thank the reviewers for their valuable advice which has helped us enhance the content as well as the presentation of the paper. ...
doi:10.1109/jproc.2010.2057231
fatcat:lfzgfmn2hjdq7h6o5txva3oapq
Deep sparse auto-encoder features learning for Arabic text recognition
2021
IEEE Access
We propose a novel hybrid network, combining a Bag-of-Feature (BoF) framework for feature extraction based on a deep Sparse Auto-Encoder (SAE), and Hidden Markov Models (HMMs), for sequence recognition ...
In this work, we introduce a new deep learning based system that recognizes Arabic text contained in images. ...
The last step was recognition, during which the HMM models were simultaneously decoded according to the multi-stream formalism. ...
doi:10.1109/access.2021.3053618
fatcat:p7jhbokjsjbunceuq4lu7xnmci
Characteristics of the use of coupled hidden Markov models for audio-visual polish speech recognition
2012
Bulletin of the Polish Academy of Sciences: Technical Sciences
This paper focuses on combining audio-visual signals for Polish speech recognition in conditions of the highly disturbed audio speech signal. ...
A significant increase in recognition effectiveness and processing speed was noted during tests - for properly selected CHMM parameters and an adequate codebook size, besides the use of the appropriate ...
The following steps describe the Viterbi algorithm for the two stream coupled HMM used in our audio-visual system. ...
doi:10.2478/v10175-012-0041-6
fatcat:xk45sxtkq5dppdbdifs3pavz4q
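The excerpt above refers to a Viterbi algorithm for the two-stream coupled HMM used in that audio-visual system. The sketch below decodes over the product state space under a simplifying assumption that each stream's transition depends only on its own previous state; a full coupled HMM conditions each stream's transition on both previous states, so this is background intuition rather than the paper's algorithm, and all matrices in the example are made up.

```python
import numpy as np

def coupled_viterbi(log_A_audio, log_A_video, log_B_audio, log_B_video,
                    log_pi_audio, log_pi_video):
    """Viterbi decoding over the product state space of two streams.

    log_A_*: (n, n) log transition matrices per stream.
    log_B_*: (T, n) per-frame emission log-likelihoods per stream.
    log_pi_*: (n,) initial log probabilities per stream.
    Returns the most likely joint state sequence as (audio_state, video_state) pairs.
    """
    T, n = log_B_audio.shape
    m = log_B_video.shape[1]
    delta = (log_pi_audio[:, None] + log_pi_video[None, :]
             + log_B_audio[0][:, None] + log_B_video[0][None, :])
    psi = np.zeros((T, n, m, 2), dtype=int)
    for t in range(1, T):
        # scores[i_prev, j_prev, i, j]: best path score ending in joint state (i, j).
        scores = (delta[:, :, None, None]
                  + log_A_audio[:, None, :, None]
                  + log_A_video[None, :, None, :])
        flat = scores.reshape(n * m, n, m)
        best_prev = flat.argmax(axis=0)
        delta = flat.max(axis=0) + log_B_audio[t][:, None] + log_B_video[t][None, :]
        psi[t, :, :, 0], psi[t, :, :, 1] = np.unravel_index(best_prev, (n, m))
    i, j = np.unravel_index(delta.argmax(), delta.shape)
    path = [(int(i), int(j))]
    for t in range(T - 1, 0, -1):
        i, j = psi[t, i, j]
        path.append((int(i), int(j)))
    return path[::-1]

# Tiny example: 2 audio states, 2 video states, 3 frames of random scores.
rng = np.random.default_rng(0)
logA = np.log(np.array([[0.7, 0.3], [0.4, 0.6]]))
logB_a = np.log(rng.uniform(0.1, 1.0, size=(3, 2)))
logB_v = np.log(rng.uniform(0.1, 1.0, size=(3, 2)))
logpi = np.log(np.array([0.5, 0.5]))
print(coupled_viterbi(logA, logA, logB_a, logB_v, logpi, logpi))
```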
An analysis-by-synthesis approach to vocal tract modeling for robust speech recognition
2012
Qatar Foundation Annual Research Forum Proceedings
I enjoyed learning from his wisdom and experience in life as much as I enjoyed learning and deeply understanding the basic issues related to signal processing and speech recognition from him. ...
Together they are an encyclopedia on ideas related to speech recognition and have contributed to this field for more than a decade now. ...
A Knowledge-Based Approach to the Speech Recognition Problem State-of-the-art speech recognition systems use Hidden Markov Models (HMMs) which are composed of states and observations. ...
doi:10.5339/qfarf.2012.aesnp6
fatcat:awqcncfewvaytnkfstmvehwcvm
A multidimensional dynamic time warping algorithm for efficient multimodal fusion of asynchronous data streams
2009
Neurocomputing
Optimally exploiting mutual information during decoding even if the input streams are not synchronous, our algorithm outperforms late and early fusion techniques in a challenging bimodal speech and gesture ...
To overcome the computational complexity of the asynchronous hidden Markov model (AHMM), we present a novel multidimensional dynamic time warping (DTW) algorithm for hybrid fusion of asynchronous data. ...
Examples for multimodal systems causing higher robustness are the combination of speech and gestures or the fusion of speech recognition and lip-reading: by using both modalities the speech recognition ...
doi:10.1016/j.neucom.2009.08.005
fatcat:zwlxz67dzfdqfjmnikvud2bstm
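The abstract above describes a multidimensional DTW algorithm for hybrid fusion of asynchronous streams. As background only, here is a plain two-stream DTW alignment sketch with a user-supplied cross-modal cost; it is not the paper's specific multidimensional or hybrid-fusion algorithm, and the "speech"/"gesture" data in the example are synthetic placeholders.

```python
import numpy as np

def dtw_align(seq_a, seq_b, cost):
    """Classical DTW alignment of two streams sampled at different rates.

    seq_a: (Ta, da) array and seq_b: (Tb, db) array of frame-level features;
    cost(x, y): local mismatch between one frame of each stream.
    Returns the total alignment cost and the warping path as (i, j) pairs.
    """
    Ta, Tb = len(seq_a), len(seq_b)
    D = np.full((Ta + 1, Tb + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, Ta + 1):
        for j in range(1, Tb + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack along the cheapest predecessors.
    path, i, j = [], Ta, Tb
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[Ta, Tb], path[::-1]

# Example: align a fast "speech" stream with a slower "gesture" stream.
speech = np.random.randn(20, 3)
gesture = np.random.randn(8, 3)
total, path = dtw_align(speech, gesture, cost=lambda x, y: np.linalg.norm(x - y))
print(total, path[:5])
```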
Showing results 1 — 15 out of 235 results