3,748 Hits in 3.9 sec

A Symmetric Kernel Partial Least Squares Framework for Speaker Recognition

B. V. Srinivasan, Yuancheng Luo, D. Garcia-Romero, D. N. Zotkin, R. Duraiswami
2013 IEEE Transactions on Audio, Speech, and Language Processing  
In this paper, we propose a kernel partial least squares (kernel PLS, or KPLS) framework for modeling speakers in the i-vectors space.  ...  Recent advances in speaker recognition have utilized their ability to capture speaker and channel variability to develop efficient recognition engines.  ...  Figure 2: (color) Kernel Partial Least Squares (KPLS) schematic for speaker recognition.  ... 
doi:10.1109/tasl.2013.2253096 fatcat:usebv7u2i5aflm2b6mi5325uma

MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification

Amr Bakry, Ahmed Elgammal
2013 2013 IEEE Conference on Computer Vision and Pattern Recognition  
We then factorize the parameter space using Kernel Partial Least Squares (KPLS) to achieve a low-dimension manifold latent space.  ...  Our approach outperforms for the speaker semi-dependent setting by at least 15% of the baseline, and competes in the other two settings.  ...  We propose to use kernel partial least square (KPLS) on the mapping coefficient space to achieve a supervised low-dimensional latent space for manifold parameterization.  ... 
doi:10.1109/cvpr.2013.94 dblp:conf/cvpr/BakryE13 fatcat:ecbp5gt52jbdxcaus5qxjahreu

A partial least squares framework for speaker recognition

Balaji Vasan Srinivasan, Dmitry N. Zotkin, Ramani Duraiswami
2011 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
We develop a method for modeling the variability associated with each class (speaker) by using partial-least-squares - a latent variable modeling technique, which isolates the most informative subspace  ...  for each speaker.  ...  Motivated by this, we explore here a partial least squares based framework for speaker modeling and recognition in the supervector space.  ... 
doi:10.1109/icassp.2011.5947548 dblp:conf/icassp/SrinivasanZD11 fatcat:dodfty7wjfhv7b4kspkpwkwl4e

Intelligibility detection of pathological speech using asymmetric sparse kernel partial least squares classifier

Dong-Yan Huang, Minghui Dong, Haizhou Li
2014 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Index Terms: Pathological speech, intelligibility of speech, kernel function, sparse kernel partial least squares regression, asymmetric sparse kernel partial least squares classifier  ...  This paper proposes to use asymmetric sparse kernel partial least squares classifier (ASKPLSC) for intelligibility detection of pathological speech.  ...  Therefore, we propose a sparse kernel partial least squares regression in the following for a normalized kernel matrix K.  ... 
doi:10.1109/icassp.2014.6854301 dblp:conf/icassp/HuangDL14 fatcat:55omhytvmbc47ns5jkq5ox2ivq

The UMD-JHU 2011 speaker recognition system

D Garcia-Romero, X Zhou, D Zotkin, B Srinivasan, Y Luo, S Ganapathy, S Thomas, S Nemala, GSVS Sivaram, M Mirbagheri, SH Mallidi, T Janu (+6 others)
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
of reverberation and noise via the use of frequency domain perceptual linear predictor and cortical features; 3) A new discriminative kernel partial least squares (KPLS) framework that complements state-of-the-art  ...  In recent years, there have been significant advances in the field of speaker recognition that has resulted in very robust recognition systems.  ...  Kernel partial least squares: Partial least squares (PLS) is a subspace based learning technique that has been used for dimensionality reduction as well as a regression and is popular due to its ability  ... 
doi:10.1109/icassp.2012.6288852 dblp:conf/icassp/Garcia-RomeroZZSLGTNSMMJRMEHSD12 fatcat:aobi62ffgjclpigqiiyfknaq5q

Manifold-Kernels Comparison in MKPLS for Visual Speech Recognition [article]

Amr Bakry, Ahmed Elgammal
2016 arXiv   pre-print
We apply manifold kernel partial least squares framework to OuluVs and AvLetters databases, and show empirical comparison between all kernels.  ...  This work is intended to evaluate the performance of several manifold kernels for solving the problem of visual speech recognition. We show the theory behind each kernel.  ...  Each frame exposes only the mouth area of the speaker. Framework description Manifold Kernel Partial Least Squares (MKPLS) framework is proposed in [13] .  ... 
arXiv:1601.05861v1 fatcat:5sftjcgxhrezza7vu6se3szlqa

Large-Scale Approximate Kernel Canonical Correlation Analysis [article]

Weiran Wang, Karen Livescu
2016 arXiv   pre-print
time prohibitive for large-scale problems.  ...  Various approximation techniques have been developed for KCCA.  ...  We thank Bo Xie for providing his implementation of the doubly stochastic gradient algorithm for approximate KCCA, and Nati Srebro for helpful discussions.  ... 
arXiv:1511.04773v4 fatcat:qdpld3we4nap7itnyj2rnwa5hy

LSTM Based Cross-corpus and Cross-task Acoustic Emotion Recognition

Heysem Kaya, Dmitrii Fedotov, Ali Yeşilkanat, Oxana Verkholyak, Yang Zhang, Alexey Karpov
2018 Interspeech 2018  
Results indicate the suitability of the proposed method for both time-continuous and utterance level cross-corpus acoustic emotion recognition tasks.  ...  In this work, we first investigate the suitability of Long-Short-Term-Memory (LSTM) models trained with time- and space-continuously annotated affective primitives for cross-corpus acoustic emotion recognition  ...  In line with our recent experience on paralinguistic and multi-modal affective computing [10] , we employ least squares based classifiers such as Kernel Extreme Learning Machines (KELM) and Partial Least  ... 
doi:10.21437/interspeech.2018-2298 dblp:conf/interspeech/KayaFYVZ018 fatcat:ofp3attxybce5ilrgfnognhgym

A novel speech emotion recognition algorithm based on wavelet kernel sparse classifier in stacked deep auto-encoder model

Pengcheng Wei, Yu Zhao
2019 Personal and Ubiquitous Computing  
Therefore, in order to address the abovementioned issues, a novel speech emotion recognition algorithm based on improved stacked kernel sparse deep model is proposed in this paper, which is based on auto-encoder  ...  Finally, a wavelet-kernel sparse SVM classifier is applied to classify the features.  ...  And our proposed recognition rate for wavelet kernel least squares support vector (WKLLSVM) machine has reached 73.19%, fully demonstrating the superiority of our proposed classifier.  ... 
doi:10.1007/s00779-019-01246-9 fatcat:nv6kc6maffdtnhjo2qvgtvp7oi

Sub-Microwatt Analog VLSI Trainable Pattern Classifier

Shantanu Chakrabartty, Gert Cauwenberghs
2007 IEEE Journal of Solid-State Circuits  
A 24-class, 14-input, 720-template classifier trained for speaker identification and fabricated on a 3 mm × 3 mm chip in 0.5 µm CMOS delivers real-time recognition accuracy on par with floating-point emulation  ...  Subtractive normalization of the outputs by current-mode feedback produces confidence scores which are integrated for category selection.  ...  The method facilitates increased programming speed while achieving a precision of at least 7 bits, which in most cases is sufficient for recognition tasks.  ... 
doi:10.1109/jssc.2007.894803 fatcat:kwoig46jbvfbdg5zeokfvjjy6a

Whispered Speech Recognition using Hidden Markov Models and Support Vector Machines

2018 Acta Polytechnica Hungarica  
The experiments are conducted in both Speaker Dependent (SD) and Speaker Independent (SI) fashion for Whi-Spe speech database.  ...  At the same time, HMM-based recognition gave the highest recognition accuracy in SI fashion (87.42%). The results in recognition of neutral speech are given as well.  ...  TR32032, and TR32035, EUREKA project DANSPLAT, "A Platform for the Applications of Speech Technologies on Smartphones for the Languages of the Danube Region", id Е!  ... 
doi:10.12700/aph.15.5.2018.5.2 fatcat:6w2cplw4ijhtland2zspipaeve

Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks

Heysem Kaya, Alexey A. Karpov
2016 Interspeech 2016  
After nonlinear preprocessing, obtained Fisher vectors are kernelized and mapped to target variables by classifiers based on Kernel Extreme Learning Machines and Partial Least Squares regression.  ...  The INTERSPEECH ComParE challenge series has a field-leading role, introducing novel problems with a common benchmark protocol for comparability.  ...  This research is financially supported by the Russian Foundation for Basic Research (project № 16-37-60100).  ... 
doi:10.21437/interspeech.2016-995 dblp:conf/interspeech/KayaK16 fatcat:pt5a3oltxrhftefrjy7uuhzese

Computational Intelligence-Based Biometric Technologies

D. Zhang, Wangmeng Zuo
2007 IEEE Computational Intelligence Magazine  
CI-based methods, including neural network and fuzzy technologies, have also been extensively investigated for biometric matching.  ...  CI-based biometric technologies are powerful when used in the representation and recognition of incomplete biometric data, discriminative feature extraction, biometric matching, and online template updating  ...  Acknowledgments The work is partially supported by the UGC/CRC fund from the HKSAR Government, the central fund from the Hong Kong Polytechnic University and the National Natural Science Foundation of  ... 
doi:10.1109/mci.2007.353418 fatcat:aynahy3ttbesfl3qm3u25gcawq

Svm-Based Lost Packets Concealment For Asr Applications Over Ip

Carmen Peláez-Moreno, Ascensión Gallardo-Antolín, Emilio Parrado-Hernández, Fernando Díaz-de-María
2002 Zenodo  
employed dynamic parameters (delta) produces at least one or two frames delay.  ...  Estimation experiments To gain some insight on the problem, we have compared the mean-square error (MSE) for both the repetition procedure and the SVM regressor and the results, highly favourable to the  ... 
doi:10.5281/zenodo.53611 fatcat:hx7hyxqu5bd6xfvqcyuzl7chuq

Visual Speech Recognition and Utterance Segmentation Based on Mouth Movement

Wai Chee Yau, Hans Weghorn, Dinesh Kant Kumar
2007 9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications (DICTA 2007)  
Segmentation of utterances is important in a visual speech recognition system.  ...  ., human computer interface (HCI) for mobility-impaired users, lip-reading mobile phones, in-vehicle systems, and improvement of speech-based computer control in noisy environments.  ...  The speed of phonation of the speaker might vary for each repetition of the same phone.  ... 
doi:10.1109/dicta.2007.4426769 dblp:conf/dicta/YauWK07 fatcat:vdoxdzikl5b5zmnmqe32dowizq