A partial least squares framework for speaker recognition

Balaji Vasan Srinivasan, Dmitry N. Zotkin, Ramani Duraiswami
2011 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
We develop a method for modeling the variability associated with each class (speaker) by using partial-least-squares, a latent variable modeling technique, which isolates the most informative subspace  ...  for each speaker.  ...  Motivated by this, we explore here a partial least squares based framework for speaker modeling and recognition in the supervector space.  ... 
doi:10.1109/icassp.2011.5947548 dblp:conf/icassp/SrinivasanZD11 fatcat:dodfty7wjfhv7b4kspkpwkwl4e

A Symmetric Kernel Partial Least Squares Framework for Speaker Recognition

B. V. Srinivasan, Yuancheng Luo, D. Garcia-Romero, D. N. Zotkin, R. Duraiswami
2013 IEEE Transactions on Audio, Speech, and Language Processing  
In this paper, we propose a kernel partial least squares (kernel PLS, or KPLS) framework for modeling speakers in the i-vector space.  ...  Accomplishing effective speaker recognition requires a good modeling of these non-linearities and can be cast as a machine learning problem.  ...  Conclusions: In this paper, we have proposed a kernel partial least squares framework for speaker recognition in the i-vector space.  ... 
doi:10.1109/tasl.2013.2253096 fatcat:usebv7u2i5aflm2b6mi5325uma

MKPLS: Manifold Kernel Partial Least Squares for Lipreading and Speaker Identification

Amr Bakry, Ahmed Elgammal
2013 2013 IEEE Conference on Computer Vision and Pattern Recognition  
We then factorize the parameter space using Kernel Partial Least Squares (KPLS) to achieve a low-dimension manifold latent space.  ...  Visual speech recognition is a challenging problem, due to confusion between visual speech features. The speaker identification problem is usually coupled with speech recognition.  ...  We propose to use kernel partial least square (KPLS) on the mapping coefficient space to achieve a supervised low-dimensional latent space for manifold parameterization.  ... 
doi:10.1109/cvpr.2013.94 dblp:conf/cvpr/BakryE13 fatcat:ecbp5gt52jbdxcaus5qxjahreu

Learning state labels for sparse classification of speech with matrix deconvolution

Antti Hurmalainen, Tuomas Virtanen
2013 2013 IEEE Workshop on Automatic Speech Recognition and Understanding  
Experiments on the 1st CHiME Challenge corpus show improvement in recognition accuracy over labels acquired from original atom sources or previously used least squares regression.  ...  We propose using non-negative matrix deconvolution for learning the labels with algorithms closely matching a framework that separates speech from additive noises.  ...  As one solution, ordinary and partial least squares regression (OLS, PLS) were used to learn the mappings [14] .  ... 
doi:10.1109/asru.2013.6707724 dblp:conf/asru/HurmalainenV13 fatcat:rcgkosvkgfc4vehpcv6sfzev4m

Manifold-Kernels Comparison in MKPLS for Visual Speech Recognition [article]

Amr Bakry, Ahmed Elgammal
2016 arXiv   pre-print
We apply the manifold kernel partial least squares framework to the OuluVs and AvLetters databases, and show an empirical comparison between all kernels.  ...  Speech recognition is a challenging problem. Due to the acoustic limitations, using visual information is essential for improving the recognition accuracy in real-life unconstrained situations.  ...  Each frame exposes only the mouth area of the speaker. Framework description: The Manifold Kernel Partial Least Squares (MKPLS) framework is proposed in [13].  ... 
arXiv:1601.05861v1 fatcat:5sftjcgxhrezza7vu6se3szlqa

The UMD-JHU 2011 speaker recognition system

D Garcia-Romero, X Zhou, D Zotkin, B Srinivasan, Y Luo, S Ganapathy, S Thomas, S Nemala, GSVS Sivaram, M Mirbagheri, SH Mallidi, T Janu (+6 others)
2012 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
of reverberation and noise via the use of frequency domain perceptual linear predictor and cortical features; 3) A new discriminative kernel partial least squares (KPLS) framework that complements state-of-the-art  ...  In recent years, there have been significant advances in the field of speaker recognition that have resulted in very robust recognition systems.  ...  Kernel partial least squares: Partial least squares (PLS) is a subspace based learning technique that has been used for dimensionality reduction as well as regression and is popular due to its ability  ... 
doi:10.1109/icassp.2012.6288852 dblp:conf/icassp/Garcia-RomeroZZSLGTNSMMJRMEHSD12 fatcat:aobi62ffgjclpigqiiyfknaq5q

2020 Index IEEE/ACM Transactions on Audio, Speech, and Language Processing Vol. 28

2020 IEEE/ACM Transactions on Audio Speech and Language Processing  
Gidiotis, A., +, TASLP 2020 3029-3040 A Framework for Adapting DNN Speaker Embedding Across Languages.  ...  ., +, TASLP 2020 1104-1117 A Framework for Adapting DNN Speaker Embedding Across Languages.  ...  T Target tracking Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments. Zhang, Q., +, TASLP 2020 1183-1197  ... 
doi:10.1109/taslp.2021.3055391 fatcat:7vmstynfqvaprgz6qy3ekinkt4

LSTM Based Cross-corpus and Cross-task Acoustic Emotion Recognition

Heysem Kaya, Dmitrii Fedotov, Ali Yeşilkanat, Oxana Verkholyak, Yang Zhang, Alexey Karpov
2018 Interspeech 2018  
Acoustic emotion recognition is a popular and central research direction in paralinguistic analysis, due to its relation to a wide range of affective states/traits and manifold applications.  ...  Results indicate the suitability of the proposed method for both time-continuous and utterance level cross-corpus acoustic emotion recognition tasks.  ...  In line with our recent experience on paralinguistic and multi-modal affective computing [10], we employ least squares based classifiers such as Kernel Extreme Learning Machines (KELM) and Partial Least  ... 
doi:10.21437/interspeech.2018-2298 dblp:conf/interspeech/KayaFYVZ018 fatcat:ofp3attxybce5ilrgfnognhgym

Speech Recognition by Integrating Hidden Markov Model Correlated with Artificial Neural Network

An HMM can be presented as the simplest dynamic Bayesian network.  ...  A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states.  ...  Fault tolerance via redundant information coding: Partial destruction of a network leads to a corresponding degradation of performance.  ... 
doi:10.35940/ijitee.b7769.129219 fatcat:rscbi23kjngirnd52ebv7rmbku

Table of Contents

2020 IEEE/ACM Transactions on Audio Speech and Language Processing  
Wang 941 Blockwise Weighted Least Square Active Noise Control for CPU-GPU and M.  ...  Zhao 1170 Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments  ... 
doi:10.1109/taslp.2020.3046148 fatcat:hirdphjf6zeqdjzwnwlwlamtb4

Acquiring Variable Length Speech Bases For Factorisation-Based Noise Robust Speech Recognition

Antti Hurmalainen, Tuomas Virtanen
2013 Zenodo  
Labels were learnt by partial training set factorisation and ordinary least squares regression between activations and utterance state content [12] .  ...  There are 34 speakers, and a 500-utterance training set is provided for each. Speaker identity is assumed known in recognition.  ... 
doi:10.5281/zenodo.43615 fatcat:pzzxg2ozbvexxbb4muwqvgxs2a

Error Reduction Network for DBLSTM-based Voice Conversion [article]

Mingyang Zhang, Berrak Sisman, Sai Sirisha Rallabandi, Haizhou Li, Li Zhao
2018 arXiv   pre-print
So far, many of the deep learning approaches for voice conversion produce good quality speech by using a large amount of training data.  ...  We propose to implement a DBLSTM based average model that is trained with data from many speakers. Then, we propose to perform adaptation with a limited amount of target data.  ...  Berrak Sisman is also funded by SINGA Scholarship under A*STAR Graduate Academy.  ... 
arXiv:1809.09841v1 fatcat:imd6tbqpcrez5e65qihtgdatju

Dr-Vectors: Decision Residual Networks and an Improved Loss for Speaker Recognition [article]

Jason Pelecanos and Quan Wang and Ignacio Lopez Moreno
2022 arXiv   pre-print
Many neural network speaker recognition systems model each speaker using a fixed-dimensional embedding vector.  ...  We observed significant performance gains for the two techniques.  ...  We thank Niko Brümmer and the reviewers for their feedback. References  ... 
arXiv:2104.01989v3 fatcat:temwvwtxjzacxd5rad7m5jbzp4

Fusing Acoustic Feature Representations for Computational Paralinguistics Tasks

Heysem Kaya, Alexey A. Karpov
2016 Interspeech 2016  
After nonlinear preprocessing, obtained Fisher vectors are kernelized and mapped to target variables by classifiers based on Kernel Extreme Learning Machines and Partial Least Squares regression.  ...  The INTERSPEECH ComParE challenge series has a field-leading role, introducing novel problems with a common benchmark protocol for comparability.  ...  This research is financially supported by the Russian Foundation for Basic Research (project № 16-37-60100).  ... 
doi:10.21437/interspeech.2016-995 dblp:conf/interspeech/KayaK16 fatcat:pt5a3oltxrhftefrjy7uuhzese

Speech recognition in reverberant and noisy environments employing multiple feature extractors and i-vector speaker adaptation

Md Jahangir Alam, Vishwa Gupta, Patrick Kenny, Pierre Dumouchel
2015 EURASIP Journal on Advances in Signal Processing  
The REVERB challenge provides a common framework for the evaluation of feature extraction techniques in the presence of both reverberation and additive background noise.  ...  As in a previous work, we also apply i-vector-based speaker adaptation which was found effective.  ...  Acknowledgements The authors would like to thank the reviewers for their valuable comments which have enabled us to significantly improve the quality of the paper.  ... 
doi:10.1186/s13634-015-0238-6 fatcat:temqyi2ntngnlpqdmjabwbmwhe