A robust speaker recognition system combining factor analysis techniques

Shaghayegh Reza, Tahereh Emami Azadi, Jahanshah Kabudian, Yaser Shekofteh
2014 2014 21th Iranian Conference on Biomedical Engineering (ICBME)  
function of Gaussian i (mean vector µ_i, covariance vector Σ_i and weight w_i) and M is the number of mixture models.  ...  In i-vector based speaker recognition systems, acoustic vectors are converted to a low-dimensional vector space [6].  ... 
doi:10.1109/icbme.2014.7043948 fatcat:mujdmdeu3ja2dm6dhribcoijuq

Speaker Recognition from Emotional Speech Using I-vector Approach

Lenka Macková, Anton Čižmár
2014 Journal of Electrical and Electronics Engineering  
In recent years the concept of i-vectors has become very popular and successful in the field of speaker verification.  ...  The aim of this experiment was to perform speaker verification using a speaker model trained with emotional recordings on an i-vector basis.  ...  In EFR, where w̄ is the mean vector and V is the covariance matrix of a set of training i-vectors, normalization of i-vectors and reduction of session variability is defined as w ← V^(-1/2) (w − w̄) / sqrt((w − w̄)^T V^(-1) (w − w̄))  ... 
doaj:df82519533094d779eaac47324619393 fatcat:6shq7ame4rcftdiywbxzzcfrrq

Towards a Small Intra-Speaker Variability Models

I. D. Jokic, S. D. Jokic, V. D. Delic, Z. H. Peric
2014 Elektronika ir Elektrotechnika  
Each element in the covariance matrix of a speaker was weighted by an appropriate weighting coefficient. Application of this transformation resulted in higher accuracy of automatic speaker recognition.  ...  Abstract: The automatic speaker recognizer used in the experiments described in this paper uses vectors of mel-frequency cepstral coefficients as feature vectors and covariance matrices for speaker modelling.  ...  The covariance matrix used as a speaker model consists of the inter-dimensional covariances of the set of feature vectors being modeled.  ... 
doi:10.5755/j01.eee.20.6.7276 fatcat:v665egynfzbrreu635bdq2mq4q

Deep Learning Based Multi-Channel Speaker Recognition in Noisy and Reverberant Environments

Hassan Taherian, Zhong-Qiu Wang, DeLiang Wang
2019 Interspeech 2019  
and evaluate their performance in terms of speaker recognition accuracy for i-vector and x-vector based systems.  ...  We show that rank-1 approximation of a speech covariance matrix based on generalized eigenvalue decomposition leads to the best results for the masking-based MVDR beamformer.  ...  Acknowledgments This research was supported in part by a National Science Foundation grant (ECCS-1808932) and the Ohio Supercomputer Center. The authors would like to thank L. Mošner and J.  ... 
doi:10.21437/interspeech.2019-1428 dblp:conf/interspeech/TaherianWW19 fatcat:repv4x7bbjbhxc4kftiwwgavyq

On Robustness of Unsupervised Domain Adaptation for Speaker Recognition

Pierre-Michel Bousquet, Mickael Rouvier
2019 Interspeech 2019  
Efficiency of the proposed technique is experimentally validated on the recent NIST 2016 and 2018 Speaker Recognition Evaluation datasets.  ...  Details and relevance of different approaches are described and commented, leading to a new robust method that we call feature-Distribution Adaptor.  ...  Domain adaptation methods Figure 1 details the steps of different speaker recognition backend processes with embeddings (i-vector or x-vector) and feature- or model-based domain adaptation.  ... 
doi:10.21437/interspeech.2019-1524 dblp:conf/interspeech/BousquetR19 fatcat:kwicqtl3djhahde7edsb3hjyfi

Discriminatively trained Bayesian speaker comparison of i-vectors

Bengt J. Borgstrom, Alan McCree
2013 2013 IEEE International Conference on Acoustics, Speech and Signal Processing  
This paper presents a framework for fully Bayesian speaker comparison of i-vectors.  ...  By generalizing the train/test paradigm, we derive an analytic expression for the speaker comparison log-likelihood ratio (LLR), as well as solutions for model training and Bayesian scoring.  ...  INTRODUCTION Within the field of speaker recognition, the i-vector has been proposed as an effective method of extracting discriminative speaker and channel information in a manageable low-dimensional  ... 
doi:10.1109/icassp.2013.6639153 dblp:conf/icassp/BorgstromM13 fatcat:3cnkhvyrnna5tfsl46xuty5gkm

Acoustic Factor Analysis for Robust Speaker Verification

T. Hasan, J. H. L. Hansen
2013 IEEE Transactions on Audio, Speech, and Language Processing  
In this study, based on observations of the covariance structure of acoustic features, we propose a factor analysis modeling scheme in the acoustic feature space instead of the super-vector space and derive  ...  Incorporating the proposed method with a state-of-the-art i-vector and Gaussian Probabilistic Linear Discriminant Analysis (PLDA) framework, we perform evaluations on National Institute of Science and  ...  Finally, the transformation was effectively integrated within a standard i-vector-PLDA based speaker recognition system using a probabilistic feature alignment technique.  ... 
doi:10.1109/tasl.2012.2226161 fatcat:zbrbp5tvyfe6jjx2l3m7uo4dw4

Time delay deep neural network-based universal background models for speaker recognition

David Snyder, Daniel Garcia-Romero, Daniel Povey
2015 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU)  
Recently, deep neural networks (DNN) have been incorporated into i-vector-based speaker recognition systems, where they have significantly improved state-of-the-art performance.  ...  In these systems, a DNN is used to collect sufficient statistics for i-vector extraction.  ...  INTRODUCTION Modern speaker recognition systems are based on i-vectors [1] .  ... 
doi:10.1109/asru.2015.7404779 dblp:conf/asru/SnyderGP15 fatcat:uajnu5rrnjcedgtlhstcgxafam

A Generalized Framework for Domain Adaptation of PLDA in Speaker Recognition [article]

Qiongqiong Wang, Koji Okabe, Kong Aik Lee, Takafumi Koshinaka
2020 arXiv   pre-print
This paper proposes a generalized framework for domain adaptation of Probabilistic Linear Discriminant Analysis (PLDA) in speaker recognition.  ...  In particular, we introduce here the two new techniques described below. (1) Correlation-alignment-based interpolation and (2) covariance regularization.  ...  State-of-the-art speaker recognition systems that are composed of an x-vector (or i-vector) speaker embedding frontend followed by a PLDA backend have shown promising performance [11] .  ... 
arXiv:2008.08815v1 fatcat:webto7nqk5gqpdhyq7gdwowu7e

Local spectral variability features for speaker verification

Md Sahidullah, Tomi Kinnunen
2016 Digital signal processing (Print)  
We propose a simple method to capture and characterize these spectral variations through the eigenstructure of the sample covariance matrix.  ...  To sum up, combining local covariance information with the traditional cepstral features holds promise as an additional speaker cue in both text-independent and text-dependent recognition.  ...  Acknowledgements The authors would like to thank the anonymous reviewers for their valuable comments and suggestions which have greatly helped in improving the content of this paper.  ... 
doi:10.1016/j.dsp.2015.10.011 fatcat:zrqxp7mdnbccxl5tbrhqzfc5hi

The IBM 2016 Speaker Recognition System [article]

Seyed Omid Sadjadi, Sriram Ganapathy, Jason W. Pelecanos
2016 arXiv   pre-print
recognition, and 3) the use of a deep neural network (DNN) acoustic model with a large number of output units (~10k senones) to compute the frame-level soft alignments required in the i-vector estimation  ...  In this paper we describe the recent advancements made in the IBM i-vector speaker recognition system for conversational speech.  ...  Figure 1: Block diagram of the IBM speaker recognition system with fMLLR speaker- and channel-adapted features, DNN posterior based i-vectors, and NDA dimensionality reduction.  ... 
arXiv:1602.07291v1 fatcat:44a7vv4dsrhxdig23hf3vxci6q

A Study of Acoustic Features for Emotional Speaker Recognition in I-Vector Representation

Lenka Macková, Anton Čižmár, Jozef Juhár
2015 Acta Electrotechnica et Informatica  
compared in an experimental setup of speaker recognition system, based on i-vector representation.  ...  Recently recognition of emotions became very important in the field of speech and/or speaker recognition.  ...  ACKNOWLEDGMENTS The research presented in this paper was supported partially (50%) by Competence Center for Innovation Knowledge Technology of production systems in industry and services (ITMS project  ... 
doi:10.15546/aeei-2015-0011 fatcat:dgpdefkucngihpk6d3pqozca3e

Improvements on minimum covariance based Spatial correlation Transformation

Tengrong Su, Ji Wu, Zuoying Wang, Jie Hao
2009 2009 IEEE International Conference on Acoustics, Speech and Signal Processing  
In order to take advantage of the correlation information among different acoustic units in speech recognition, a novel approach named Minimum Covariance based Spatial Correlation Transformation was proposed  ...  In this paper, a new algorithm of estimating the transformation matrix and a new strategy of constructing history supervector are proposed.  ...  Obviously, the autocorrelation of frame i x is represented by , the covariance matrix of the SD model mean vectors of state .  ... 
doi:10.1109/icassp.2009.4960650 dblp:conf/icassp/SuWWH09 fatcat:w7gjxhabyzaj3ono47l3xn3b5u

Front-End Factor Analysis for Speaker Verification

Najim Dehak, Patrick J Kenny, Réda Dehak, Pierre Dumouchel, Pierre Ouellet
2011 IEEE Transactions on Audio, Speech, and Language Processing  
We achieved an equal error rate (EER) of 1.12% and MinDCF of 0.0094 using the cosine distance scoring on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation dataset  ...  The first system is a support vector machine-based system that uses the cosine kernel to estimate the similarity between the input data.  ...  ACKNOWLEDGMENT The authors would like to thank the Center for Spoken Language Processing at The Johns Hopkins University for their hospitality and the speaker recognition team for their collaboration.  ... 
doi:10.1109/tasl.2010.2064307 fatcat:c47cfasexrdgba3ltcvdwd5c6q

The IBM 2016 Speaker Recognition System

Seyed Omid Sadjadi, Sriram Ganapathy, Jason Pelecanos
2016 Odyssey 2016  
recognition, and 3) the use of a deep neural network (DNN) acoustic model with a large number of output units (∼10k senones) to compute the frame-level soft alignments required in the i-vector estimation  ...  In this paper we describe the recent advancements made in the IBM i-vector speaker recognition system for conversational speech.  ...  Figure 1: Block diagram of the IBM speaker recognition system with fMLLR speaker- and channel-adapted features, DNN posterior based i-vectors, and NDA dimensionality reduction.  ... 
doi:10.21437/odyssey.2016-25 dblp:conf/odyssey/SadjadiGP16 fatcat:uyvcr5gj3rf5vp2k23am6egeyi