Deep Discriminant Analysis for i-vector Based Robust Speaker Recognition
[article]
2018
arXiv
pre-print
Linear Discriminant Analysis (LDA) has been used as a standard post-processing procedure in many state-of-the-art speaker recognition tasks. ...
In this paper, we propose a neural network based compensation scheme (termed deep discriminant analysis, DDA) for i-vector based speaker recognition, which shares the spirit of LDA. ...
I-vectors with a Probabilistic Linear Discriminant Analysis (PLDA) back-end obtain state-of-the-art performance in speaker verification. ...
arXiv:1805.01344v1
fatcat:asptlxvmlvetjbi7v7aac4a4ry
Robust Speaker Verification with Principal Pitch Components
2005
International Journal of Speech Technology
limit of the cepstral analysis. ...
We present a new method that improves the accuracy of text-dependent speaker identification systems. ...
For text-dependent speaker verification, cepstral features exhibit a discriminative power that is, as of now, unsurpassed by any other feature representation for speech [1]. ...
doi:10.1007/s10772-006-9048-4
fatcat:xhkcyqlk7bdlfg5cep4ac6ybnu
Robust speech analysis by lag-weighted linear prediction
2012
2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
This study introduces an approach for linear predictive spectrum analysis based on emphasizing selected time-domain properties in the analyzed signal in combination with a stabilization operation. ...
A stable weighted linear predictive method based on a novel autocorrelation-based weighting scheme is described and its spectral properties are demonstrated. ...
Recently, temporally weighted linear prediction [5] with its many variants has been applied (by the present authors) to text independent speaker verification [4] [6] and large vocabulary continuous ...
doi:10.1109/icassp.2012.6288908
dblp:conf/icassp/PohjalainenA12
fatcat:h7ysjoesgnehtgzv4welrqqiw4
Brief Review of Short Utterance Speaker Verification Systems
2020
Bioscience Biotechnology Research Communications
Due to technological improvements many methods have been proposed for speaker verification. ...
In this paper we primarily survey different feature extraction methods for text-independent speaker verification. We first briefly review conventional systems to show their progress. ...
of deep features in a tandem method for speaker verification is studied by Fu et al., 2014. Phone-discriminant and speaker-discriminant DNNs are combined with conventional acoustic features and applied ...
doi:10.21786/bbrc/13.14/95
fatcat:acnfuhrdgzdinfs37z4rs4rtf4
Boosted binary features for noise-robust speaker verification
2010
2010 IEEE International Conference on Acoustics, Speech and Signal Processing
The standard approach to speaker verification is to extract cepstral features from the speech spectrum and model them by generative or discriminative techniques. ...
The final classifier is a simple linear combination of these selected features. ...
Nelson Morgan and Dr. Francesco Orabona for their comments and advice. ...
doi:10.1109/icassp.2010.5495622
dblp:conf/icassp/RoyMM10
fatcat:duvv34xg2benvnj6jd2l6bu26q
End-to-End attention based text-dependent speaker verification
2016
2016 IEEE Spoken Language Technology Workshop (SLT)
Previously, using the phonetic/speaker discriminative DNNs as feature extractors for speaker verification has shown promising results. ...
In this work we use speaker discriminative CNNs to extract the noise-robust frame-level features. ...
In this paper we use speaker discriminative CNNs to extract noise robust frame-level features. ...
doi:10.1109/slt.2016.7846261
dblp:conf/slt/ZhangCZLG16
fatcat:ar7kuwixsbalbigtu5aghppll4
Recent advances in biometric person authentication
2002
IEEE International Conference on Acoustics Speech and Signal Processing
While enabling technologies (e.g. audio, video) for biometrics have mostly been used separately, ultimately, biometric technologies could find their strongest role as intertwined and complementary pieces of ...
) that are provided by Linear Discriminant Analysis (LDA) or Fisher Linear Discriminant (FLD) [21] . ...
discriminants [29] , optimized robust correlation [30] , EGM that employs either multiscale dilation-erosion and combines linear projections at the graph nodes [31] [32] , or morphological signal ...
doi:10.1109/icassp.2002.1004810
fatcat:ychkz3csa5bzjpmurfdjotn7sy
Recent advances in biometric person authentication
2002
IEEE International Conference on Acoustics Speech and Signal Processing
While enabling technologies (e.g. audio, video) for biometrics have mostly been used separately, ultimately, biometric technologies could find their strongest role as intertwined and complementary pieces of ...
) that are provided by Linear Discriminant Analysis (LDA) or Fisher Linear Discriminant (FLD) [21] . ...
discriminants [29] , optimized robust correlation [30] , EGM that employs either multiscale dilation-erosion and combines linear projections at the graph nodes [31] [32] , or morphological signal ...
doi:10.1109/icassp.2002.5745549
dblp:conf/icassp/DugelayJKKPP02
fatcat:c2vfd4lecbg5hi6xdxtnzay4bi
Multi-Task Learning with High-Order Statistics for X-vector based Text-Independent Speaker Verification
[article]
2019
arXiv
pre-print
The x-vector based deep neural network (DNN) embedding systems have demonstrated effectiveness for text-independent speaker verification. ...
The proposed training strategy aggregates both the supervised and unsupervised learning into one framework to make the speaker embeddings more discriminative and robust. ...
Combined with the probabilistic linear discriminant analysis (PLDA) [2] backend, the i-vector/PLDA framework has become the dominant approach for the last decade. ...
arXiv:1903.12058v2
fatcat:dlreitygybhtrhnjtpyotdjp2m
Deep Speaker Embedding with Long Short Term Centroid Learning for Text-Independent Speaker Verification
2020
Interspeech 2020
Since the long-term speaker embedding centroids are associated with a wide range of training samples, these centroids have the potential to be more robust and discriminative. ...
Recently, speaker verification systems using deep neural networks have shown their effectiveness on large scale datasets. ...
The combination of i-vector and Probabilistic Linear Discriminant Analysis (PLDA) has dominated for over 10 years [2] . ...
doi:10.21437/interspeech.2020-2470
dblp:conf/interspeech/PengGZ20
fatcat:s6sq6ix3zjbe7hhr2xsfcjt5fy
Local spectral variability features for speaker verification
2016
Digital signal processing (Print)
To sum up, combining local covariance information with the traditional cepstral features holds promise as an additional speaker cue in both text-independent and text-dependent recognition. ...
Speaker verification techniques neglect the short-time variation ...
Acknowledgements The authors would like to thank the anonymous reviewers for their valuable comments and suggestions which have greatly helped in improving the content of this paper. ...
doi:10.1016/j.dsp.2015.10.011
fatcat:zrqxp7mdnbccxl5tbrhqzfc5hi
Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey
[article]
2021
arXiv
pre-print
For many years, acoustic information alone has been a great success in automatic speaker verification applications. ...
The vulnerability of biometrics towards presentation attacks and audio-visual data usage for the detection of such attacks is also a hot topic of research. ...
LBP features are used for face recognition using a semi-supervised discriminant analysis as an extension to linear discriminant analysis (LDA) [145]. ...
arXiv:2101.09725v1
fatcat:huejyfaeojhzddlckqt5nfivlq
Audio-Visual Biometric Recognition and Presentation Attack Detection: A Comprehensive Survey
2021
IEEE Access
For many years, acoustic information alone has been a great success in automatic speaker verification applications. ...
The vulnerability of biometrics towards presentation attacks and audio-visual data usage for the detection of such attacks is also a hot topic of research. ...
LBP features are used for face recognition using a semi-supervised discriminant analysis as an extension to linear discriminant analysis (LDA) [145]. ...
doi:10.1109/access.2021.3063031
fatcat:q6emam55frhlzp53t7lxb4qz3e
Speaker Representation Learning using Global Context Guided Channel and Time-Frequency Transformations
[article]
2020
arXiv
pre-print
The proposed modules, together with a popular ResNet based model, are evaluated on the VoxCeleb1 dataset, which is a large scale speaker verification corpus collected in the wild. ...
In this study, we propose the global context guided channel and time-frequency transformations to model the long-range, non-local time-frequency dependencies and channel variances in speaker representations ...
The paradigm has shifted from GMM-UBM and factor analysis based methods like i-vector [7, 8] with a probabilistic linear discriminant analysis (PLDA) back-end [9, 10] to deep neural network based models. ...
arXiv:2009.00768v2
fatcat:qvk2urqeoverllrwxd64o5y6je
Speaker Representation Learning Using Global Context Guided Channel and Time-Frequency Transformations
2020
Interspeech 2020
The proposed modules, together with a popular ResNet based model, are evaluated on the VoxCeleb1 dataset, which is a large scale speaker verification corpus collected in the wild. ...
In this study, we propose the global context guided channel and time-frequency transformations to model the long-range, non-local time-frequency dependencies and channel variances in speaker representations ...
The paradigm has shifted from GMM-UBM and factor analysis based methods like i-vector [6, 7] with a probabilistic linear discriminant analysis (PLDA) back-end [8, 9] to deep neural network based models. ...
doi:10.21437/interspeech.2020-1845
dblp:conf/interspeech/XiaH20
fatcat:24xmwci7tvawrni7ey3ohmtndm